Add comments: This PR is to add comments to TestGenerateDataBackupValueFileInvalidObject in pkg/ddc/alluxio/backup_data_test.go. (backup data invalid object) #6005
[APPROVALNOTIFIER] This PR is NOT APPROVED. No approvers are listed for this pull request yet. It needs approval from an approver in each of the affected files.

Hi @letyoudown. Thanks for your PR. I'm waiting for a fluid-cloudnative member to verify that this patch is reasonable to test. Once the patch is verified, the new status will be reflected on the PR. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Code Review
This pull request introduces a three-stage MapReduce program in Java (InvertedIndex.java) to build a document inverted index, complete with helper bash scripts for building and running the tasks, and a minor test comment update in Go. The reviewer's feedback highlights critical performance optimizations for MapReduce, specifically recommending the reuse of Writable objects (SortKey and Text) within the mapper to minimize garbage collection overhead during execution.
    public SortKey() {
    }

    public SortKey(int total, String word) {
        this.total.set(total);
        this.word.set(word);
    }
To avoid excessive garbage collection overhead in MapReduce, it is a best practice to reuse Writable objects instead of instantiating them for every record. Let's add a set method to SortKey so it can be reused in the Mapper.
Suggested change:

    public SortKey() {
    }

    public SortKey(int total, String word) {
        this.total.set(total);
        this.word.set(word);
    }

    public void set(int total, String word) {
        this.total.set(total);
        this.word.set(word);
    }
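Why reusing one key instance is safe can be sketched without Hadoop on the classpath. The sketch below assumes the output side copies a key's bytes at write time, which is my understanding of how the map output collector behaves. `MutableKey` and `EagerCollector` are hypothetical stand-ins, not the PR's `SortKey` or Hadoop's `Context`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class WritableReuseSketch {
    /** Hypothetical stand-in for SortKey: mutable, with a set method for reuse. */
    static final class MutableKey {
        private int total;
        private String word = "";

        void set(int total, String word) {
            this.total = total;
            this.word = word;
        }

        void write(DataOutputStream out) throws IOException {
            out.writeInt(total);
            out.writeUTF(word);
        }
    }

    /** Hypothetical stand-in for the output side: serializes the key when write is called. */
    static final class EagerCollector {
        final List<byte[]> records = new ArrayList<>();

        void write(MutableKey key) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                key.write(out);
            }
            // The stored bytes no longer depend on the key object.
            records.add(buffer.toByteArray());
        }
    }

    /** Emits one record per (total, word) pair while allocating the key only once. */
    public static List<byte[]> emit(int[] totals, String[] words) throws IOException {
        EagerCollector collector = new EagerCollector();
        MutableKey outKey = new MutableKey();
        for (int i = 0; i < totals.length; i++) {
            outKey.set(totals[i], words[i]); // overwrite the previous contents
            collector.write(outKey);
        }
        return collector.records;
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> records = emit(new int[] {3, 7}, new String[] {"apple", "pear"});
        System.out.println(records.size() + " records emitted from one key instance");
    }
}
```

The reuse pattern would not be safe with a collector that kept a reference to the key object instead of its bytes, since every stored record would then show the last values set.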
    public static class SortMapper extends Mapper<LongWritable, Text, SortKey, Text> {
        private final Text outValue = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            if (line == null || line.isEmpty()) {
                return;
            }

            int firstTab = line.indexOf('\t');
            if (firstTab < 0 || firstTab == line.length() - 1) {
                return;
            }

            String word = line.substring(0, firstTab);
            String remainder = line.substring(firstTab + 1);
            int secondTab = remainder.indexOf('\t');
            String totalStr = secondTab >= 0 ? remainder.substring(0, secondTab) : remainder;
            int total = Integer.parseInt(totalStr.trim());

            outValue.set(remainder);
            context.write(new SortKey(total, word), outValue);
        }
    }
Reuse the SortKey and Text instances as class member variables to avoid creating a new SortKey object for every input record in the map method. This significantly reduces GC overhead during the sorting phase.
    public static class SortMapper extends Mapper<LongWritable, Text, SortKey, Text> {
        private final SortKey outKey = new SortKey();
        private final Text outValue = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            if (line == null || line.isEmpty()) {
                return;
            }

            int firstTab = line.indexOf('\t');
            if (firstTab < 0 || firstTab == line.length() - 1) {
                return;
            }

            String word = line.substring(0, firstTab);
            String remainder = line.substring(firstTab + 1);
            int secondTab = remainder.indexOf('\t');
            String totalStr = secondTab >= 0 ? remainder.substring(0, secondTab) : remainder;
            int total = Integer.parseInt(totalStr.trim());

            outValue.set(remainder);
            outKey.set(total, word);
            context.write(outKey, outValue);
        }
    }
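The line parsing in this mapper can also be exercised on its own. The sketch below copies the tab-splitting steps into a plain Java helper so the edge cases are easy to check. `SortLineParseSketch`, `Parsed`, and `parse` are hypothetical names, and the sample input format is an assumption. One difference is deliberate and not part of the reviewed code: the mapper calls `Integer.parseInt` without a guard, so a non-numeric total would throw `NumberFormatException` out of `map`, while this sketch skips such lines instead.

```java
import java.util.Optional;

public class SortLineParseSketch {
    /** Parsed pieces of one "word<TAB>total[<TAB>rest]" line. */
    public static final class Parsed {
        public final String word;
        public final int total;
        public final String remainder;

        Parsed(String word, int total, String remainder) {
            this.word = word;
            this.total = total;
            this.remainder = remainder;
        }
    }

    /** Returns the parsed line, or empty when the line would be skipped. */
    public static Optional<Parsed> parse(String line) {
        if (line == null || line.isEmpty()) {
            return Optional.empty();
        }
        int firstTab = line.indexOf('\t');
        // No tab at all, or nothing after the first tab: nothing to parse.
        if (firstTab < 0 || firstTab == line.length() - 1) {
            return Optional.empty();
        }
        String word = line.substring(0, firstTab);
        String remainder = line.substring(firstTab + 1);
        int secondTab = remainder.indexOf('\t');
        String totalStr = secondTab >= 0 ? remainder.substring(0, secondTab) : remainder;
        try {
            int total = Integer.parseInt(totalStr.trim());
            return Optional.of(new Parsed(word, total, remainder));
        } catch (NumberFormatException e) {
            // Assumption: malformed totals are skipped rather than failing the task.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The posting-list format after the total is made up for illustration.
        Optional<Parsed> parsed = parse("hadoop\t12\tdoc1:5;doc2:7");
        parsed.ifPresent(p -> System.out.println(p.word + " -> " + p.total));
    }
}
```

Whether skipping or failing is the right behavior depends on how much the earlier stages' output can be trusted, so treat the `try`/`catch` as one option rather than a recommendation.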
The comment addition to |
    @@ -0,0 +1,465 @@
    import java.io.BufferedReader;
This PR adds an entire lab2/ directory (465-line Java MapReduce program, shell scripts, README) that has nothing to do with the Fluid project. It looks like homework files were accidentally committed. Please remove the lab2/ directory and keep only the doc comment change in pkg/ddc/alluxio/backup_data_test.go.



Ⅰ. Describe what this PR does
This PR adds comments to TestGenerateDataBackupValueFileInvalidObject in pkg/ddc/alluxio/backup_data_test.go. The comments explain that the test verifies generateDataBackupValueFile rejects invalid object types and does not create a value file.
Ⅱ. Does this pull request fix one issue?
fixes #6004
Ⅲ. Special notes for reviews
None.