java.lang.Object
  org.apache.hadoop.mapreduce.Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
public class Reducer<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
Reduces a set of intermediate values which share a key to a smaller set of values.
Reducer implementations
can access the Configuration for the job via the
JobContext.getConfiguration() method.
Reducer has 3 primary phases:

1. Shuffle
   The Reducer copies the sorted output from each Mapper using HTTP across the network.

2. Sort
   The framework merge sorts Reducer inputs by keys (since different Mappers may have output the same key).
   The shuffle and sort phases occur simultaneously, i.e. while outputs are being fetched they are merged.

   SecondarySort
   To achieve a secondary sort on the values returned by the value iterator, the application should extend the key with the secondary key and define a grouping comparator. The keys will be sorted using the entire key, but will be grouped using the grouping comparator to decide which keys and values are sent in the same call to reduce. The grouping comparator is specified via Job.setGroupingComparatorClass(Class). The sort order is controlled by Job.setSortComparatorClass(Class).

3. Reduce
   In this phase the reduce(Object, Iterable, Context) method is called for each <key, (collection of values)> in the sorted inputs.
   The output of the reduce task is typically written to a RecordWriter via TaskInputOutputContext.write(Object, Object).

The output of the Reducer is not re-sorted.
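The interplay between the sort comparator and the grouping comparator described in the phases above can be sketched in plain Java. This is a minimal model, not Hadoop code: CompositeKey, SORT, GROUPING and reduceCalls are hypothetical names introduced for illustration. In a real job the two comparators would be supplied through Job.setSortComparatorClass(Class) and Job.setGroupingComparatorClass(Class).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of secondary sort; none of these types are Hadoop classes.
class SecondarySortSketch {

    // Hypothetical composite key: the natural key extended with a secondary key.
    static final class CompositeKey {
        final String natural;
        final int secondary;

        CompositeKey(String natural, int secondary) {
            this.natural = natural;
            this.secondary = secondary;
        }
    }

    // Sort order over the entire key (the role of the sort comparator).
    static final Comparator<CompositeKey> SORT =
        Comparator.comparing((CompositeKey k) -> k.natural)
                  .thenComparingInt(k -> k.secondary);

    // Grouping order over the natural key only (the role of the grouping comparator).
    static final Comparator<CompositeKey> GROUPING =
        Comparator.comparing((CompositeKey k) -> k.natural);

    // Returns one string per simulated reduce call: "naturalKey=[values in sorted order]".
    static List<String> reduceCalls(List<CompositeKey> mapOutput) {
        List<CompositeKey> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SORT);

        List<String> calls = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            CompositeKey first = sorted.get(i);
            List<Integer> values = new ArrayList<>();
            // Consecutive keys that compare equal under GROUPING share one reduce call.
            while (i < sorted.size() && GROUPING.compare(first, sorted.get(i)) == 0) {
                values.add(sorted.get(i).secondary);
                i++;
            }
            calls.add(first.natural + "=" + values);
        }
        return calls;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapOutput = Arrays.asList(
            new CompositeKey("b", 2),
            new CompositeKey("a", 3),
            new CompositeKey("b", 1),
            new CompositeKey("a", 1));
        System.out.println(reduceCalls(mapOutput)); // prints [a=[1, 3], b=[1, 2]]
    }
}
```

Each natural key yields a single simulated reduce call, and the values inside that call arrive in secondary-key order because the sort used the entire key while the grouping looked only at the natural part.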
Example:
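import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class IntSumReducer<Key> extends Reducer<Key,IntWritable,
                                                Key,IntWritable> {
  // Reused across calls to avoid allocating a new IntWritable per key.
  private IntWritable result = new IntWritable();

  @Override
  public void reduce(Key key, Iterable<IntWritable> values,
                     Context context) throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
      sum += val.get();
    }
    result.set(sum);
    // Emit the key with the sum of its values.
    context.write(key, result);
  }
}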
See Also:
Mapper, Partitioner

| Nested Class Summary | |
|---|---|
| class | Reducer.Context |
| Constructor Summary |
|---|
| Reducer() |
| Method Summary | |
|---|---|
| protected void | cleanup(Reducer.Context context): Called once at the end of the task. |
| protected void | reduce(KEYIN key, Iterable<VALUEIN> values, Reducer.Context context): This method is called once for each key. |
| void | run(Reducer.Context context): Advanced application writers can use the run(org.apache.hadoop.mapreduce.Reducer.Context) method to control how the reduce task works. |
| protected void | setup(Reducer.Context context): Called once at the start of the task. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public Reducer()
| Method Detail |
|---|
protected void setup(Reducer.Context context)
              throws IOException,
                     InterruptedException
Called once at the start of the task.
Throws:
    IOException
    InterruptedException

protected void reduce(KEYIN key,
                      Iterable<VALUEIN> values,
                      Reducer.Context context)
               throws IOException,
                      InterruptedException
This method is called once for each key.
Throws:
    IOException
    InterruptedException

protected void cleanup(Reducer.Context context)
                throws IOException,
                       InterruptedException
Called once at the end of the task.
Throws:
    IOException
    InterruptedException

public void run(Reducer.Context context)
         throws IOException,
                InterruptedException
Advanced application writers can use the run(org.apache.hadoop.mapreduce.Reducer.Context) method to control how the reduce task works.
Throws:
    IOException
    InterruptedException
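The call order implied by the method descriptions (setup once at the start, reduce once per key, cleanup once at the end) can be modelled in plain Java. This is only a sketch under that assumption: MiniReducer and LifecycleSketch are hypothetical stand-ins, not Hadoop classes, and the loop in run is an illustration of how a driver might sequence the hooks rather than the framework's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the task lifecycle; these are not Hadoop classes.
class LifecycleSketch {

    // Hypothetical stand-in for a reducer exposing the same four hooks.
    static class MiniReducer {
        final List<String> log = new ArrayList<>();

        protected void setup() {
            log.add("setup");
        }

        // Sums the values for one key and records the result.
        protected void reduce(String key, Iterable<Integer> values) {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            log.add("reduce " + key + "=" + sum);
        }

        protected void cleanup() {
            log.add("cleanup");
        }

        // Assumed driver order: setup once, reduce once per key, cleanup once.
        public void run(Map<String, List<Integer>> groupedInput) {
            setup();
            for (Map.Entry<String, List<Integer>> e : groupedInput.entrySet()) {
                reduce(e.getKey(), e.getValue());
            }
            cleanup();
        }
    }

    // Runs the model on a small, already grouped input and returns the call log.
    static List<String> demo() {
        Map<String, List<Integer>> input = new LinkedHashMap<>();
        input.put("a", Arrays.asList(1, 2));
        input.put("b", Arrays.asList(5));
        MiniReducer reducer = new MiniReducer();
        reducer.run(input);
        return reducer.log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [setup, reduce a=3, reduce b=5, cleanup]
    }
}
```

Overriding run in a subclass of the model would be the place to change this sequencing, which mirrors why the real method is described as a hook for advanced application writers.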