Odds and ends: differences between Hadoop 1.x and 2.x

Date: 2022-10-29 13:32:51

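1.  The JobTracker was split up: its resource-management role went to the ResourceManager and its per-job scheduling and monitoring role went to a per-application ApplicationMaster, while the NodeManager replaced the TaskTracker on each worker node;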

2.  MR is now just one application running on top of YARN, alongside others such as HBase and Hive;

3.  The default block size is 64 MB in 1.x and 128 MB in 2.x;

4.  In 2.x, DataNodes still report their status to the NameNode, and in addition NodeManagers report their status to the ResourceManager (both through periodic heartbeats).
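
To make the block-size change in item 3 concrete, here is a minimal sketch in plain Java (no Hadoop classes) that counts how many blocks a file occupies under each default. The class name BlockCount and the 1 GB file size are hypothetical, chosen only for illustration; a real cluster may override the default block size.

```java
// Sketch: number of HDFS blocks a file occupies under each default block size.
// BlockCount and the 1 GB example file are illustrative assumptions.
class BlockCount {
    static final long MB = 1024L * 1024L;
    static final long HADOOP1_DEFAULT_BLOCK = 64 * MB;   // 1.x default
    static final long HADOOP2_DEFAULT_BLOCK = 128 * MB;  // 2.x default

    // Ceiling division: the last block may be only partially filled.
    static long blocksNeeded(long fileBytes, long blockBytes) {
        if (fileBytes < 0 || blockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (fileBytes + blockBytes - 1) / blockBytes;
    }

    public static void main(String[] args) {
        long fileBytes = 1024 * MB; // hypothetical 1 GB file
        System.out.println("1.x blocks: " + blocksNeeded(fileBytes, HADOOP1_DEFAULT_BLOCK)); // 16
        System.out.println("2.x blocks: " + blocksNeeded(fileBytes, HADOOP2_DEFAULT_BLOCK)); // 8
    }
}
```

With the larger default, the same file is tracked as half as many blocks, which generally means less block metadata on the NameNode and, with default input splitting, fewer map tasks.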

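5. The MR API differs (the old API lives in org.apache.hadoop.mapred, the new one in org.apache.hadoop.mapreduce)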

Old WordCount

    package org.apache.hadoop.mapred;

    ... ...

    public class WordCount extends Configured implements Tool {

      public static class MapClass extends MapReduceBase
          implements Mapper<LongWritable, Text, Text, IntWritable> {
        ... ...
        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output,
                        Reporter reporter) throws IOException {
          ... ...
        }
      }

      public static class Reduce extends MapReduceBase
          implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
          ... ...
        }
      }

      static int printUsage() {
        System.out.println("wordcount [-m <maps>] [-r <reduces>] <input> <output>");
        ToolRunner.printGenericCommandUsage(System.out);
        return -1;
      }

      public int run(String[] args) throws Exception {
        ... ...
        return 0;
      }

      public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new WordCount(), args);
        System.exit(res);
      }
    }

New WordCount

    package org.apache.hadoop.examples;

    ... ...

    public class WordCount {

      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        ... ...
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          ... ...
        }
      }

      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values,
                           Context context)
            throws IOException, InterruptedException {
          ... ...
        }
      }

      public static void main(String[] args) throws Exception {
        ... ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }
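
One visible difference between the two listings is how reduce receives its values: the old API hands over an Iterator, the new one an Iterable. The sketch below shows what that means for the loop body. It is plain Java with no Hadoop classes; Integer stands in for IntWritable, and the class name ReduceStyles and both method names are hypothetical. It is not the elided body of either WordCount.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java stand-in: Integer plays the role of IntWritable.
// ReduceStyles, sumOldStyle and sumNewStyle are illustrative names only.
class ReduceStyles {
    // Old-API style: values arrive as an Iterator, consumed with hasNext()/next().
    static int sumOldStyle(Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    // New-API style: values arrive as an Iterable, so a for-each loop works.
    static int sumNewStyle(Iterable<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(1, 1, 1); // three occurrences of one word
        System.out.println(sumOldStyle(counts.iterator())); // 3
        System.out.println(sumNewStyle(counts));            // 3
    }
}
```

The other differences visible in the listings follow the same pattern: the new API replaces the OutputCollector and Reporter parameters with a single Context, and Mapper and Reducer are extended as classes rather than implemented as interfaces on top of MapReduceBase.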

6.