Hadoop HDFS and MapReduce Study Notes (2)

Date: 2022-12-27 07:43:58


一、First Look at HDFS

1、The HDFS web UI

http://192.168.0.205:50070/


Upload a file:

hadoop fs -put jdk-7u65-linux-i586.tar.gz hdfs://node1:9000/

Refresh the web UI to see the uploaded file.


Click the file to download it.


Download from the command line:

hadoop fs -get hdfs://node1:9000/jdk-7u65-linux-i586.tar.gz


二、First Look at MapReduce

Under the installation directory there is a share/hadoop directory that ships with a set of default jar packages.

Run the example jar to estimate pi:

hadoop jar hadoop-mapreduce-examples-2.4.1.jar pi 5 5

Number of Maps  = 5

Samples per Map = 5

Wrote input for Map #0

Wrote input for Map #1

Wrote input for Map #2

Wrote input for Map #3

Wrote input for Map #4

Starting Job

16/01/25 13:18:15 INFO client.RMProxy: Connecting to ResourceManager at node1/192.168.0.205:8032
16/01/25 13:18:16 INFO input.FileInputFormat: Total input paths to process : 5
16/01/25 13:18:16 INFO mapreduce.JobSubmitter: number of splits:5
16/01/25 13:18:17 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1453697466705_0001
16/01/25 13:18:18 INFO impl.YarnClientImpl: Submitted application application_1453697466705_0001
16/01/25 13:18:18 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1453697466705_0001/
16/01/25 13:18:18 INFO mapreduce.Job: Running job: job_1453697466705_0001
16/01/25 13:18:26 INFO mapreduce.Job: Job job_1453697466705_0001 running in uber mode : false
16/01/25 13:18:26 INFO mapreduce.Job:  map 0% reduce 0%
16/01/25 13:18:54 INFO mapreduce.Job:  map 100% reduce 0%
16/01/25 13:19:15 INFO mapreduce.Job:  map 100% reduce 100%
16/01/25 13:19:16 INFO mapreduce.Job: Job job_1453697466705_0001 completed successfully
16/01/25 13:19:17 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=116
                FILE: Number of bytes written=559761
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=1305
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=23
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=5
                Launched reduce tasks=1
                Data-local map tasks=5
                Total time spent by all maps in occupied slots (ms)=151444
                Total time spent by all reduces in occupied slots (ms)=12933
                Total time spent by all map tasks (ms)=151444
                Total time spent by all reduce tasks (ms)=12933
                Total vcore-seconds taken by all map tasks=151444
                Total vcore-seconds taken by all reduce tasks=12933
                Total megabyte-seconds taken by all map tasks=155078656
                Total megabyte-seconds taken by all reduce tasks=13243392
        Map-Reduce Framework
                Map input records=5
                Map output records=10
                Map output bytes=90
                Map output materialized bytes=140
                Input split bytes=715
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=140
                Reduce input records=10
                Reduce output records=0
                Spilled Records=20
                Shuffled Maps =5
                Failed Shuffles=0
                Merged Map outputs=5
                GC time elapsed (ms)=321
                CPU time spent (ms)=7610
                Physical memory (bytes) snapshot=1077932032
                Virtual memory (bytes) snapshot=3148177408
                Total committed heap usage (bytes)=850395136
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=590
        File Output Format Counters
                Bytes Written=97
Job Finished in 62.334 seconds
Estimated value of Pi is 3.68000000000000000000
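With only 5 maps × 5 samples, the estimate (3.68) is far from π. The example jar actually uses a Halton quasi-random sequence, but the idea can be sketched locally with plain Monte Carlo sampling in Python (an illustrative sketch, not the jar's code):

```python
import random

def estimate_pi(num_samples, seed=42):
    """Estimate pi by sampling random points in the unit square
    and counting how many fall inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / num_samples

# 25 samples (like 5 maps x 5 samples each) gives a very rough estimate;
# large sample counts converge toward 3.14159...
print(estimate_pi(25))
print(estimate_pi(1_000_000))
```

Raising the number of maps or samples per map tightens the estimate, at the cost of more task overhead.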

 

三、Basic Commands

Create a directory (the hdfs://node1:9000 prefix can be omitted):

hadoop fs -mkdir hdfs://node1:9000/wordcount

hadoop fs -mkdir /wordcount/input

Create a test.txt file:

vi test.txt

hello world

hello tom

hello jim

hello kitty

hello angelabay

hello wolf

hello laolan

 

Upload test.txt:

hadoop fs -put test.txt /wordcount/input

Run the computation:

hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount /wordcount/input /wordcount/output


The job has finished.

 

List the directories:

hadoop fs -ls /wordcount/input

hadoop fs -ls /wordcount/output


View the file on HDFS (the result of the computation above):

hadoop fs -cat /wordcount/output/part-r-00000
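The map/shuffle/reduce logic that wordcount runs on the cluster can be sketched locally in Python (a simplified sketch of the idea, not the jar's actual code), using the test.txt contents above:

```python
from collections import defaultdict

lines = [
    "hello world", "hello tom", "hello jim", "hello kitty",
    "hello angelabay", "hello wolf", "hello laolan",
]

# Map phase: emit a (word, 1) pair for every word in every input record.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle phase: group the pairs by key (the word).
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: sum the counts for each word.
counts = {word: sum(vals) for word, vals in groups.items()}

for word in sorted(counts):
    print(word, counts[word])  # same shape as the part-r-00000 output
```

On the cluster, the shuffle step is what moves all pairs with the same key to the same reduce task.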

Command reference:

hadoop fs


For example:

hadoop fs -ls /


The 1 in the listing indicates the replication factor.
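For illustration, the fields of a `hadoop fs -ls` line can be picked apart in Python; the sample line below is hypothetical, but follows the standard column order (permissions, replication, owner, group, size, date, time, path):

```python
# A hypothetical hadoop fs -ls output line.
sample = "-rw-r--r--   1 root supergroup   71799552 2016-01-25 13:30 /jdk-7u65-linux-i586.tar.gz"

fields = sample.split()
permissions = fields[0]
replication = int(fields[1])  # the second column is the replication factor
owner, group = fields[2], fields[3]
size = int(fields[4])
path = fields[-1]

print(replication, owner, path)
```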

Change owner and group:

hadoop fs -chown angelababy:mygirls /jdk-7u65-linux-i586.tar.gz


Check the result.

Note that Hadoop's permission checking is fairly weak.

Change permissions:

hadoop fs -chmod 777 /jdk-7u65-linux-i586.tar.gz


Copy a file from the local filesystem (same as put):

hadoop fs -copyFromLocal ./hadoop-mapreduce-client-app-2.4.1.jar /


Copy within HDFS:

hadoop fs -cp /hadoop-mapreduce-client-app-2.4.1.jar /wordcount


Check filesystem capacity:

hadoop fs -df -h /


Check file sizes:

hadoop fs -du -s -h /


hadoop fs -du -s -h hdfs://node1:9000/*


Create nested directories:

hadoop fs -mkdir -p /aa/bb


Delete:

hadoop fs -rm -r /aa/bb


Note that deleted files are moved to the trash, and a message is printed.



HDFS does not support modifying a file in place, but appending is allowed.
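The write-once/append-only model can be illustrated with a local file in Python (a local analogy only; on HDFS the append itself would be done with `hadoop fs -appendToFile`):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Initial write: once written, the contents of an HDFS file are fixed.
with open(path, "w") as f:
    f.write("hello world\n")

# Appending new records at the end is allowed...
with open(path, "a") as f:
    f.write("hello hdfs\n")

# ...but there is no HDFS equivalent of a local seek-and-overwrite
# ("r+" mode): bytes in the middle of a file cannot be rewritten.
with open(path) as f:
    print(f.read())
```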


四、The HDFS Read/Write Process and Design Ideas

1、The read process (multiple replicas)


2、Basic design ideas
