Learning Hadoop 2.4.1: The edits and fsimage Viewers

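In Hadoop, edits and fsimage are two critically important files. The edits file records the changes made to the namespace since the most recent checkpoint and so acts as a log, while the fsimage file holds the most recent checkpoint itself. The contents of these two files cannot be read directly with an ordinary text editor. Fortunately, Hadoop ships dedicated tools for viewing them, oev and oiv, both of which are run through the hdfs command.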

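oev stands for offline edits viewer. The tool operates only on files, so it does not require a running Hadoop cluster. It provides several output processors that convert the input file into an output file of the corresponding format; the processor is selected with the -p option. The currently supported output formats are binary (the native binary format that Hadoop uses), xml (the default when -p is not given), and stats (statistics about the edits file). The supported input formats are binary and xml, where an xml input is a file produced by this tool's xml processor. Because no input format corresponds to stats, output written in stats format cannot be converted back to the original format. For example, if the input was binary and the output was xml, you can convert back by passing the earlier output file as the input and the earlier input file as the output; this works between binary and xml, but not for stats.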

The tool's usage is as follows:

Usage: bin/hdfs oev [OPTIONS] -i INPUT_FILE -o OUTPUT_FILE

Parse a Hadoop edits log file INPUT_FILE and save results in OUTPUT_FILE.

Required command line arguments:

-i,--inputFile <arg>   edits file to process, xml (case insensitive) extension means XML format, any other filename means binary format

-o,--outputFile <arg>  Name of output file. If the specified file exists, it will be overwritten, format of the file is determined by -p option

Optional command line arguments:

-p,--processor <arg>   Select which type of processor to apply against image file, currently supported processors are: binary (native binary format that Hadoop uses), xml (default, XML format), stats (prints statistics about edits file)

-h,--help            Display usage information and exit

-f,--fix-txids         Renumber the transaction IDs in the input, so that there are no gaps or invalid transaction IDs.

-r,--recover          When reading binary edit logs, use recovery mode.  This will give you the chance to skip corrupt parts of the edit log.

-v,--verbose         More verbose output, prints the input and output filenames, for processors that write to a file, also output to screen. On large image files this will dramatically increase processing time (default is false).
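The round trip between binary and xml can be sketched in Python. This is a minimal illustration that only builds the argument lists and does not run anything. The helper names `input_format` and `oev_command` and the output name `edits.restored` are hypothetical; the `-i`, `-o`, and `-p` options and the extension rule come from the usage text.

```python
import os


def input_format(path):
    """Guess the input format the way the usage text describes it:
    an .xml extension (case insensitive) means XML, anything else binary."""
    ext = os.path.splitext(path)[1]
    return "xml" if ext.lower() == ".xml" else "binary"


def oev_command(input_file, output_file, processor=None):
    """Build the argument list for `hdfs oev`; -p is omitted for the default (xml)."""
    cmd = ["hdfs", "oev", "-i", input_file, "-o", output_file]
    if processor is not None:
        cmd += ["-p", processor]
    return cmd


binary_name = "edits_0000000000000000081-0000000000000000089"

# binary -> xml (xml is the default processor)
to_xml = oev_command(binary_name, "edits.xml")
# xml -> binary: swap the roles of the files and select the binary processor
to_binary = oev_command("edits.xml", "edits.restored", "binary")

print(" ".join(to_xml))
print(" ".join(to_binary))
```

On a machine with Hadoop installed, either list could be passed to `subprocess.run`; that step is left out so the sketch has no external dependency.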

An example invocation and part of the resulting output file are shown below:

$ hdfs oev -i edits_0000000000000000081-0000000000000000089 -o edits.xml

<?xml version="1.0" encoding="UTF-8"?>

<EDITS>

  <EDITS_VERSION>-56</EDITS_VERSION>

  <RECORD>

    <OPCODE>OP_DELETE</OPCODE>

    <DATA>

      <TXID>88</TXID>

      <LENGTH>0</LENGTH>

      <PATH>/user/hive/test</PATH>

      <TIMESTAMP>1413794973949</TIMESTAMP>

      <RPC_CLIENTID>a52277d8-a855-41ee-9ca2-a5d0bc7d298a</RPC_CLIENTID>

      <RPC_CALLID>3</RPC_CALLID>

    </DATA>

  </RECORD>

</EDITS>
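Because the xml processor writes ordinary XML, the records can be inspected with any XML library. The following Python sketch uses the standard library on a shortened copy of the sample above; the function name `summarize_records` is hypothetical, and it assumes every record follows the RECORD/OPCODE/DATA layout shown in the sample.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample oev output.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<EDITS>
  <EDITS_VERSION>-56</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_DELETE</OPCODE>
    <DATA>
      <TXID>88</TXID>
      <LENGTH>0</LENGTH>
      <PATH>/user/hive/test</PATH>
      <TIMESTAMP>1413794973949</TIMESTAMP>
    </DATA>
  </RECORD>
</EDITS>
"""


def summarize_records(xml_bytes):
    """Return one dict per RECORD with its opcode, transaction id and path."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for record in root.iter("RECORD"):
        data = record.find("DATA")
        rows.append({
            "opcode": record.findtext("OPCODE"),
            # Not every operation carries a TXID or PATH, so these may be None.
            "txid": data.findtext("TXID") if data is not None else None,
            "path": data.findtext("PATH") if data is not None else None,
        })
    return rows


records = summarize_records(SAMPLE)
for row in records:
    print(row["txid"], row["opcode"], row["path"])
```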

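In the output file, each RECORD describes one operation; in this example the operation is a delete. When a damaged edits file causes the Hadoop cluster to fail, it may still be possible to save the valid part of the file: convert the original binary file to XML, edit the XML by hand, and then convert it back to binary.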

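The most common kind of edits file corruption is a missing closing record (the record whose OPCODE is -1). If the XML file has no closing record, one can be added after the last valid record; any records that follow the closing record are ignored. A closing record looks like this: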

<RECORD>

    <OPCODE>-1</OPCODE>

    <DATA>

    </DATA>

</RECORD>
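Adding the closing record can also be scripted. The sketch below is one possible approach in Python, not a procedure prescribed by Hadoop: it appends a closing record only when none is present, and it assumes the invalid records have already been removed by hand so that the last remaining record is the last valid one. The function name `ensure_closing_record` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A small edits document with no closing record.
DAMAGED = b"""<EDITS>
  <EDITS_VERSION>-56</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_DELETE</OPCODE>
    <DATA>
      <TXID>88</TXID>
      <PATH>/user/hive/test</PATH>
    </DATA>
  </RECORD>
</EDITS>
"""


def ensure_closing_record(root):
    """Append a closing record (OPCODE -1) if none exists.

    Returns True when a record was added, False when one was already there.
    """
    for record in root.findall("RECORD"):
        if (record.findtext("OPCODE") or "").strip() == "-1":
            return False
    closing = ET.SubElement(root, "RECORD")
    ET.SubElement(closing, "OPCODE").text = "-1"
    ET.SubElement(closing, "DATA")
    return True


root = ET.fromstring(DAMAGED)
added_first = ensure_closing_record(root)   # adds the record
added_again = ensure_closing_record(root)   # nothing left to do
last_opcode = root.findall("RECORD")[-1].findtext("OPCODE")
print(added_first, added_again, last_opcode)
```

The repaired tree would then be written back to a file and converted with the binary processor of oev.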

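oiv stands for offline image viewer. It dumps the contents of an fsimage file to a specified file in a readable form. The tool also provides a read-only WebHDFS API that allows offline analysis and inspection of the Hadoop cluster's namespace.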

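oiv is fairly fast even on very large fsimage files. If the tool cannot process an fsimage, it exits immediately. The tool is not backward compatible: for example, the oiv from hadoop-2.4 cannot process an fsimage from hadoop-2.3; only the hadoop-2.3 oiv can. Like oev, and as its name (offline) suggests, oiv does not require a running Hadoop cluster. Its full syntax can be displayed by entering hdfs oiv on the command line.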

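oiv supports three output processors, Ls, XML, and FileDistribution, selected with the -p option. Ls is the default processor. Its output closely resembles that of the lsr command and lists the same fields in the same order: the directory or file flag, permissions, replication count, owner, group, file size, modification date, and full path. Unlike lsr, its output includes the root path /. Another important difference is that the output is not sorted by directory name and contents but appears in the order stored in the fsimage. Unless the namespace contains very little information, comparing the output of this processor directly with that of lsr is unlikely to be practical. Ls computes file sizes from the information in the INode blocks and ignores the -skipBlocks option. An example: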

[hadoop@hadoop current]$ hdfs oiv -i fsimage_0000000000000000115 -o fsimage.ls

[hadoop@hadoop current]$ cat fsimage.ls

drwxr-xr-x  -   hadoop supergroup 1412832662162          0 /

drwxr-xr-x  -   hadoop supergroup 1413795010372          0 /user

drwxr-xr-x  -   hadoop supergroup 1414032848858          0 /user/hadoop

drwxr-xr-x  -   hadoop supergroup 1411626881217          0 /user/hadoop/input

drwxr-xr-x  -   hadoop supergroup 1413770138964          0 /user/hadoop/output
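A listing like this is easy to split into fields for further processing. The Python sketch below parses the sample lines; the column interpretation (permissions, replication, owner, group, modification time in epoch milliseconds, size, path) is an assumption read off the sample output, and the function name `parse_ls_line` is hypothetical.

```python
def parse_ls_line(line):
    """Split one line of the Ls processor output into named fields.

    Splitting at most six times keeps a path that contains spaces intact.
    """
    perms, replication, owner, group, mtime, size, path = (
        line.rstrip("\n").split(None, 6)
    )
    return {
        "is_dir": perms.startswith("d"),
        "permissions": perms[1:],
        # Directories show "-" instead of a replication count.
        "replication": None if replication == "-" else int(replication),
        "owner": owner,
        "group": group,
        "mtime_ms": int(mtime),
        "size": int(size),
        "path": path,
    }


sample_lines = [
    "drwxr-xr-x  -   hadoop supergroup 1412832662162          0 /",
    "drwxr-xr-x  -   hadoop supergroup 1411626881217          0 /user/hadoop/input",
]
entries = [parse_ls_line(line) for line in sample_lines]
for entry in entries:
    print(entry["path"], entry["owner"], entry["mtime_ms"])
```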

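The XML processor outputs an XML document of the fsimage that contains all the information in the fsimage, such as the inode id. Its output lends itself to automated processing and analysis with XML tools. Because XML syntax is verbose, this processor also produces the largest output. An example: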

[hadoop@hadoop current]$ hdfs oiv -i fsimage_0000000000000000115 -p XML -o fsimage.xml

[hadoop@hadoop current]$ cat fsimage.xml

<?xml version="1.0"?>

<fsimage>

	<NameSection>

		<genstampV1>1000</genstampV1>

		<genstampV2>1004</genstampV2>

		<genstampV1Limit>0</genstampV1Limit>

		<lastAllocatedBlockId>1073741828</lastAllocatedBlockId>

		<txid>115</txid>

	</NameSection>

	<INodeSection>

		<lastInodeId>16418</lastInodeId>

		<inode>

			<id>16385</id>

			<type>DIRECTORY</type>

			<name></name>

			<mtime>1412832662162</mtime>

			<permission>hadoop:supergroup:rwxr-xr-x</permission>

			<nsquota>9223372036854775807</nsquota>

			<dsquota>-1</dsquota>

		</inode>

		<inode>

			<id>16386</id>

			<type>DIRECTORY</type>

			<name>user</name>

			<mtime>1413795010372</mtime>

			<permission>hadoop:supergroup:rwxr-xr-x</permission>

			<nsquota>-1</nsquota>

			<dsquota>-1</dsquota>

		</inode>

	</INodeSection>

</fsimage>
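As an illustration of automated processing, the following Python sketch lists the inodes in a shortened copy of the XML above using the standard library. The function name `list_inodes` is hypothetical, and the sketch assumes each inode element carries the id, type, and name children shown in the sample.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample oiv XML output.
SAMPLE = b"""<?xml version="1.0"?>
<fsimage>
  <INodeSection>
    <lastInodeId>16418</lastInodeId>
    <inode>
      <id>16385</id>
      <type>DIRECTORY</type>
      <name></name>
      <mtime>1412832662162</mtime>
    </inode>
    <inode>
      <id>16386</id>
      <type>DIRECTORY</type>
      <name>user</name>
      <mtime>1413795010372</mtime>
    </inode>
  </INodeSection>
</fsimage>
"""


def list_inodes(xml_bytes):
    """Return (id, type, name) for every inode; the root inode has an empty name."""
    root = ET.fromstring(xml_bytes)
    return [
        (int(inode.findtext("id")), inode.findtext("type"), inode.findtext("name") or "")
        for inode in root.iter("inode")
    ]


inodes = list_inodes(SAMPLE)
for inode_id, inode_type, name in inodes:
    print(inode_id, inode_type, repr(name))
```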

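FileDistribution is a tool for analyzing the sizes of the files in the namespace. To run it, you define an integer range [0, maxSize] by specifying a maximum file size and a step. The range is divided into segments [0, s[1], ..., s[n-1], maxSize], and the processor counts how many files fall into each segment [s[i-1], s[i]). Files larger than maxSize always fall into the last segment, that is, [s[n-1], maxSize]. The output file is a tab-separated table with the columns Size and NumFiles, where Size is the start of a segment and NumFiles is the number of files whose size falls into that segment. The FileDistribution processor takes the parameters maxSize and step; if they are not specified, the default is 0. An example: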

[hadoop@hadoop current]$ hdfs oiv -i fsimage_0000000000000000115 -o fsimage.fd -p FileDistribution maxSize 1000 step 5

[hadoop@hadoop current]$ cat fsimage.fd

Processed 0 inodes.

Size	NumFiles

2097152	2

totalFiles = 2

totalDirectories = 11

totalBlocks = 2

totalSpace = 4112

maxFileSize = 1366
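The segment counting described above can be sketched in a few lines of Python. This follows the textual description only (half-open segments of width step, with oversized files counted in the last segment); it is not taken from Hadoop's source, and the real processor may place segment boundaries differently. The function names and the sample sizes are made up for illustration.

```python
import math
from collections import Counter


def segment_start(file_size, max_size, step):
    """Start of the segment [i*step, (i+1)*step) that contains file_size.

    Files at or above max_size are assigned to the last segment.
    """
    n_segments = max(1, math.ceil(max_size / step))
    index = min(file_size // step, n_segments - 1)
    return index * step


def size_histogram(sizes, max_size, step):
    """Count files per segment and return sorted (segment start, count) pairs."""
    counts = Counter(segment_start(size, max_size, step) for size in sizes)
    return sorted(counts.items())


# Hypothetical file sizes in bytes; range [0, 100] split into steps of 25.
histogram = size_histogram([0, 10, 25, 99, 100, 5000], max_size=100, step=25)

print("Size\tNumFiles")
for start, count in histogram:
    print(f"{start}\t{count}")
```

With these inputs the files of 99, 100, and 5000 bytes all land in the last segment, which starts at 75.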