"grep" offset of an ASCII string in a binary file

Date: 2021-10-29 06:26:58

I'm generating binary data files that are simply a series of records concatenated together. Each record consists of a binary header followed by binary data. Within the binary header is an ASCII string 80 characters long. Somewhere along the way, my process for writing the files got a little messed up, and I'm trying to debug the problem by inspecting how long each record actually is.

This seems extremely related, but I don't understand Perl, so I haven't been able to get the accepted answer there to work. The other answer points to bgrep, which I've compiled, but it wants a hex string as input. I'd rather have a tool that takes the ASCII string, finds it in the binary data, and prints the string along with the byte offset where it was found.

In other words, I'm looking for some tool which acts like this:

tool foobar filename

or

tool foobar < filename

and its output is something like this:

foobar:10
foobar:410
foobar:810
foobar:1210
...

That is, the string that matched and the byte offset in the file where the match started. In this example, I can infer that each record is 400 bytes long.
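
The inference above amounts to taking the differences between consecutive offsets. A minimal Python sketch, assuming the `match:offset` output format of the hypothetical tool shown above (the function name `record_gaps` is made up for illustration):

```python
def record_gaps(lines):
    """Parse 'match:offset' lines and return the gaps between consecutive offsets."""
    # rsplit so that a ':' inside the matched text does not break the parsing
    offsets = [int(line.rsplit(":", 1)[1]) for line in lines if line.strip()]
    return [later - earlier for earlier, later in zip(offsets, offsets[1:])]


sample = ["foobar:10", "foobar:410", "foobar:810", "foobar:1210"]
print(record_gaps(sample))  # [400, 400, 400]
```

A gap that differs from the others would point at the record where the writing process went wrong.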

Other constraints:

  • The ability to search by regex would be nice, but I don't need it for this problem.
  • My binary files are big (3.5 GB), so I'd like to avoid reading the whole file into memory if possible.

3 Solutions

#1


23  

You could use strings for this:

strings -a -t x filename | grep foobar

Tested with GNU binutils.

For example, to find where --help occurs in /bin/ls:

strings -a -t x /bin/ls | grep -- --help

Output:

14938 Try `%s --help' for more information.
162f0       --help     display this help and exit
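
Note that with -t x the offsets are hexadecimal (strings also accepts -t d for decimal), and each one marks the start of the printable string, not the start of the match inside it. The Python sketch below converts such lines into decimal match offsets. It assumes every line has the form "offset, one space, string", which is an assumption about GNU strings output rather than a documented guarantee; `match_offsets` is a hypothetical helper name.

```python
def match_offsets(strings_lines, pattern):
    """Yield the decimal file offset of each occurrence of pattern.

    Assumes every line looks like '<hex offset> <string>', as printed by
    `strings -a -t x` (the offset may be left-padded with spaces).
    """
    for line in strings_lines:
        hex_offset, _, text = line.rstrip("\n").lstrip().partition(" ")
        if not hex_offset:
            continue  # skip blank lines
        start = int(hex_offset, 16)  # offset of the start of the string
        pos = text.find(pattern)
        while pos != -1:
            yield start + pos
            pos = text.find(pattern, pos + 1)


sample = ["  14938 Try `%s --help' for more information."]
print(list(match_offsets(sample, "--help")))  # [84288]
```

Here 0x14938 is 84280, and --help begins 8 bytes into the string.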

#2


23  

grep --byte-offset --only-matching --text foobar filename

The --byte-offset option prints the offset of each matching line.

The --only-matching option makes it print the offset of each matching instance instead of each matching line.

The --text option makes grep treat the binary file as a text file.

You can shorten it to:

grep -oba foobar filename

It works in the GNU version of grep, which comes with Linux by default. It won't work in BSD grep (which comes with macOS by default).

#3


0  

I wanted to do the same task. Although strings | grep worked, I found that gsar was exactly the tool I needed.

http://tjaberg.com/

The output looks like:

>gsar.exe -bic -sfoobar filename.bin
filename.bin: 0x34b5: AAA foobar BBB
filename.bin: 0x56a0: foobar DDD
filename.bin: 2 matches found
