How to crop (cut) a text file based on starting and ending line numbers in cygwin?

I have a few log files, around 100 MB each. Personally, I find it cumbersome to deal with such big files. I know that the log lines that interest me span only about 200 to 400 lines.

What would be a good way to extract the relevant log lines from these files? That is, I just want to write a given range of line numbers to another file.

For example, the inputs are:

filename: MyHugeLogFile.log
Starting line number: 38438
Ending line number:   39276

Is there a command that I can run in cygwin to cat out only that range in that file? I know that if I can somehow display that range in stdout then I can also pipe to an output file.

Note: Adding Linux tag for more visibility, but I need a solution that might work in cygwin. (Usually linux commands do work in cygwin).

6 Solutions

#1

Sounds like a job for sed:

sed -n '8,12p' yourfile

...will send lines 8 through 12 of yourfile to standard out.

If you want to prepend the line number, you may wish to use cat -n first:

cat -n yourfile | sed -n '8,12p'
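
On a log of around 100 MB, sed -n 'M,Np' keeps reading to the end of the file even after the range has been printed. A minimal sketch of the same idea that quits on the line after the range, using the line numbers from the question; the temporary sample file and the output name cropped.log are assumptions made up for the example:

```shell
# Hypothetical sample file standing in for MyHugeLogFile.log: one number per line.
workdir=$(mktemp -d)
seq 1 40000 > "$workdir/sample.log"

start=38438
end=39276

# Print lines start..end, then quit on the line after end so the rest is never read.
sed -n "${start},${end}p;$((end + 1))q" "$workdir/sample.log" > "$workdir/cropped.log"
```

With these numbers cropped.log ends up with 839 lines, since both end points are included.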

#2

You can use wc -l to find the total number of lines.

You can then combine head and tail to get at the range you want. Let's assume the log is 40,000 lines, you want the last 1562 lines, then of those you want the first 838. So:

tail -n 1562 MyHugeLogFile.log | head -n 838 | ....

Or there's probably an easier way using sed or awk.
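
One way the awk variant could look, as a sketch rather than the only option: NR is awk's current line number, and the exit stops reading once the range has passed. The variable names s and e, the sample file, and the output name are assumptions for illustration:

```shell
# Hypothetical sample file standing in for the real log: one number per line.
workdir=$(mktemp -d)
seq 1 40000 > "$workdir/sample.log"

# Stop reading once past line e; otherwise print every line from s onwards.
awk -v s=38438 -v e=39276 'NR > e { exit } NR >= s' "$workdir/sample.log" > "$workdir/awk-range.log"
```

Passing the bounds with -v keeps the awk program itself fixed, so only the two numbers change from one run to the next.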

#3

I saw this thread when I was trying to split a file into files of 100,000 lines each. A better solution than sed for that is:

split -l 100000 database.sql database-

It will give files like:

database-aa
database-ab
database-ac
...
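
A small sketch of what split does, using a hypothetical 250-line file and 100-line pieces so the result is easy to check. The file name and the chunk- prefix are made up for the example. Note that split does not extract an arbitrary range; it cuts the whole file into consecutive pieces:

```shell
# Hypothetical input: 250 numbered lines in a temporary directory.
workdir=$(mktemp -d)
seq 1 250 > "$workdir/sample.txt"

# Cut sample.txt into pieces of at most 100 lines: chunk-aa, chunk-ab, chunk-ac.
split -l 100 "$workdir/sample.txt" "$workdir/chunk-"

# Line counts per piece: 100, 100 and 50.
wc -l "$workdir"/chunk-*
```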

#4

How about this:

$ seq 1 100000 | tail -n +10000 | head -n 10
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009

It uses tail to output from the 10,000th line and onwards and then head to only keep 10 lines.

The same (almost) result with sed:

$ seq 1 100000 | sed -n '10000,10010p'
10000
10001
10002
10003
10004
10005
10006
10007
10008
10009
10010

This one has the advantage of allowing you to input the line range directly.
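
The tail/head pipeline can take a start and end line as well, if the count given to head is computed as end - start + 1. A sketch under that assumption; the function name lines_between and the sample file are hypothetical:

```shell
# lines_between START END FILE: print lines START..END (inclusive) of FILE.
lines_between() {
    start=$1
    end=$2
    file=$3
    # tail -n +START begins output at line START; head keeps END-START+1 lines.
    tail -n "+$start" "$file" | head -n "$((end - start + 1))"
}

# Hypothetical sample data: one number per line.
workdir=$(mktemp -d)
seq 1 10020 > "$workdir/sample.txt"

lines_between 10000 10010 "$workdir/sample.txt" > "$workdir/range.txt"
```

Called this way it writes the same 11 lines (10000 through 10010) as the sed command above.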

#5

And if you simply want to cut part of a file, say from line 26 to 142, and write it to a new file:

sed -n '26,142p' file-to-cut.txt > new-file.txt

#6

If you are interested only in the last X lines, you can use the "tail" command like this:

$ tail -n XXXXX yourlogfile.log >> mycroppedfile.txt

This will save the last XXXXX lines of your log file to a new file called "mycroppedfile.txt"
