How do I split a file on a token?

Time: 2021-02-20 21:34:57

Let's say you have a file containing texts (from 1 to N) separated by a $. How can I split the file so that the end result is N files?

text1 with newlines $
text2 $etc... $
textN

I'm thinking of something with awk or sed, but is there an existing Unix tool that already performs this kind of task?

5 solutions

#1


2  

Maybe split -p pattern?

Hmm. That may not be exactly what you want. It doesn't split a line, it only starts a new file when it sees the pattern. And it seems to be supported only on BSD-related systems.

You could use something like:

awk 'BEGIN {RS = "$"} { ... }'

edit: You might find some inspiration for the { ... } part here:

http://www.gnu.org/manual/gawk/html_node/Split-Program.html

edit: Thanks to dmckee for the comment, but csplit also seems to copy the whole line on which the pattern occurs.
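
To make the line-granularity point concrete, here is a minimal sketch, assuming a csplit that accepts the POSIX -s and -f options. The sample file, the part_ prefix, and the temporary directory are made up for illustration.

```shell
# Work in a throwaway directory (hypothetical sample data).
csplit_dir=$(mktemp -d)
cd "$csplit_dir" || exit 1

# The "$" sits in the middle of the second line.
printf 'first line\nsecond $ third\nfourth\n' > sample.txt

# Split before the first line that contains a literal "$".
# -s suppresses the byte counts, -f sets the output file prefix.
csplit -s -f part_ sample.txt '/\$/'

# part_00 holds the lines before the match; part_01 starts with the
# whole matching line, so "second" and "third" stay together.
cat part_01
```

The second piece begins with the complete line containing the $, which is why csplit by itself does not cut on the token.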

#2


3  

awk 'BEGIN { RS = "$"; ORS = "" } { textNumber++; print $0 > ("text" textNumber ".out") }' fileName

Thanks to Bill Karwin for the idea.

Edit: Added ORS="" to avoid printing a newline at the end of each file.
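
A small end-to-end run may help show what this produces. This is only a sketch: the sample file, its contents, and the counter name n are made up for illustration. The output file name is parenthesized because unparenthesized concatenation after > is ambiguous in some awk implementations.

```shell
# Work in a throwaway directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: three texts separated by "$",
# the first of which spans two lines.
printf 'text1 line a\ntext1 line b$text2$text3\n' > sample.txt

# RS="$" makes each "$"-delimited text one record;
# ORS="" stops print from appending a newline to each file.
awk 'BEGIN { RS = "$"; ORS = "" } { n++; print $0 > ("text" n ".out") }' sample.txt

# Expect text1.out, text2.out and text3.out.
ls text*.out
```

Note that the last file keeps the trailing newline of the input, since that newline belongs to the final record.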

#3


1  

If I'm reading this right, the UNIX cut command can be used for this.

cut -d $ -f 1- filename

I might have the syntax slightly off, but that should tell cut that you're using $-separated fields and that it should return fields 1 through the end.

You may need to escape the $.
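
For reference, a minimal sketch of what cut does with a $ delimiter. The sample string and variable names are made up. cut works on each input line separately and writes to standard output, so on its own it selects fields rather than creating one file per text.

```shell
# Hypothetical one-line input with three "$"-separated fields.
line='alpha$beta$gamma'

# Quote the delimiter so the shell passes a literal "$" to cut.
second=$(printf '%s\n' "$line" | cut -d '$' -f 2)
echo "$second"

# With -f 1- every field is selected, so the line comes back unchanged.
all=$(printf '%s\n' "$line" | cut -d '$' -f 1-)
echo "$all"
```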

#4


1  

awk -v RS='$' '{ print $0 > ("text" t++ ".out") }' ORS="" file
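
One detail worth seeing in action: because t starts out uninitialized and is post-incremented, the numbering in this variant begins at 0. A quick sketch, with a made-up sample file and directory name:

```shell
# Throwaway directory with a hypothetical three-part input.
numbering_dir=$(mktemp -d)
cd "$numbering_dir" || exit 1
printf 'one$two$three' > sample.txt

# t is uninitialized, so the post-increment t++ yields 0 for the first record.
# ORS="" is given as a command-line assignment, applied before the file is read.
awk -v RS='$' '{ print $0 > ("text" t++ ".out") }' ORS="" sample.txt

# Expect text0.out, text1.out and text2.out.
ls text*.out
```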

#5


1  

Using the split command, we can split on strings.

But the csplit command lets you split files based on regular expressions as well.
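
As a rough illustration of the regular-expression form, here is a sketch that assumes each $ delimiter sits on a line of its own. The input, the piece_ prefix, and the repeat count are chosen for this example only.

```shell
# Throwaway directory with hypothetical input.
regex_dir=$(mktemp -d)
cd "$regex_dir" || exit 1
printf 'text1\n$\ntext2\n$\ntext3\n' > sample.txt

# /^\$$/ matches a line consisting only of "$";
# {1} repeats that split one more time, giving three pieces.
csplit -s -f piece_ sample.txt '/^\$$/' '{1}'

# Expect piece_00, piece_01 and piece_02.
ls piece_*
```

Each piece after the first starts with its delimiter line, so a further step would be needed if the $ lines themselves should be dropped.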
