Let's say you have a file containing texts (from 1 to N) separated by a $. How can I split the file so the end result is N files?
text1 with newlines $
text2 $etc... $
textN
I'm thinking of something with awk or sed, but is there an existing Unix tool that already performs this kind of task?
5 solutions
#1
2
Maybe split -p pattern?
Hmm. That may not be exactly what you want. It doesn't split a line, it only starts a new file when it sees the pattern. And it seems to be supported only on BSD-related systems.
You could use something like:
awk 'BEGIN {RS = "$"} { ... }'
edit: You might find some inspiration for the { ... } part here:
http://www.gnu.org/manual/gawk/html_node/Split-Program.html
edit: Thanks for the comment from dmckee, but csplit also seems to copy the whole line on which the pattern occurs.
#2
3
awk 'BEGIN { RS = "$"; ORS = "" } { textNumber++; print $0 > ("text" textNumber ".out") }' fileName
Thanks to Bill Karwin for the idea.
Edit: Added ORS="" to avoid printing a newline at the end of each file.
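For anyone who wants to try the awk approach before pointing it at real data, here is a minimal sketch. The file name sample.txt, the scratch directory, and the counter name n are only assumptions for illustration; the sample content is modeled on the question.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input modeled on the question: three texts separated by '$'.
printf 'text1 with\nnewlines $text2 $textN\n' > sample.txt

# RS = "$" makes each '$'-delimited text one record; ORS = "" stops print
# from appending a newline. The output file name is parenthesized because
# unparenthesized concatenation after '>' is not portable across awks.
awk 'BEGIN { RS = "$"; ORS = "" } { n++; print $0 > ("text" n ".out") }' sample.txt

# List the files that were created.
ls text*.out
```

Each piece keeps whatever whitespace surrounded it in the input, so the last file here still ends with the input's final newline.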
#3
1
If I'm reading this right, the UNIX cut command can be used for this.
cut -d $ -f 1- filename
I might have the syntax slightly off, but that should tell cut that you're using $-separated fields and that it should return fields 1 through the end.
You may need to escape the $.
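A small sketch of what the -d and -f options do, using a made-up one-line input (the variable names line, second and all are just for illustration). Note that cut works line by line and writes to standard output, so something else would still have to put each field into its own file.

```shell
# Three '$'-separated fields on one line. Quoting the delimiter makes sure
# the shell passes a literal '$' to cut.
line='alpha$beta$gamma'

# -f 2 selects only the second field.
second=$(printf '%s\n' "$line" | cut -d '$' -f 2)
echo "$second"

# -f 1- selects every field, so the line comes back unchanged.
all=$(printf '%s\n' "$line" | cut -d '$' -f 1-)
echo "$all"
```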
#4
1
awk -v RS='$' '{ print $0 > ("text" t++ ".out") }' ORS="" file
#5
1
Using the split command we can split using strings, but the csplit command will also let you split files based on regular expressions.
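Here is a rough sketch of csplit with a regular expression, assuming GNU coreutils (the -z option and the '{*}' repeat count are GNU extensions, and sample.txt is a made-up file name). It also shows that csplit is line-oriented: a line containing $ is kept whole instead of being cut at the $.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# The sample from the question, with '$' appearing in the middle of a line.
printf 'text1 with newlines $\ntext2 $etc... $\ntextN\n' > sample.txt

# Start a new piece before every line that contains a literal '$'.
# -s silences the byte counts, -z drops empty pieces, and '{*}' repeats
# the pattern as many times as possible (GNU csplit only).
csplit -s -z sample.txt '/\$/' '{*}'

# Pieces are named xx00, xx01, ... by default.
for f in xx*; do
    printf '== %s ==\n' "$f"
    cat "$f"
done
```

If the separator can appear anywhere inside a line, as in the question, the awk record-separator approach is probably the closer fit.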