How to match multiple lines inside double quotes using a regular expression?

Time: 2022-09-15 15:44:30

I have a CSV file that contains the following lines.

INPUT:

No,NAme,ID,Description
1,Stack,232,"ABCDEFGHIJKLMNO
 -- Jiuaslkm asdasdasd"
2,Queue,454,"PQRSTUVWXYZ
 -- Other 
 words here"
3,Que,4343,"sdfwerrew"

OUTPUT EXPECTED:

No,NAme,ID,Description
1,Stack,232,"ABCDEFGHIJKLMNO \n -- Jiuaslkm asdasdasd"
2,Queue,454,"PQRSTUVWXYZ \n -- Other \n  words here"
3,Que,4343,"sdfwerrew"

or

No,NAme,ID,Description
1,Stack,232,"ABCDEFGHIJKLMNO -- Jiuaslkm asdasdasd"
2,Queue,454,"PQRSTUVWXYZ -- Other  words here"
3,Que,4343,"sdfwerrew"

Is there a Java regex pattern that can find and merge the lines based on the opening and closing double quotes?

1 solution

#1


You are going down the wrong path. Not everything should be solved using regular expressions. CSV parsing is one of those things.

Seriously: you are about to re-invent the wheel. And the wheel you are about to create will be deficient, and prone to break over and over again.

The sane approach: there are many existing CSV parsers for Java out there, and they handle multi-line values correctly. So: use one of them (see here as a starting point for the many choices you have).

There is a nice rule of thumb: when your regex becomes so complicated that you can't write it down yourself, consider doing things differently. You are the person who owns this code; you will have to maintain and maybe enhance it, not the folks here who are able to write down a regex that solves this one flavor of CSV example input.
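
To make concrete why a parser copes with this input where a regex struggles, here is a minimal sketch of the one piece of state a CSV parser tracks: whether the scanner is currently inside a quoted field. It is an illustration only, not a substitute for a real CSV library. The class name `QuotedLineJoiner` and the method `joinQuotedLines` are hypothetical, and the sketch assumes well-formed input (every opening quote has a closing one).

```java
public class QuotedLineJoiner {

    /**
     * Replaces every line break that occurs inside a double-quoted field
     * with the given replacement; line breaks between records are kept.
     * Hypothetical helper for illustration, assuming balanced quotes.
     */
    public static String joinQuotedLines(String csv, String replacement) {
        StringBuilder out = new StringBuilder(csv.length());
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, so the state stays correct.
                inQuotes = !inQuotes;
                out.append(c);
            } else if (inQuotes && (c == '\n' || c == '\r')) {
                // Treat \r\n as a single line break.
                if (c == '\r' && i + 1 < csv.length() && csv.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "No,NAme,ID,Description\n"
                + "1,Stack,232,\"ABCDEFGHIJKLMNO\n -- Jiuaslkm asdasdasd\"\n"
                + "3,Que,4343,\"sdfwerrew\"\n";
        // An empty replacement yields the second expected output from the question;
        // passing " \\n" instead yields the first one (a literal backslash-n).
        System.out.print(joinQuotedLines(input, ""));
    }
}
```

Even this small scanner leaves out delimiter handling, malformed-quote detection, and character encodings, which is the point the answer makes: an existing parser already covers those cases.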
