Extracting strings with a regular expression in one line in Python

Time: 2022-09-13 11:10:52

I have sentences like this:

"Hello world, Hi[12312] 234 32423 234
23423 232 23423
2223 234 23234
223 2332 2323
I am a programmer, How[54321]
23 2 12 112
12 1212 121
This is a program, Okay[123123] 12123
1232 12312 1231
323 123 23423
...this continues"

I would like to get all of the text on each line up to the first '['.

My output should be:

"Hello world, Hi
I am a programmer, How
This is a program, Okay"

How can I do this using regex?

1 solution

#1

You don't need regular expressions for this; you can split each line on '[' and take the first item of the result.

line1 = "Hello world, Hi[12312]"
line2 = "I am a programmer, How[54321]"

print(line1.split('[')[0])
print(line2.split('[')[0])

Produces:

Hello world, Hi
I am a programmer, How
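
One caveat with split('[')[0]: when a line contains no '[', it returns the whole line unchanged. A minimal sketch of one way to tell the two cases apart, using str.partition; the helper name text_before_bracket is hypothetical and not part of the original answer:

```python
def text_before_bracket(line):
    """Return the text before the first '[', or None if the line has no '['."""
    # partition returns (head, separator, tail); the separator is an empty
    # string when '[' does not occur in the line.
    head, sep, _tail = line.partition('[')
    return head if sep else None


print(text_before_bracket("Hello world, Hi[12312]"))
print(text_before_bracket("23423 232 23423"))
```

Returning None for lines without a bracket is only one possible design choice; filtering such lines out beforehand works just as well.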

EDIT

To loop through all lines and print only those that contain '[':

string1 = '''Hello world, Hi[12312] 234 32423 234
23423 232 23423
2223 234 23234
223 2332 2323
I am a programmer, How[54321]
23 2 12 112
12 1212 121
This is a program, Okay[123123] 12123
1232 12312 1231
323 123 23423'''

lines = string1.split('\n')

for line in lines:
    if '[' in line:
        print(line.split('[')[0])

Produces:

Hello world, Hi
I am a programmer, How
This is a program, Okay
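
Since the question asked about a regular expression, here is a hedged sketch of a roughly equivalent one-line approach with re.findall in multiline mode. The names text and pattern are illustrative assumptions, and the sample input is shortened. The pattern captures everything from the start of a line up to the first '[' and skips lines that have none:

```python
import re

text = '''Hello world, Hi[12312] 234 32423 234
23423 232 23423
I am a programmer, How[54321]
23 2 12 112
This is a program, Okay[123123] 12123
323 123 23423'''

# With re.MULTILINE, ^ anchors at the start of every line. The group captures
# a run of characters that are neither '[' nor a newline, and that run must be
# followed by a literal '[' for the line to match.
pattern = r'^([^\[\n]*)\['

matches = re.findall(pattern, text, flags=re.MULTILINE)
print('\n'.join(matches))
```

For input this simple the split-based loop is arguably easier to read; the regex version mainly helps when the whole text has to be processed in a single expression.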
