Python RegEx: matching 2 string fields

Time: 2021-12-22 04:26:22

I want to create a RegEx that will help me with my process. I need to check whether 2 text fields match, e.g.:

fA="John Cohen" and fB="Jackie Cohen SRL"

First: I want to see how many words there are (2 in fA and 3 in fB). Sometimes fA or fB has only 1 word, and then I need to check whether fA = fB.

Second: I need to see whether fA is included in fB or is the same as fB, and what the differences are.

Please let me know if you need more info.

1 solution

#1



Regex won't be the right tool for what you want.

You can start by splitting your strings (which gives you lists of words), and making sets from them.

Then you can quickly get the common elements, the differences, and so on. See the Python documentation on set types for all available operations.

Some of the things you want could be accomplished this way:

fA = "John Cohen"
fB = "Jackie Cohen SRL"

set_A = set(fA.split())
# {'Cohen', 'John'}
set_B = set(fB.split())
# {'Cohen', 'Jackie', 'SRL'}

if len(set_A) == 1 and set_A == set_B:
    print("Both strings are {}".format(fA))

# set_A included in (or equal to) set_B?
if set_A <= set_B:
    print("{} included in {}".format(set_A, set_B))

# common elements
print(set_A & set_B)
# {'Cohen'}

# in set_B but not in set_A
print(set_B - set_A)
# {'SRL', 'Jackie'}

# in set_A or set_B, but not in both
print(set_A ^ set_B)
# {'John', 'SRL', 'Jackie'}
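
The question also asked for the word counts, which the snippet above does not show. A minimal sketch that gathers the counts and the set comparisons in one place might look like this. The helper name compare_fields and the dictionary keys are made up for illustration, and "same" is assumed to mean "same words in the same order, ignoring extra whitespace":

```python
def compare_fields(a, b):
    """Summarize how the words of two text fields relate (hypothetical helper)."""
    words_a, words_b = a.split(), b.split()
    set_a, set_b = set(words_a), set(words_b)
    return {
        "count_a": len(words_a),       # number of words in a
        "count_b": len(words_b),       # number of words in b
        "same": words_a == words_b,    # assumption: same words, same order
        "a_in_b": set_a <= set_b,      # every word of a also appears in b
        "common": set_a & set_b,       # words present in both fields
        "only_a": set_a - set_b,       # words found only in a
        "only_b": set_b - set_a,       # words found only in b
    }


result = compare_fields("John Cohen", "Jackie Cohen SRL")
print(result["count_a"], result["count_b"])
# 2 3
print(result["a_in_b"])
# False
```

Note that the comparison is case-sensitive and treats punctuation as part of a word; normalizing the strings first (for example with str.lower()) would be a separate decision.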

You could print the elements of a set like this:

print('/'.join(set_A ^ set_B))
# John/SRL/Jackie

but note that the order is arbitrary, because sets are unordered. You can also turn a set into a list with list(set_A) and sort it if you need a stable order.
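
If a predictable order matters, here is a short sketch of two options: sorting alphabetically, or keeping the order in which the words first appear in the original strings. The variable names diff, diff_sorted and diff_in_order are only illustrative:

```python
fA = "John Cohen"
fB = "Jackie Cohen SRL"

set_A = set(fA.split())
set_B = set(fB.split())
diff = set_A ^ set_B  # words in exactly one of the two fields

# Option 1: alphabetical order; sorted() accepts any iterable and returns a list
diff_sorted = sorted(diff)
print('/'.join(diff_sorted))
# Jackie/John/SRL

# Option 2: order of appearance in the original strings (fA first, then fB)
diff_in_order = [w for w in fA.split() + fB.split() if w in diff]
print('/'.join(diff_in_order))
# John/Jackie/SRL
```

The second option would list a word more than once if it is repeated within one of the fields.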
