Python RegEx: matching 2 string fields

Date: 2021-12-22 04:26:22

I want to create a RegEx to help with my process. I need to check whether two text fields match, e.g.:


fA="John Cohen" and fB="Jackie Cohen SRL"

First: I want to count the words in each field: 2 in fA and 3 in fB here. Sometimes fA or fB has only 1 word, and then I need to check whether fA = fB.


Second: I need to see whether fA is included in fB or is the same as fB, and what differs between them.


Please let me know if you need more info.


1 solution



Regex won't be the right tool for what you want.


You can start by splitting your strings (which gives you lists of words), and making sets from them.
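For the word counts asked about in the question, the length of each split list is probably enough. A minimal sketch, assuming words are separated by whitespace (the names words_A and words_B are only illustrative):

```python
fA = "John Cohen"
fB = "Jackie Cohen SRL"

# str.split() with no argument splits on any run of whitespace
words_A = fA.split()
words_B = fB.split()

print(len(words_A))  # 2
print(len(words_B))  # 3
```

Take the counts from the lists rather than from the sets, since a set would drop any repeated word.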


Then you can quickly get the common elements, the differences, and so on; see the Python documentation on sets for all the available operations.


Some of the things you want could be accomplished this way:


fA = "John Cohen"
fB = "Jackie Cohen SRL"

set_A = set(fA.split())
# {'Cohen', 'John'}
set_B = set(fB.split())
# {'Cohen', 'Jackie', 'SRL'}

if len(set_A) == 1 and set_A == set_B:
    print("Both strings are {}".format(fA))

# set_A included in (or equal to) set_B?
if set_A <= set_B:
    print("{} included in {}".format(set_A, set_B))

# common elements
print(set_A & set_B)
# {'Cohen'}

# in set_B but not in set_A
print(set_B - set_A)
# {'SRL', 'Jackie'}

# in set_A or set_B, but not in both
print(set_A ^ set_B)
# {'John', 'SRL', 'Jackie'}

You could print the elements of a set like this:


print('/'.join(set_A ^ set_B))
# John/SRL/Jackie

but note that the order is arbitrary, since sets are unordered. You could also turn a set into a list with list(set_A) and sort it if you need a stable order; sorted(set_A) does both in one step.
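Putting the pieces together, here is a rough sketch of a helper that reports the word counts, whether the first field is contained in the second, and the differences in a stable order. The function name compare_fields and the dictionary keys are made up for this illustration, and the comparison is case-sensitive and whitespace-based, which may or may not fit your data:

```python
def compare_fields(a, b):
    """Compare two text fields word by word (case-sensitive)."""
    words_a, words_b = a.split(), b.split()
    set_a, set_b = set(words_a), set(words_b)
    return {
        "count_a": len(words_a),
        "count_b": len(words_b),
        "same": set_a == set_b,              # same words, order ignored
        "a_in_b": set_a <= set_b,            # every word of a appears in b
        "common": sorted(set_a & set_b),
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
    }


result = compare_fields("John Cohen", "Jackie Cohen SRL")
print(result["count_a"], result["count_b"])  # 2 3
print(result["a_in_b"])                      # False
print(result["common"])                      # ['Cohen']
print(result["only_in_a"])                   # ['John']
print(result["only_in_b"])                   # ['Jackie', 'SRL']
```

If "John" and "john" should count as the same word, one option is to lowercase both strings before splitting.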



