Python 3 regex: how to match all Unicode alphabetic characters and spaces?

Posted: 2022-06-08 01:10:57

I am trying to validate place names in Python 3 / Django forms. I want to match strings like Los Angeles, Canada, 中国, and Россия. That is, the string contains:

  • spaces
  • alphabetic characters (from any language)
  • no numbers
  • no special characters (punctuation, symbols etc.)

The pattern I am currently using is r'^[^\W\d]+$', as suggested in this Stack Overflow question. However, it only seems to behave like the pattern r'^[a-zA-Z]+$'. That is, Россия, Los Angeles, and 中国 do not match; only Canada does.

An example of my code:

import re
re.search(r'^[^\W\d]+$', 'Россия')

This returns None (no match).
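One thing that may be worth checking, offered as a guess rather than a diagnosis of this setup: in Python 3, `\w` and `\d` in a `str` pattern are Unicode-aware by default, but the `re.ASCII` flag (or a bytes pattern) restricts them to ASCII, which would give exactly the ASCII-only behavior described. A minimal sketch comparing the two modes, where the helper name `matches` is made up for illustration:

```python
import re

# The pattern from the question: word characters that are not digits.
PATTERN = r'^[^\W\d]+$'

def matches(text, flags=0):
    """Return True if the question's pattern matches text under the given flags."""
    return re.search(PATTERN, text, flags) is not None

# Print each name with its result in Unicode mode (default) and ASCII mode.
for name in ['Canada', 'Россия', '中国', 'Los Angeles']:
    print(name, matches(name), matches(name, re.ASCII))
```

In the default mode only 'Los Angeles' should fail, because the space is not a word character; with `re.ASCII` only 'Canada' should pass.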

1 answer

#1


Your example works for me, but it matches underscores and does not match spaces. This works:

>>> re.search(r'^(?:[^\W\d_]| )+$', 'Los Angeles')
<_sre.SRE_Match object at 0x0000000003C612A0>
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Россия')
<_sre.SRE_Match object at 0x0000000003A0D030>
>>> re.search(r'^(?:[^\W\d_]| )+$', 'Los_Angeles') # not found
>>> re.search(r'^(?:[^\W\d_]| )+$', '中国')
<_sre.SRE_Match object at 0x0000000003C612A0>
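For use in form validation, the pattern above could be wrapped in a small helper. This is only a sketch: the names `PLACE_NAME_RE` and `is_valid_place_name` are hypothetical, and it swaps `re.search` with `^...$` for `re.fullmatch`, which is a change from the answer's code.

```python
import re

# [^\W\d_] = word characters minus decimal digits and the underscore;
# the alternative branch allows a plain space.
PLACE_NAME_RE = re.compile(r'(?:[^\W\d_]| )+')

def is_valid_place_name(name):
    """Return True if name contains only non-digit, non-underscore
    word characters and spaces (hypothetical helper)."""
    return PLACE_NAME_RE.fullmatch(name) is not None

for candidate in ['Los Angeles', 'Россия', '中国', 'Los_Angeles', 'Area 51']:
    print(candidate, is_valid_place_name(candidate))
```

`fullmatch` is used because `$` also matches just before a trailing newline, so `^...$` with `re.search` would accept 'Canada\n'.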
