Python regex for stock option symbols doesn't match

Time: 2021-04-19 12:38:29

I'm trying to create a regex to find option symbols in broker data. Per Wikipedia the format is:

  1. Root symbol of the underlying stock or ETF, padded with spaces to 6 characters
  2. Expiration date, 6 digits in the format yymmdd
  3. Option type, either P or C, for put or call
  4. Strike price, as the price x 1000, front padded with 0s to 8 digits

So I created this regex:

option_regex = re.compile(r'''(
(\w{1,6})            # beginning ticker, 1 to 6 word characters
(\s)?                # optional separator
(\d{6})              # 6 digits for yymmdd
([cp])               # C or P for call or put
(\d{8})              # 8 digits for strike price
)''', re.VERBOSE | re.IGNORECASE)

But when I test it out I get an error:

import re

option_regex = re.compile(r'''(
(\w{1,6})            # beginning ticker, 1 to 6 word characters
(\s)?                # optional separator
(\d{6})              # 6 digits for yymmdd
([cp])               # C or P for call or put
(\d{8})              # 8 digits for strike price
)''', re.VERBOSE | re.IGNORECASE)

result = option_regex.search('AAPL  170818C00155000')

result.group()
Traceback (most recent call last):

  File "<ipython-input-4-0273c989d990>", line 1, in <module>
    result.group()

AttributeError: 'NoneType' object has no attribute 'group'

1 Solution

#1


From the Python documentation on re.search():

Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.

Your code raises this exception because the search found no match, so result is None and you are effectively calling .group() on None. It is a good idea to guard against that:

if result is None:
    ...  # pattern didn't match the string; handle that case here
else:
    print(result.group())
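
As a minimal sketch of that guard, the search can be wrapped in a small function that returns None instead of raising. The helper name find_option_symbol is hypothetical, and the pattern is the unmodified one from the question, so the two-space input still fails to match:

```python
import re

# The question's original pattern, written on one line (assumed equivalent).
OPTION_REGEX = re.compile(r'(\w{1,6})(\s)?(\d{6})([cp])(\d{8})', re.IGNORECASE)

def find_option_symbol(text):
    """Return the matched option symbol, or None when nothing matches."""
    result = OPTION_REGEX.search(text)
    if result is None:
        # search() found no match, so there is no match object to query.
        return None
    return result.group()
```

With this in place, the caller checks for None rather than catching AttributeError.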

Your pattern doesn't match the string because the separator is longer than the pattern allows: the string has two spaces, but (\s)? matches at most one whitespace character. You can fix that by adding a + ("one or more") to the rule:

(\s+)?                # optional separator
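
Putting the fix together, the following is one possible sketch rather than a definitive parser. It uses \s*, which accepts the same separators as (\s+)?, and the named groups, the parse_option helper, and the conversion of the strike price back to a float are assumptions added for illustration:

```python
import re

option_regex = re.compile(r'''
    (?P<root>\w{1,6})      # root ticker, 1 to 6 word characters
    \s*                    # zero or more padding spaces
    (?P<expiry>\d{6})      # expiration date, yymmdd
    (?P<kind>[cp])         # C or P for call or put
    (?P<strike>\d{8})      # strike price times 1000, zero-padded to 8 digits
''', re.VERBOSE | re.IGNORECASE)

def parse_option(text):
    """Return the symbol's fields as a dict, or None if there is no match."""
    match = option_regex.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # The symbol stores the strike as price x 1000, so divide to recover it.
    fields['strike'] = int(fields['strike']) / 1000
    return fields

print(parse_option('AAPL  170818C00155000'))
# {'root': 'AAPL', 'expiry': '170818', 'kind': 'C', 'strike': 155.0}
```

Note that \w also matches digits and underscores, so a stricter character class such as [A-Z] may be preferable if the broker data contains other alphanumeric tokens.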
