Extract the text inside brackets and store it in a dictionary

Time: 2022-09-13 13:39:54

I am trying to separate all the functions inside the square brackets and store them in a dictionary. However, the closing parenthesis is stripped from every result except the last one.

import re
line="[f(x,y),g(y,z),f1(x1,y1)]"
matches = re.match(r"(.*)(\[)(.*)(\])(.*)", line)
if matches:
    all_action_labels = matches.group(3)
    sep_action_labels = re.split(r'\),',all_action_labels)
    j=0
    for x in sep_action_labels:
        print(f'Function #{j+1} : {x}')

As you can see, every output except the last one is missing the closing parenthesis ')':

Function #1 : f(x,y
Function #1 : g(y,z
Function #1 : f1(x1,y1)

What regular expression should I use?

Also, how can I store these outputs in a dictionary?

2 solutions

#1


If you're not required to use regular expressions, this might be easier. It is easy to follow: it walks through the string and collects the function strings into a list, and it keeps track of parentheses so that functions with multiple commas are handled correctly.

def getFuncList(line):
  """
  Assumes the input is comma-separated and opens and closes with square brackets
  """
  line = line[1:-1] # strip square brackets
  funcs = []

  current = ""
  brack_stack = 0 # we don't want to split on commas that are inside a function call
  for char in line:
    if char == "(":
      brack_stack += 1 
    elif char == ")":
      brack_stack -= 1 

    if char == "," and brack_stack == 0:
      # new function, clear current and append to list
      funcs.append(current)
      current = ""
    else:
      current += char
  funcs.append(current)
  return funcs


line="[f(x,y),g(y,z),f1(x1,y1)]"
func_list = getFuncList(line)
print({f"Function {i}": func for i, func in enumerate(func_list, 1)}) # make and print the dictionary
# {'Function 1': 'f(x,y)', 'Function 2': 'g(y,z)', 'Function 3': 'f1(x1,y1)'}
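
The claim that commas inside a function are left alone can be checked with a nested call. The following is only a minimal sketch of the same depth-counting idea; the helper name split_top_level and the nested sample string are assumptions made for illustration and are not part of the original answer.

```python
def split_top_level(text):
    """Split text on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for char in text:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        if char == "," and depth == 0:
            # A top-level comma ends the current function string.
            parts.append("".join(current))
            current = []
        else:
            current.append(char)
    parts.append("".join(current))
    return parts


nested = "[f(g(x,y),z),h(a)]"            # hypothetical input with a nested call
labels = split_top_level(nested[1:-1])   # strip the square brackets first
functions = {f"Function {i}": label for i, label in enumerate(labels, 1)}
print(functions)
# {'Function 1': 'f(g(x,y),z)', 'Function 2': 'h(a)'}
```

Only the comma seen at depth 0 triggers a split, so the commas inside g(x,y) and inside the outer f(...) stay in the first entry.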

#2


My general rule for extracting data is to call re.findall() with fairly simple regular expressions.

Perhaps this meets your needs:

import re
line="[f(x,y),g(y,z),f1(x1,y1)]"
all_action_labels = re.findall(r"\[(.*?)]", line)
for all_action_label in all_action_labels:
    sep_action_labels = re.findall(r"[a-z0-9]+\(.*?\)", all_action_label)
    for j, x in enumerate(sep_action_labels, 1):
        print(f'Function #{j} : {x}')

I use one simple regular expression to extract the data inside [] and another to extract the individual function calls.
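
For the dictionary part of the question, one possible sketch is shown here. It also shows why the original re.split(r'\),', ...) lost the parenthesis: the ")" was part of the delimiter and was consumed by the split. A lookbehind keeps it. The variable names inner, labels and functions are assumptions for illustration, and this simple pattern would split in the wrong place on nested calls such as f(g(x),y).

```python
import re

line = "[f(x,y),g(y,z),f1(x1,y1)]"

# Grab the text between the square brackets.
inner = re.search(r"\[(.*?)\]", line).group(1)

# Split on commas that directly follow ")", without consuming the ")".
labels = re.split(r"(?<=\)),", inner)

# Number the results from 1 and store them in a dictionary.
functions = {f"Function {j}": label for j, label in enumerate(labels, 1)}
print(functions)
# {'Function 1': 'f(x,y)', 'Function 2': 'g(y,z)', 'Function 3': 'f1(x1,y1)'}
```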
