Finding C functions in a file using a Python regex

Time: 2022-09-13 08:30:51

I am trying to get a Python regex to search through a .c file and get the function(s) inside it.

For example:

int blahblah(
  struct _reent *ptr __attribute__((unused)),
  const char    *old,
  const char    *new
)
{
...

I want to get blahblah as the function name.

This regex doesn't work for me; it keeps giving me None: r"([a-zA-Z0-9]*)\s*\([^()]*\)\s*{"

3 answers

#1


3  

(?<=(int\s)|(void\s)|(string\s)|(double\s)|(float\s)|(char\s)).*?(?=\s?\()

http://regexr.com?3332t

This should work for what you want. Just keep adding types that you need to catch.

re.findall(r'(?<=(?<=int\s)|(?<=void\s)|(?<=string\s)|(?<=double\s)|(?<=float\s)|(?<=char\s)).*?(?=\s?\()', string) will work for Python.
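
A minimal sketch of how that call behaves on the example from the question. The SAMPLE text (including its one-line body) and the name find_typed_names are made up for illustration; the pattern itself is the one from this answer, only split across two string literals for readability.

```python
import re

# Pattern from the answer: a match must start right after one of the listed
# type keywords plus one whitespace character, and it stops just before "(".
TYPED_NAME = re.compile(
    r'(?<=(?<=int\s)|(?<=void\s)|(?<=string\s)|(?<=double\s)'
    r'|(?<=float\s)|(?<=char\s)).*?(?=\s?\()'
)

# The function from the question, with a hypothetical body filled in.
SAMPLE = """int blahblah(
  struct _reent *ptr __attribute__((unused)),
  const char    *old,
  const char    *new
)
{
  return 0;
}
"""


def find_typed_names(source):
    """Return every piece of text found between a listed type and a '('."""
    return TYPED_NAME.findall(source)


print(find_typed_names(SAMPLE))
```

Because the pattern only looks at the type keyword on the left and the "(" on the right, a declaration such as int x = f(1); would also produce a match, and functions returning pointers, structs, or typedef'd types are missed until those types are added.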

#2


3  

The regular expression isn't catching it because of the parentheses in the arguments (specifically, the parentheses in __attribute__((unused))). You might be able to adapt the regular expression for this case, but in general, regular expressions cannot parse languages like C. You may want to use a full-fledged parser like pycparser.
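
One way the original regex might be adapted for this particular case is sketched below. It is only an approximation: the names ARGS, FUNC_DEF, CONTROL_KEYWORDS, and find_function_names are invented for the example, nesting inside the argument list is supported to a fixed depth of two (enough for __attribute__((unused))), and comments, strings, and macros are not handled at all.

```python
import re

# An argument list: "(" ... ")" whose contents are either non-parenthesis
# characters or parenthesized groups nested at most two levels deep.
ARGS = r"\((?:[^()]|\((?:[^()]|\([^()]*\))*\))*\)"

# An identifier (underscores allowed), the argument list, then the opening
# brace of the body. Requiring "{" skips prototypes and plain calls.
FUNC_DEF = re.compile(r"\b([A-Za-z_]\w*)\s*" + ARGS + r"\s*\{")

# Statements that look like "name (...) {" but are not function definitions.
CONTROL_KEYWORDS = {"if", "for", "while", "switch"}

# The function from the question, with a hypothetical body filled in.
SAMPLE = """int blahblah(
  struct _reent *ptr __attribute__((unused)),
  const char    *old,
  const char    *new
)
{
  if (old) { return 1; }
  return 0;
}
"""


def find_function_names(source):
    """Return names that look like C function definitions in `source`."""
    return [
        name
        for name in FUNC_DEF.findall(source)
        if name not in CONTROL_KEYWORDS
    ]


print(find_function_names(SAMPLE))
```

The original pattern fails on the example because [^()]* cannot cross the inner parentheses of the attribute. Spelling out the nesting by hand works only up to the depth written into the pattern, which is the practical form of the limitation described above.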

#3


0  

Regexps are not a proper tool for extracting semantic information from source code files (though they're good for syntax highlighting, because syntax is often expressed through regular expressions). Regexps can't handle nested constructions, track what is going on, or distinguish types and symbols.

I'd recommend some specialized tool that is really aware of the language structure, like ctags or python-pygccxml.

ctags is a program that generates a list of the entities in a C source file together with their locations (it is used to assist navigation through C code bases in text editors like vi and emacs). python-pygccxml is a Python binding to the C library libgccxml, which uses gcc internals to analyze the code and produces rich, structured output about program semantics.
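
A rough sketch of driving ctags from Python. It assumes an Exuberant or Universal ctags binary on the PATH that accepts -x (human-readable cross-reference output) and --c-kinds=f (functions only), and it assumes each output row has the columns name, kind, line number, file, and source text. The helper names parse_ctags_xref and list_c_functions are made up for this example, and file names containing spaces are not handled.

```python
import subprocess


def parse_ctags_xref(output):
    """Return (name, line_number) pairs for the 'function' rows of `ctags -x` output."""
    functions = []
    for row in output.splitlines():
        # Assumed columns: name, kind, line number, file, source text.
        fields = row.split(None, 4)
        if len(fields) >= 4 and fields[1] == "function" and fields[2].isdigit():
            functions.append((fields[0], int(fields[2])))
    return functions


def list_c_functions(path):
    """Run ctags on `path` and return the functions it reports.

    Assumes a ctags binary on PATH that supports -x and --c-kinds=f.
    """
    result = subprocess.run(
        ["ctags", "-x", "--c-kinds=f", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ctags_xref(result.stdout)
```

Calling list_c_functions("example.c") on a file containing the function from the question should then report blahblah together with its line number, provided the installed ctags matches the assumptions above.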
