Using character literals as terminals in bison

Time: 2022-08-16 19:30:10

I'm trying to understand flex/bison, but the documentation is a bit difficult for me, and I've probably grossly misunderstood something. Here's a test case: http://namakajiri.net/misc/bison_charlit_test/

File "a" contains the single character 'a'. "foo.y" has a trivial grammar like this:

%%

file: 'a' ;

The generated parser can't parse file "a"; it gives a syntax error.

The grammar "bar.y" is almost the same, only I changed the character literal for a named token:

%token TOK_A;

%%

file: TOK_A;

and then in bar.lex:

a       { return TOK_A; }

This one works just fine.

What am I doing wrong in trying to use character literals directly as bison terminals, like in the docs?

I'd like my grammar to look like "statement: selector '{' property ':' value ';' '}'" and not "statement: selector LBRACE property COLON value SEMIC RBRACE"...

I'm running bison 2.5 and flex 2.5.35 in debian wheezy.

1 solution

#1


3  

Rewrite

The problem is a runtime problem, not a compile time problem.

The trouble is that you have two radically different lexical analyzers.

The bar.lex analyzer recognizes an a in the input and returns it as a TOK_A and ignores everything else.

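The foo.lex analyzer echoes every single character to standard output, but that's all. It never returns a token, so the parser sees only end of input where it expects 'a' and reports a syntax error.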

foo.lex — as written

%{
#include "foo.tab.h"
%}

%%

foo.lex — equivalent

%{
#include "foo.tab.h"
%}

%%
. { ECHO; }

foo.lex — required

%{
#include "foo.tab.h"
%}

%%
. { return *yytext; }
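
To make the contract behind `return *yytext;` concrete, here is a minimal sketch in plain C of a hand-written stand-in for yylex(). It is not the code flex generates. The names `next_token` and `TOK_IDENT`, the value 258, and the whitespace skipping are assumptions made for illustration. The point it shows is that a character literal such as '{' in a bison grammar stands for the character's own code, while named tokens get codes above 255, so the two cannot collide.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical code for a named token. Bison numbers named tokens above
   255 so they never clash with character codes; 258 is an assumed value. */
enum { TOK_IDENT = 258 };

/* Hand-written stand-in for yylex(): scans from *cursor and advances it.
   Returns 0 at end of input, TOK_IDENT for a run of letters, and the
   character's own code for any other character. */
int next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Assumption: whitespace is skipped rather than returned. */
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;

    if (*p == '\0') {
        *cursor = p;
        return 0;                 /* 0 tells the parser the input has ended */
    }
    if (isalpha((unsigned char)*p)) {
        while (isalpha((unsigned char)*p))
            p++;
        *cursor = p;
        return TOK_IDENT;         /* named token, like TOK_A in bar.y */
    }
    *cursor = p + 1;
    return (unsigned char)*p;     /* matches a literal such as '{' or ':' */
}
```

Called repeatedly on the text "a { b: c; }", this returns TOK_IDENT, '{', TOK_IDENT, ':', TOK_IDENT, ';', '}' and then 0, which is the token stream a rule like "statement: selector '{' property ':' value ';' '}'" would expect if selector, property and value each reduce from an identifier.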

Working code

Here's some working code with diagnostic printing in place.

foo-lex.l

%%
. { printf("Flex: %d\n", *yytext); return *yytext; }

foo.y

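%{
#include <stdio.h>
int yylex(void);              /* provided by the flex-generated lex.yy.c */
void yyerror(const char *s);
%}

%%

file: 'a' { printf("Bison: got file!\n"); }
    ;

%%

int main(void)
{
    return yyparse();         /* 0 on success, nonzero on failure */
}

void yyerror(const char *s)
{
    fprintf(stderr, "%s\n", s);
}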

Compilation and execution

$ flex foo-lex.l
$ bison foo.y
$ gcc -o foo foo.tab.c lex.yy.c -lfl
$ echo a | ./foo
Flex: 97
Bison: got file!

$

Point of detail: how did that blank line get into the output? Answer: the lexical analyzer put it there. The pattern . does not match a newline, so the newline was treated as if there was a rule:

\n    { ECHO; }

This is why the input was accepted: the newline was echoed instead of being returned to the parser, so the parser saw only 'a' followed by end of input. If you change the foo-lex.l file to:

%%
.       { printf("Flex-1: %d\n", *yytext); return *yytext; }
\n      { printf("Flex-2: %d\n", *yytext); return *yytext; }

and then recompile and run again, the output is:

$ echo a | ./foo
Flex-1: 97
Bison: got file!
Flex-2: 10
syntax error
$

with no blank lines. This is because the grammar doesn't allow a newline to appear in a valid 'file'.
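
The difference between the two runs can be reproduced without flex or bison. The following is only a sketch in plain C, assuming the generated parser for "file: 'a' ;" behaves like the hand-written check below. `accepts_file` and `tokenize` are hypothetical names, not part of either tool. With `skip_newlines` set, the newline never reaches the parser, as in the first run where the default rule echoed it. With it clear, the newline arrives as a token, as in the second run.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written stand-in for the parser generated from "file: 'a' ;".
   tokens is a 0-terminated sequence of token codes as yylex() would
   return them. Returns 1 if the sequence is accepted, 0 otherwise. */
int accepts_file(const int *tokens)
{
    assert(tokens != NULL);
    return tokens[0] == 'a' && tokens[1] == 0;
}

/* Hypothetical lexer: turns text into token codes, one per character.
   If skip_newlines is nonzero, '\n' is discarded and never reaches the
   parser; otherwise it is returned as a token. Writes at most cap - 1
   codes plus the terminating 0 and returns the number of codes written. */
size_t tokenize(const char *text, int skip_newlines, int *out, size_t cap)
{
    size_t n = 0;
    assert(text != NULL && out != NULL && cap > 0);
    for (; *text != '\0' && n + 1 < cap; text++) {
        if (*text == '\n' && skip_newlines)
            continue;
        out[n++] = (unsigned char)*text;
    }
    out[n] = 0;                   /* end-of-input marker */
    return n;
}
```

Tokenizing "a\n" with `skip_newlines` set yields 'a' followed by 0, which `accepts_file` accepts. With it clear, the sequence is 'a', '\n', 0, which is rejected. To accept a trailing newline in the real program, either keep the newline out of the token stream in the lexer or extend the grammar to allow it.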
