从堆栈创建抽象语法树

时间:2023-01-25 09:23:49

I'm working on a project where I need to do evaluate a reverse polish notation expression or convert the rpn expression to infix notation. I am doing this by pushing all of the elements of the expression onto a stack then popping each element from the stack and inserting it into an abstract syntax tree. From there I will traverse the tree to complete the evaluate and convert operations. Here is what I have so far

我正在开展一个项目,我需要做一个评估反向抛光表示法表达式或将rpn表达式转换为中缀表示法的项目。我这样做是通过将表达式的所有元素推送到堆栈然后从堆栈中弹出每个元素并将其插入到抽象语法树中。从那里我将遍历树以完成评估和转换操作。这是我到目前为止所拥有的

#include <stdlib.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

struct snode 
{
  char datum;
  struct snode* bottom;
};

struct tnode
{
  char datum;
  struct tnode* left;
  struct tnode*right;
};

struct snode* 
push(struct snode* stack, char x) {
  struct snode *S = (struct snode*)malloc(sizeof(struct snode));
  S->datum = x;
  S->bottom = stack;
  return S;
}

struct snode* 
pop(struct snode* stack) {
  struct snode *S;
  if (stack == NULL)
    return NULL;
  S = stack->bottom;
  free(stack);
  return S;
}

char
peek(struct snode* stack){
  return stack->datum;
}


struct tnode*
create_node(char x){
  struct tnode* tmp;
  tmp = (struct tnode*)malloc(sizeof(struct tnode));
  tmp->datum = x;
  tmp->right = NULL;
  tmp->left = NULL;
  return tmp;
}

void
print_table(struct tnode *AST){
  if(AST !=NULL){
    print_table(AST->left);
    printf("%c ", AST->datum);
    print_table(AST->right);
  }
}



struct tnode*
build_tree(struct snode *S)
{

  struct tnode* root;
  if (S == NULL)
    return NULL;

  char top = peek(S);

  if (top == 'A' || top == 'S' || top == 'X' || top == 'D' || top == 'M')
    {
      root = create_node(top);
      S = pop(S); 
      root->right = build_tree(S);
      S = pop(S);
      root->left = build_tree(S);
      return root;
    } 

  root= create_node(top);

  return root;
}


int
main(int argc, const char *argv[])
{

  int i = 1;
  struct tnode *tree = NULL;
  struct snode *stack = NULL;

  char value;
  while (argv[i]!= NULL)
    {
      value = argv[i][0];
      stack = push(stack, value);
      i++;
    }


  tree =  build_tree(stack);
  print_table(tree);
  printf("\n");

  return EXIT_SUCCESS;
}

My problem is that this code only doesnt work for every rpn expression. For Example:

我的问题是这段代码不适用于每个rpn表达式。例如:

./project 7 4 A 3 S

outputs

7 A 4 S 3

which is correct. but

哪个是对的。但

./project 1 2 X 3 4 X A  

outputs

A 3 X 4

which is clearly not correct. Im pretty sure my problem is in my build_tree function, but I have no idea on how I could do a build_tree in a different way. Also, when I have an expression such as

这显然是不正确的。我很确定我的问题出在我的build_tree函数中,但我不知道如何以不同的方式构建build_tree。此外,当我有一个表达,如

./project 12 3 D 2 D

how come the output is

为什么输出是

1 D 3 D 2

instead of

12 D 3 D 2

Thank you for all help. Any input on how I could change my code to make the rest of the project simpler is also appreciated!

谢谢你的帮助。关于如何更改代码以使项目的其余部分更简单的任何输入也值得赞赏!

1 个解决方案

#1


The main problem is that the "updates" to the stack are not remembered when you descend more than one level in your build_tree() function. When you do for example root->right = build_tree(S); and you pop the stack in context of that call to build_tree(), the stack hasn't actually been popped in context of the caller.

主要问题是当你在build_tree()函数中下降多个级别时,不会记住堆栈的“更新”。例如root-> right = build_tree(S);并且在调用build_tree()的上下文中弹出堆栈时,堆栈实际上并未在调用者的上下文中弹出。

I would suggest that you alter your functions according to something like this:

我建议你改变你的功能,如下所示:

void
pop(struct snode **stack) {
  struct snode *S;
  if (*stack == NULL)
    return;

  S = (*stack)->bottom;
  free(*stack);
  *stack = S;
}


struct tnode*
build_tree(struct snode **S)
{

  struct tnode* root;
  if (*S == NULL)
    return NULL;

  char top = peek(*S);

  if (top == 'A' || top == 'S' || top == 'X' || top == 'D' || top == 'M')
    {
      root = create_node(top);
      pop(S); 
      root->right = build_tree(S);
      pop(S);
      root->left = build_tree(S);
      return root;
    } 

  root = create_node(top);

  return root;
}

int
main(int argc, const char *argv[])
{
  ...

  tree =  build_tree(&stack);

  ...
}

The difference becomes more obvious if you add a printf() call to peek() where you output the datum of the visited stack node.

如果向peek()添加printf()调用,输出被访问堆栈节点的数据,则差异会变得更加明显。

For your other error, where ./project 12 3 D 2 D doesn't work, and 12 is replaced with 1, I would suggest that you use the atoi() function to convert a string to an integer, instead of just value = argv[i][0]. You could test if value is a digit, and if so do value = atoi(argv[i]) afterwards.

对于你的其他错误,./ project 12 3 D 2 D不起作用,12替换为1,我建议你使用atoi()函数将字符串转换为整数,而不仅仅是value =的argv [I] [0]。您可以测试值是否为数字,如果是,则执行value = atoi(argv [i])。

In addition, as already commented on by Nikolai, you need to define operator precedence rules, and add parentheses as necessary. The simplest approach would be to add a left parenthesis before each time you go left, and add a right parenthesis after each time you return from the right.

此外,正如Nikolai已经评论过的那样,您需要定义运算符优先级规则,并根据需要添加括号。最简单的方法是在每次离开前添加左括号,并在每次从右边返回后添加右括号。

Edit

Also note that 1 2 X 3 4 X A * isn't valid rpn, whereas 1 2 X 3 4 X A is (because the extra * has no matching right operator).

另请注意,1 2 X 3 4 X A *无效rpn,而1 2 X 3 4 X A(因为额外*没有匹配的右运算符)。

Edit 2

A solution for supporting multi-digit numbers could be to use strings instead of characters to represent all tokens. You would need to change all occurences of char with char * in:

支持多位数字的解决方案可以是使用字符串而不是字符来表示所有令牌。您需要使用char *更改所有出现的char:

  • The definitions of struct snode and struct tnode
  • struct snode和struct tnode的定义

  • The argument to push() and create_node()
  • push()和create_node()的参数

  • The return value from peek().
  • peek()的返回值。

  • The variable value in main() and top in build_tree().
  • main()中的变量值和build_tree()中的top。

In addition, you would need to change value = argv[i][0] to simply value = argv[i] (I would also suggest removing the const qualifier from the argv argument to main(), unless you want to be const-correct). You would also need to change the operator equality check. For example, you could do:

另外,你需要将value = argv [i] [0]更改为value = argv [i](我还建议将const限定符从argv参数中删除到main(),除非你想成为const-正确)。您还需要更改运算符等式检查。例如,您可以这样做:

int
is_operator(char *tok)
{
  return !strcmp(tok, "A") || !strcmp(tok, "S") || !strcmp(tok, "X") || !strcmp(tok, "D") || !strcmp(tok, "M");
}

and then simply write if (is_operator(top)) in build_tree().

然后简单地在build_tree()中写if(is_operator(top))。

Finally, you would need to change the printf() format string from %c to %s.

最后,您需要将printf()格式字符串从%c更改为%s。

#1


The main problem is that the "updates" to the stack are not remembered when you descend more than one level in your build_tree() function. When you do for example root->right = build_tree(S); and you pop the stack in context of that call to build_tree(), the stack hasn't actually been popped in context of the caller.

主要问题是当你在build_tree()函数中下降多个级别时,不会记住堆栈的“更新”。例如root-> right = build_tree(S);并且在调用build_tree()的上下文中弹出堆栈时,堆栈实际上并未在调用者的上下文中弹出。

I would suggest that you alter your functions according to something like this:

我建议你改变你的功能,如下所示:

void
pop(struct snode **stack) {
  struct snode *S;
  if (*stack == NULL)
    return;

  S = (*stack)->bottom;
  free(*stack);
  *stack = S;
}


struct tnode*
build_tree(struct snode **S)
{

  struct tnode* root;
  if (*S == NULL)
    return NULL;

  char top = peek(*S);

  if (top == 'A' || top == 'S' || top == 'X' || top == 'D' || top == 'M')
    {
      root = create_node(top);
      pop(S); 
      root->right = build_tree(S);
      pop(S);
      root->left = build_tree(S);
      return root;
    } 

  root = create_node(top);

  return root;
}

int
main(int argc, const char *argv[])
{
  ...

  tree =  build_tree(&stack);

  ...
}

The difference becomes more obvious if you add a printf() call to peek() where you output the datum of the visited stack node.

如果向peek()添加printf()调用,输出被访问堆栈节点的数据,则差异会变得更加明显。

For your other error, where ./project 12 3 D 2 D doesn't work, and 12 is replaced with 1, I would suggest that you use the atoi() function to convert a string to an integer, instead of just value = argv[i][0]. You could test if value is a digit, and if so do value = atoi(argv[i]) afterwards.

对于你的其他错误,./ project 12 3 D 2 D不起作用,12替换为1,我建议你使用atoi()函数将字符串转换为整数,而不仅仅是value =的argv [I] [0]。您可以测试值是否为数字,如果是,则执行value = atoi(argv [i])。

In addition, as already commented on by Nikolai, you need to define operator precedence rules, and add parentheses as necessary. The simplest approach would be to add a left parenthesis before each time you go left, and add a right parenthesis after each time you return from the right.

此外,正如Nikolai已经评论过的那样,您需要定义运算符优先级规则,并根据需要添加括号。最简单的方法是在每次离开前添加左括号,并在每次从右边返回后添加右括号。

Edit

Also note that 1 2 X 3 4 X A * isn't valid rpn, whereas 1 2 X 3 4 X A is (because the extra * has no matching right operator).

另请注意,1 2 X 3 4 X A *无效rpn,而1 2 X 3 4 X A(因为额外*没有匹配的右运算符)。

Edit 2

A solution for supporting multi-digit numbers could be to use strings instead of characters to represent all tokens. You would need to change all occurences of char with char * in:

支持多位数字的解决方案可以是使用字符串而不是字符来表示所有令牌。您需要使用char *更改所有出现的char:

  • The definitions of struct snode and struct tnode
  • struct snode和struct tnode的定义

  • The argument to push() and create_node()
  • push()和create_node()的参数

  • The return value from peek().
  • peek()的返回值。

  • The variable value in main() and top in build_tree().
  • main()中的变量值和build_tree()中的top。

In addition, you would need to change value = argv[i][0] to simply value = argv[i] (I would also suggest removing the const qualifier from the argv argument to main(), unless you want to be const-correct). You would also need to change the operator equality check. For example, you could do:

另外,你需要将value = argv [i] [0]更改为value = argv [i](我还建议将const限定符从argv参数中删除到main(),除非你想成为const-正确)。您还需要更改运算符等式检查。例如,您可以这样做:

int
is_operator(char *tok)
{
  return !strcmp(tok, "A") || !strcmp(tok, "S") || !strcmp(tok, "X") || !strcmp(tok, "D") || !strcmp(tok, "M");
}

and then simply write if (is_operator(top)) in build_tree().

然后简单地在build_tree()中写if(is_operator(top))。

Finally, you would need to change the printf() format string from %c to %s.

最后,您需要将printf()格式字符串从%c更改为%s。