在“gcc -O2”上优化为无限循环的函数

时间:2022-09-06 18:46:13

Context
I was asked the following puzzle by one of my friends:

我的一个朋友问我:

void fn(void)
{
  /* write something after this comment so that the program output is 10 */
  /* write something before this comment */
}

int main()
{
  int i = 5;
  fn();
  printf("%d\n", i);
  return 0;
}

I know there can be multiple solutions, some involving macro and some assuming something about the implementation and violating C.

我知道可以有多个解决方案,有些涉及宏,有些假设实现和违反C。

One particular solution I was interested in is to make certain assumptions about stack and write following code: (I understand it is undefined behavior, but may work as expected on many implementations)

我感兴趣的一个特别的解决方案是对堆栈进行某些假设,并编写以下代码:(我理解它是未定义的行为,但可能在许多实现中都可以工作)

void fn(void)
{
  /* write something after this comment so that the program output is 10 */
  int a[1] = {0};
  int j = 0;
  while(a[j] != 5) ++j;  /* Search stack until you find 5 */
  a[j] = 10;             /* Overwrite it with 10 */
  /* write something before this comment */
}

Problem
This program worked fine in MSVC and gcc without optimization. But when I compiled it with gcc -O2 flag or tried on ideone, it loops infinitely in function fn.

这个程序在MSVC和gcc中运行良好,没有优化。但是当我用gcc -O2标记编译或者在ideone上尝试时,它在函数fn中无限循环。

My Observation
When I compiled the file with gcc -S vs gcc -S -O2 and compared, it clearly shows gcc kept an infinite loop in function fn.

当我用gcc -S和gcc -S -O2编译这个文件并进行比较时,我的观察清楚地表明,在函数fn中gcc保持着一个无限循环。

Question
I understand because the code invokes undefined behavior, one can not call it a bug. But why and how does compiler analyze the behavior and leave an infinite loop at O2?

我理解这个问题,因为代码调用了未定义的行为,所以不能称之为bug。但是编译器为什么以及如何分析这种行为并在O2上留下一个无限循环呢?


Many people commented to know the behavior if some of the variables are changed to volatile. The result as expected is:

许多人评论说,如果某些变量变成了易失性,就会知道这种行为。如预期的结果是:

  • If i or j is changed to volatile, program behavior remains same.
  • 如果i或j被更改为volatile,则程序行为保持不变。
  • If array a is made volatile, program does not suffer infinite loop.
  • 如果数组a是可变的,程序就不会出现无限循环。
  • Moreover if I apply the following patch
  • 此外,如果我应用以下补丁
-  int a[1] = {0};
+  int aa[1] = {0};
+  int *a = aa;

The program behavior remains same (infinite loop)

程序行为保持不变(无限循环)

If I compile the code with gcc -O2 -fdump-tree-optimized, I get the following intermediate file:

如果我使用gcc -O2 -fdump-tree优化的代码编译,我得到以下中间文件:

;; Function fn (fn) (executed once)

Removing basic block 3
fn ()
{
<bb 2>:

<bb 3>:
  goto <bb 3>;

}



;; Function main (main) (executed once)

main ()
{
<bb 2>:
  fn ();

}
Invalid sum of incoming frequencies 0, should be 10000

This verifies the assertions made after the answers below.

这将验证在以下答案之后所做的断言。

3 个解决方案

#1


50  

This is undefined behavior so the compiler can really do anything at all, we can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:

这是未定义的行为,因此编译器实际上可以做任何事情,我们可以在GCC pre-4.8 break SPEC 2006基准中找到一个类似的例子,在这个例子中,GCC执行一个带有未定义行为的循环,并将其优化为:

L2:
    jmp .L2

The article says (emphasis mine):

这篇文章说(强调我的):

Of course this is an infinite loop. Since SATD() unconditionally executes undefined behavior (it’s a type 3 function), any translation (or none at all) is perfectly acceptable behavior for a correct C compiler. The undefined behavior is accessing d[16] just before exiting the loop. In C99 it is legal to create a pointer to an element one position past the end of the array, but that pointer must not be dereferenced. Similarly, the array cell one element past the end of the array must not be accessed.

当然这是一个无限循环。由于SATD()无条件地执行未定义的行为(它是一个类型3的函数),对于正确的C编译器来说,任何转换(或者根本没有)都是完全可以接受的行为。在退出循环之前,未定义的行为正在访问d[16]。在C99中,可以在数组结束后的一个位置创建一个指向元素的指针,但是不能取消对该指针的引用。类似地,数组单元格在数组结束后的一个元素不能被访问。

which if we examine your program with godbolt we see:

如果我们用godbolt检验你的程序

fn:
.L2:
    jmp .L2

The logic being used by the optimizer probably goes something like this:

优化器使用的逻辑可能是这样的:

  • All the elements of a are initialized to zero
  • a的所有元素都被初始化为零
  • a is never modified before or within the loop
  • a在循环之前或循环中都不会被修改
  • So a[j] != 5 is always true -> infinite loop
  • 所以a[j] != 5总是正确的->无限循环
  • Because of the infinite, the a[j] = 10; is unreachable and so that can be optimized away, so can a and j since they are no longer needed to determine the loop condition.
  • 因为无穷大,a[j] = 10;是不可达的,所以可以对其进行优化,a和j也可以,因为不再需要它们来确定循环条件。

which is similar to the case in the article which given:

这与文中给出的情况类似:

int d[16];

analyzes the following loop:

分析以下循环:

for (dd=d[k=0]; k<16; dd=d[++k]) 

like this:

是这样的:

upon seeing d[++k], is permitted to assume that the incremented value of k is within the array bounds, since otherwise undefined behavior occurs. For the code here, GCC can infer that k is in the range 0..15. A bit later, when GCC sees k<16, it says to itself: “Aha– that expression is always true, so we have an infinite loop.”

在看到d[++k]时,允许假设k的增量值在数组范围内,因为其他未定义的行为发生。对于这里的代码,GCC可以推断k在0..15范围内。稍后,当GCC看到k<16时,它对自己说:“啊哈——这个表达式总是正确的,所以我们有一个无限循环。”

Perhaps an interesting secondary point, is whether an infinite loop is considered observable behavior(w.r.t. to the as-if rule) or not, which effects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:

一个有趣的次要问题是,无限循环是否被认为是可观察的行为(w.r.t)。对于as-if规则)或not,它会影响一个无限循环是否也可以被优化。我们可以从C编译器中看到证明费马的最后一个定理在C11之前至少有一些解释的空间

Many knowledgeable people (including me) read this as saying that the termination behavior of a program must not be changed. Obviously some compiler writers disagree, or else don’t believe that it matters. The fact that reasonable people disagree on the interpretation would seem to indicate that the C standard is flawed.

许多有知识的人(包括我)读到这句话,认为程序的终止行为不能改变。显然,有些编译器作者不同意,或者不相信这一点。理性的人对这种解释持不同意见,这一事实似乎表明C标准是有缺陷的。

C11 adds clarification to section 6.8.5 Iteration statements and is covered in more detail in this answer.

C11对6.8.5迭代语句进行了澄清,并在此答案中详细介绍。

#2


20  

In the optimized version, the compiler has decided a few things:

在优化的版本中,编译器决定了一些事情:

  1. The array a doesn't change before that test.
  2. 数组a在测试之前不会改变。
  3. The array a doesn't contain a 5.
  4. 数组a不包含5。

Therefore, we can rewrite the code as:

因此,我们可以把代码重写为:

void fn(void) {
  int a[1] = {0};
  int j = 0;
  while(true) ++j;
  a[j] = 10;
}

Now, we can make further decisions:

现在,我们可以做进一步的决定:

  1. All the code after the while loop is dead code (unreachable).
  2. while循环之后的所有代码都是死代码(不可访问)。
  3. j is written but never read. So we can get rid of it.
  4. j是写的,但从不读。我们可以把它消掉。
  5. a is never read.
  6. 一个从来没有阅读。

At this point, your code has been reduced to:

此时,您的代码已简化为:

void fn(void) {
  int a[1] = {0};
  while(true);
}

And we can make the note that a is now never read, so let's get rid of it as well:

我们可以注意到a现在从来没有读过,所以我们也把它去掉:

void fn(void) {
  while(true);
}

Now, the unoptimized code:

In unoptimized generated code, the array will remain in memory. And you'll literally walk it at runtime. And it's possible that there will be a 5 thats readable after it once you walk past the end of the array.

在未优化的生成代码中,数组将保留在内存中。你会在运行时完成。当你走过数组的末尾时,有可能会有一个5是可读的。

Which is why the unoptimized version sometimes doesn't crash and burn.

这就是为什么没有优化的版本有时不会崩溃和烧毁。

#3


7  

If the loop does get optimized out into an infinite loop, it could be due to static code analyzis seeing that your array is

如果循环被优化成一个无限循环,这可能是由于静态代码分析发现你的数组是

  1. not volatile

    不挥发性

  2. contains only 0

    只包含0

  3. never gets written to

    不会被写入

and thus it is not possible for it to contain the number 5. Which means an infinite loop.

因此它不可能包含数字5。也就是无限循环。

Even if it didn't do this, your approach could fail easily. For example, it's possible that some compiler would optimize your code without making your loop infinite, but would stuff the contents of i into a register, making it unavailable from the stack.

即使它没有这样做,您的方法也很容易失败。例如,一些编译器可能会优化您的代码,而不会使您的循环成为无限的,但是会将i的内容填充到寄存器中,使它从堆栈中不可用。

As a side note, I bet what your friend actually expected was this:

顺便说一句,我敢打赌你朋友真正想要的是:

void fn(void)
{
  /* write something after this comment so that the program output is 10 */
  printf("10\n"); /* Output 10 */
  while(1); /* Endless loop, function won't return, i won't be output */
  /* write something before this comment */
}

or this (if stdlib.h is included):

或(如果stdlib。h是包括):

void fn(void)
{
  /* write something after this comment so that the program output is 10 */
  printf("10\n"); /* Output 10 */
  exit(0); /* Exit gracefully */
  /* write something before this comment */
}

#1


50  

This is undefined behavior so the compiler can really do anything at all, we can find a similar example in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks, where gcc takes a loop with undefined behavior and optimizes it to:

这是未定义的行为,因此编译器实际上可以做任何事情,我们可以在GCC pre-4.8 break SPEC 2006基准中找到一个类似的例子,在这个例子中,GCC执行一个带有未定义行为的循环,并将其优化为:

L2:
    jmp .L2

The article says (emphasis mine):

这篇文章说(强调我的):

Of course this is an infinite loop. Since SATD() unconditionally executes undefined behavior (it’s a type 3 function), any translation (or none at all) is perfectly acceptable behavior for a correct C compiler. The undefined behavior is accessing d[16] just before exiting the loop. In C99 it is legal to create a pointer to an element one position past the end of the array, but that pointer must not be dereferenced. Similarly, the array cell one element past the end of the array must not be accessed.

当然这是一个无限循环。由于SATD()无条件地执行未定义的行为(它是一个类型3的函数),对于正确的C编译器来说,任何转换(或者根本没有)都是完全可以接受的行为。在退出循环之前,未定义的行为正在访问d[16]。在C99中,可以在数组结束后的一个位置创建一个指向元素的指针,但是不能取消对该指针的引用。类似地,数组单元格在数组结束后的一个元素不能被访问。

which if we examine your program with godbolt we see:

如果我们用godbolt检验你的程序

fn:
.L2:
    jmp .L2

The logic being used by the optimizer probably goes something like this:

优化器使用的逻辑可能是这样的:

  • All the elements of a are initialized to zero
  • a的所有元素都被初始化为零
  • a is never modified before or within the loop
  • a在循环之前或循环中都不会被修改
  • So a[j] != 5 is always true -> infinite loop
  • 所以a[j] != 5总是正确的->无限循环
  • Because of the infinite, the a[j] = 10; is unreachable and so that can be optimized away, so can a and j since they are no longer needed to determine the loop condition.
  • 因为无穷大,a[j] = 10;是不可达的,所以可以对其进行优化,a和j也可以,因为不再需要它们来确定循环条件。

which is similar to the case in the article which given:

这与文中给出的情况类似:

int d[16];

analyzes the following loop:

分析以下循环:

for (dd=d[k=0]; k<16; dd=d[++k]) 

like this:

是这样的:

upon seeing d[++k], is permitted to assume that the incremented value of k is within the array bounds, since otherwise undefined behavior occurs. For the code here, GCC can infer that k is in the range 0..15. A bit later, when GCC sees k<16, it says to itself: “Aha– that expression is always true, so we have an infinite loop.”

在看到d[++k]时,允许假设k的增量值在数组范围内,因为其他未定义的行为发生。对于这里的代码,GCC可以推断k在0..15范围内。稍后,当GCC看到k<16时,它对自己说:“啊哈——这个表达式总是正确的,所以我们有一个无限循环。”

Perhaps an interesting secondary point, is whether an infinite loop is considered observable behavior(w.r.t. to the as-if rule) or not, which effects whether an infinite loop can also be optimized away. We can see from C Compilers Disprove Fermat’s Last Theorem that before C11 there was at least some room for interpretation:

一个有趣的次要问题是,无限循环是否被认为是可观察的行为(w.r.t)。对于as-if规则)或not,它会影响一个无限循环是否也可以被优化。我们可以从C编译器中看到证明费马的最后一个定理在C11之前至少有一些解释的空间

Many knowledgeable people (including me) read this as saying that the termination behavior of a program must not be changed. Obviously some compiler writers disagree, or else don’t believe that it matters. The fact that reasonable people disagree on the interpretation would seem to indicate that the C standard is flawed.

许多有知识的人(包括我)读到这句话,认为程序的终止行为不能改变。显然,有些编译器作者不同意,或者不相信这一点。理性的人对这种解释持不同意见,这一事实似乎表明C标准是有缺陷的。

C11 adds clarification to section 6.8.5 Iteration statements and is covered in more detail in this answer.

C11对6.8.5迭代语句进行了澄清,并在此答案中详细介绍。

#2


20  

In the optimized version, the compiler has decided a few things:

在优化的版本中,编译器决定了一些事情:

  1. The array a doesn't change before that test.
  2. 数组a在测试之前不会改变。
  3. The array a doesn't contain a 5.
  4. 数组a不包含5。

Therefore, we can rewrite the code as:

因此,我们可以把代码重写为:

void fn(void) {
  int a[1] = {0};
  int j = 0;
  while(true) ++j;
  a[j] = 10;
}

Now, we can make further decisions:

现在,我们可以做进一步的决定:

  1. All the code after the while loop is dead code (unreachable).
  2. while循环之后的所有代码都是死代码(不可访问)。
  3. j is written but never read. So we can get rid of it.
  4. j是写的,但从不读。我们可以把它消掉。
  5. a is never read.
  6. 一个从来没有阅读。

At this point, your code has been reduced to:

此时,您的代码已简化为:

void fn(void) {
  int a[1] = {0};
  while(true);
}

And we can make the note that a is now never read, so let's get rid of it as well:

我们可以注意到a现在从来没有读过,所以我们也把它去掉:

void fn(void) {
  while(true);
}

Now, the unoptimized code:

In unoptimized generated code, the array will remain in memory. And you'll literally walk it at runtime. And it's possible that there will be a 5 thats readable after it once you walk past the end of the array.

在未优化的生成代码中,数组将保留在内存中。你会在运行时完成。当你走过数组的末尾时,有可能会有一个5是可读的。

Which is why the unoptimized version sometimes doesn't crash and burn.

这就是为什么没有优化的版本有时不会崩溃和烧毁。

#3


7  

If the loop does get optimized out into an infinite loop, it could be due to static code analyzis seeing that your array is

如果循环被优化成一个无限循环,这可能是由于静态代码分析发现你的数组是

  1. not volatile

    不挥发性

  2. contains only 0

    只包含0

  3. never gets written to

    不会被写入

and thus it is not possible for it to contain the number 5. Which means an infinite loop.

因此它不可能包含数字5。也就是无限循环。

Even if it didn't do this, your approach could fail easily. For example, it's possible that some compiler would optimize your code without making your loop infinite, but would stuff the contents of i into a register, making it unavailable from the stack.

即使它没有这样做,您的方法也很容易失败。例如,一些编译器可能会优化您的代码,而不会使您的循环成为无限的,但是会将i的内容填充到寄存器中,使它从堆栈中不可用。

As a side note, I bet what your friend actually expected was this:

顺便说一句,我敢打赌你朋友真正想要的是:

void fn(void)
{
  /* write something after this comment so that the program output is 10 */
  printf("10\n"); /* Output 10 */
  while(1); /* Endless loop, function won't return, i won't be output */
  /* write something before this comment */
}

or this (if stdlib.h is included):

或(如果stdlib。h是包括):

void fn(void)
{
  /* write something after this comment so that the program output is 10 */
  printf("10\n"); /* Output 10 */
  exit(0); /* Exit gracefully */
  /* write something before this comment */
}