ftrace: system crash when changing current_tracer from function_graph via echo

I have been playing with ftrace recently to monitor some behavior characteristics of my system, switching the trace on and off with a small script. After running the script, my system would crash and reboot itself. Initially I believed the script itself was at fault, but I have since determined that the crash and reboot result from echoing some tracer to /sys/kernel/debug/tracing/current_tracer while current_tracer is set to function_graph.

That is, the following sequence of commands will produce the crash/reboot:

echo "function_graph" > /sys/kernel/debug/tracing/current_tracer
echo "function" > /sys/kernel/debug/tracing/current_tracer

During the reboot after the crash caused by the above echo commands, I see a lot of output that reads:

clearing orphaned inode <inode>

I tried to reproduce this problem by changing the current_tracer value from function_graph to something else in a C program:

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int openCurrentTracer()
{
        int fd = open("/sys/kernel/debug/tracing/current_tracer", O_WRONLY);
        if(fd < 0)
                exit(1);

        return fd;
}

int writeTracer(int fd, char* tracer)
{
        if(write(fd, tracer, strlen(tracer)) != strlen(tracer)) {
                printf("Failure writing %s\n", tracer);
                return 0;
        }

        return 1;
}

int main(int argc, char* argv[])
{
        int fd = openCurrentTracer();

        char* blockTracer = "blk";
        if(!writeTracer(fd, blockTracer))
                return 1;
        close(fd);

        fd = openCurrentTracer();
        char* graphTracer = "function_graph";
        if(!writeTracer(fd, graphTracer))
                return 1;
        close(fd);

        printf("Preparing to fail!\n");

        fd = openCurrentTracer();
        if(!writeTracer(fd, blockTracer))
                return 1;
        close(fd);

        return 0;
}

Oddly enough, the C program does not crash my system.

I originally encountered this problem while using Ubuntu (Unity environment) 16.04 LTS and confirmed it to be an issue on the 4.4.0 and 4.5.5 kernels. I have also tested this issue on a machine running Ubuntu (Mate environment) 15.10, on the 4.2.0 and 4.5.5 kernels, but was unable to reproduce the issue. This has only confused me further.

Can anyone give me insight on what is happening? Specifically, why would I be able to write() but not echo to /sys/kernel/debug/tracing/current_tracer?
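
One way to narrow this down is to make the C reproduction behave more like the shell does. A redirection such as `echo "function" > file` opens the target with O_WRONLY | O_CREAT | O_TRUNC, and echo typically writes the text plus a trailing newline in a single write(); the program above uses plain O_WRONLY and no newline. It also switches to blk rather than function, so it does not exercise the same transition as the shell commands. The helper below is a minimal sketch of the shell's behavior, not an explanation of the crash; the name echo_to and the 256-byte buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write `text` plus a trailing newline to `path`,
 * opening the file with the flags a shell `>` redirection uses.
 * Returns 0 on success, -1 on failure. */
int echo_to(const char *path, const char *text)
{
        assert(path != NULL && text != NULL);

        /* Build "text\n" first so it goes out in one write() call. */
        char line[256];
        int len = snprintf(line, sizeof line, "%s\n", text);
        if (len < 0 || (size_t)len >= sizeof line)
                return -1;      /* text too long for this sketch */

        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0)
                return -1;

        ssize_t written = write(fd, line, (size_t)len);
        int rc = (written == (ssize_t)len) ? 0 : -1;

        if (close(fd) != 0)
                rc = -1;
        return rc;
}
```

Calling echo_to("/sys/kernel/debug/tracing/current_tracer", "function_graph") followed by echo_to("/sys/kernel/debug/tracing/current_tracer", "function") would then mirror the two echo commands more closely than the original program does.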

Update

As vielmetti pointed out, others have had a similar issue (as seen here).

The ftrace_disable_ftrace_graph_caller() modifies jmp instruction at ftrace_graph_call assuming it's a 5 bytes near jmp (e9 <offset>). However it's a short jmp consisting of 2 bytes only (eb <offset>). And ftrace_stub() is located just below the ftrace_graph_caller so modification above breaks the instruction resulting in kernel oops on the ftrace_stub() with the invalid opcode like below:

The patch (shown below) solved the echo issue, but I still do not understand why echo was breaking previously when write() was not.

diff --git a/arch/x86/kernel/mcount_64.S b/arch/x86/kernel/mcount_64.S
index ed48a9f465f8..e13a695c3084 100644
--- a/arch/x86/kernel/mcount_64.S
+++ b/arch/x86/kernel/mcount_64.S
@@ -182,7 +182,8 @@ GLOBAL(ftrace_graph_call)
 	jmp ftrace_stub
 #endif
 
-GLOBAL(ftrace_stub)
+/* This is weak to keep gas from relaxing the jumps */
+WEAK(ftrace_stub)
 	retq
 END(ftrace_caller)

via https://lkml.org/lkml/2016/5/16/493
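
The failure mode in the patch description quoted above can be illustrated in user space. This is a simplified sketch, not kernel code: the byte layout (a 2-byte short jmp immediately followed by a 1-byte retq and nop padding) and the function names are assumptions chosen to mirror the description. It shows that writing a 5-byte near jmp over a 2-byte short jmp also overwrites the bytes of whatever instruction follows.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SHORT_JMP_LEN = 2, NEAR_JMP_LEN = 5 };

/* Overwrite the instruction at `site` with a near jmp (e9 + rel32),
 * as a patcher would if it assumed a 5-byte jmp was already there. */
void patch_near_jmp(unsigned char *site, int32_t rel32)
{
        assert(site != NULL);
        site[0] = 0xe9;
        memcpy(site + 1, &rel32, NEAR_JMP_LEN - 1);
}

/* Returns 1 if the retq byte after the short jmp survives the patch,
 * 0 if the patch clobbered it. */
int retq_survives_patch(void)
{
        /* Assumed layout: eb 00 (short jmp to the next instruction),
         * c3 (retq, standing in for ftrace_stub), then nop padding. */
        unsigned char text[8] = { 0xeb, 0x00, 0xc3, 0x90,
                                  0x90, 0x90, 0x90, 0x90 };

        patch_near_jmp(text, 0x11223344);

        /* The retq sits right after the 2-byte jmp, inside the
         * 5-byte window the patch just rewrote. */
        return text[SHORT_JMP_LEN] == 0xc3;
}
```

Here retq_survives_patch() returns 0: the byte that held retq now holds part of the jump offset, which matches the description of an invalid opcode at ftrace_stub().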

1 Answer

#1

Looks like you are not the only person to notice this behavior. I see

https://lkml.org/lkml/2016/5/13/327

as a report of the problem, and

https://lkml.org/lkml/2016/5/16/493

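as a patch to the kernel that addresses it. Reading through the whole thread, the issue appears to come from a toolchain optimization: per the comment in the patch, the assembler (gas) relaxes the jump at ftrace_graph_call into a short one.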
