A thread running between fork and exec blocks other threads from reading

Time: 2022-12-06 09:07:11

While studying the possibility of improving Recoll performance by using vfork() instead of fork(), I've encountered a fork() issue which I can't explain.


Recoll repeatedly execs external commands to translate files, so that's what the sample program does: it starts threads which repeatedly execute "ls" and read back the output.


The following problem is not a "real" one, in the sense that an actual program would not do what triggers the issue. I just stumbled on it while having a look at which threads were stopped or not between fork()/vfork() and exec().


When one of the threads busy-loops between fork() and exec(), the other thread never finishes reading the data: the last read(), which should return 0 to indicate EOF, blocks forever, or until the other thread's loop ends (at which point everything resumes normally, as you can see by replacing the infinite loop with one that terminates). While read() is blocked, the "ls" command has already exited (ps shows it as <defunct>, a zombie).


There is a random aspect to the issue, but the sample program "succeeds" most of the time. I tested with Linux kernels 3.2.0 (Debian), 3.13.0 (Ubuntu) and 3.19 (Ubuntu). It also reproduces on a VM, but you need at least 2 processors; I could not reproduce it with a single one.


Here is the sample program; I can't see what I'm doing wrong.


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <memory.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <pthread.h>
#include <iostream>

using namespace std;

struct thread_arg {
    int tnum;
    int loopcount;
    const char *cmd;
};

void* task(void *rarg)
{
    struct thread_arg *arg = (struct thread_arg *)rarg;
    const char *cmd = arg->cmd;

    for (int i = 0; i < arg->loopcount; i++) {
        pid_t pid;
        int pipefd[2];

        if (pipe(pipefd)) {
            perror("pipe");
            exit(1);
        }
        pid = fork();
        if (pid) {
            cerr << "Thread " << arg->tnum << " parent " << endl;
            if (pid < 0) {
                perror("fork");
                exit(1);
            }
        } else {
            // Child code. Either exec ls or loop (thread 1)
            if (arg->tnum == 1) {
                cerr << "Thread " << arg->tnum << " looping" <<endl;
                for (;;);
                //for (int cc = 0; cc < 1000 * 1000 * 1000; cc++);
            } else {
                cerr << "Thread " << arg->tnum << " child" <<endl;
            }

            close(pipefd[0]);
            if (pipefd[1] != 1) {
                dup2(pipefd[1], 1);
                close(pipefd[1]);
            }
            cerr << "Thread " << arg->tnum << " child calling exec" <<
                endl;
            execlp(cmd, cmd, (char *)NULL); // exec arg list must end with a null pointer
            perror("execlp");
            _exit(255);
        }

        // Parent closes write side of pipe
        close(pipefd[1]);
        int ntot = 0, nread;
        char buf[1000];
        while ((nread = read(pipefd[0], buf, 1000)) > 0) {
            ntot += nread;
            cerr << "Thread " << arg->tnum << " nread " << nread << endl;
        }
        cerr << "Total " <<  ntot << endl;

        close(pipefd[0]);
        int status;
        cerr << "Thread " << arg->tnum << " waiting for process " << pid
             << endl;
        if (waitpid(pid, &status, 0) != -1) {
            if (status) {
                cerr << "Child exited with status " << status << endl;
            }
        } else {
            perror("waitpid");
        }
    }

    return 0;
}

int main(int, char **)
{
    int loopcount = 5;
    const char *cmd =  "ls";

    cerr << "cmd [" << cmd << "]" << " loopcount " << loopcount << endl;

    const int nthreads = 2;
    pthread_t threads[nthreads];

    for (int i = 0; i < nthreads; i++) {
        struct thread_arg *arg = new struct thread_arg;
        arg->tnum = i;
        arg->loopcount = loopcount;
        arg->cmd = cmd;
        int err;
        if ((err = pthread_create(&threads[i], 0, task, arg))) {
            cerr << "pthread_create failed, err " << err << endl;
            exit(1);
        }
    }

    void *status;
    for (int i = 0; i < nthreads; i++) {
        pthread_join(threads[i], &status);
        if (status) {
            cerr << "pthread_join: " << status << endl;
            exit(1);
        }
    }
}

1 solution

#1



What's happening is that your pipes are getting inherited by both child processes instead of just one.


What you want to do is:


  1. Create a pipe with 2 ends
  2. fork(); the child inherits both ends of the pipe
  3. The child closes the read end, the parent closes the write end

...so that the child ends up with just one end of one pipe, which is dup2()'ed to stdout.

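Concretely, the sequence described above looks like this around each fork(). This is only a sketch, reusing the pipefd[] and cmd names from the sample program; the sample already does exactly this per fork, the problem is how the two threads interleave:

// Child side, right after fork() returns 0:
close(pipefd[0]);                 // the child never reads from its own pipe
dup2(pipefd[1], STDOUT_FILENO);   // the child's stdout becomes the write end
close(pipefd[1]);                 // only the dup'ed descriptor remains
execlp(cmd, cmd, (char *)NULL);

// Parent side, right after fork() returns the child's pid:
close(pipefd[1]);                 // the parent keeps only the read end
// ... read(pipefd[0], ...) until it returns 0 (EOF), then waitpid(pid, ...)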

But your threads race with each other, so what can happen is this:


  1. Thread 1 creates a pipe with 2 ends
  2. Thread 0 creates a pipe with 2 ends
  3. Thread 1 fork()s. The child process has inherited 4 file descriptors, not 2!
  4. Thread 1's child closes the read end of the pipe that thread 1 opened, but it also keeps a reference to the read end and write end of thread 0's pipe.

Later, thread 0 waits forever: it never gets an EOF on the pipe it is reading, because the write end of that pipe is still held open by thread 1's child.


You will need to define a critical section that starts before pipe(), encloses the fork(), and ends after close() in the parent, and enter that critical section from only one thread at a time using a mutex.

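A minimal sketch of that fix, assuming a process-wide pthread mutex. The names run_and_read and fork_mutex are introduced here for illustration only; they are not part of the question's program or of Recoll:

#include <pthread.h>
#include <unistd.h>
#include <sys/wait.h>
#include <string>

// Serializes pipe()/fork()/close() across threads so that no other thread's
// child can inherit this pipe's write end.
static pthread_mutex_t fork_mutex = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical helper: run 'cmd' and return whatever it wrote to stdout.
std::string run_and_read(const char *cmd)
{
    int pipefd[2];

    pthread_mutex_lock(&fork_mutex);      // critical section: pipe() .. close()
    if (pipe(pipefd)) {
        pthread_mutex_unlock(&fork_mutex);
        return std::string();
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        pthread_mutex_unlock(&fork_mutex);
        return std::string();
    }
    if (pid == 0) {
        // Child: keep only the write end, dup2()'ed to stdout, then exec.
        close(pipefd[0]);
        if (pipefd[1] != STDOUT_FILENO) {
            dup2(pipefd[1], STDOUT_FILENO);
            close(pipefd[1]);
        }
        execlp(cmd, cmd, (char *)NULL);
        _exit(255);
    }
    // Parent: close the write end before releasing the lock, so a child
    // forked by another thread cannot still be holding it open.
    close(pipefd[1]);
    pthread_mutex_unlock(&fork_mutex);

    std::string out;
    char buf[1000];
    ssize_t n;
    while ((n = read(pipefd[0], buf, sizeof(buf))) > 0)
        out.append(buf, n);               // EOF now arrives as soon as 'cmd' exits
    close(pipefd[0]);

    int status;
    waitpid(pid, &status, 0);
    return out;
}

Reading from pipefd[0] and waitpid() stay outside the lock, so threads still overlap on the slow parts; only the short pipe()/fork()/close() window is serialized.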
