Is there a difference between spawning multiple threads that run to completion and a single thread that waits for work?

Date: 2021-09-24 21:02:00

I'm writing a piece of software that does a single very long task. To allow interruption, we have added a checkpointing function that periodically (on the order of minutes) dumps an image of the program state to disk. This takes some time, however, so I would like to switch to a model where the checkpoints are written on a separate thread rather than blocking the primary worker. (Yes, I know I need to keep it thread-safe.)


As I see it, there are two primary methods of accomplishing this task:


  1. For each checkpoint, I pthread_create() a thread which will execute the checkpointing function once and then terminate.
  2. For each checkpoint, I pthread_cond_signal() a single waiting thread that executes the checkpointing function and then returns to waiting.

Both methods require making an atomic copy of my working state and passing it to the checkpoint thread, as well as ensuring that one checkpoint completes successfully before I start another.


My question is whether there is a compelling reason to use one method over the other.


2 answers

#1


0  

Don't go with continually creating/terminating/destroying/joining threads if you can possibly avoid it. It's expensive in terms of latency and cycles, risks having several unwanted threads doing overlapping work at once, and is difficult to debug.


Just create one thread once, at app startup, and don't terminate it. Have it loop on some synchronization object and signal it when you need to, or run a timer or sleep loop to perform your image dumps.
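A minimal sketch of that single long-lived thread, assuming the program state can be copied with one memcpy(). The names `ckpt_t`, `ckpt_start`, `ckpt_request`, `ckpt_stop` and `write_image` are hypothetical, and `write_image` is only a placeholder for the real dump routine:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handoff slot shared by the worker and the checkpoint thread. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    void  *snapshot;  /* pending copy of the state, or NULL */
    size_t size;
    bool   busy;      /* a checkpoint is currently being written */
    bool   quit;      /* set once at shutdown */
    int    written;   /* number of checkpoints completed */
} ckpt_t;

/* Placeholder (assumption): the real program would dump buf to disk here. */
static void write_image(const void *buf, size_t size) { (void)buf; (void)size; }

static void *ckpt_thread(void *arg)
{
    ckpt_t *c = arg;
    pthread_mutex_lock(&c->lock);
    for (;;) {
        /* Sleep until there is a snapshot to write or we are told to quit. */
        while (c->snapshot == NULL && !c->quit)
            pthread_cond_wait(&c->cond, &c->lock);
        if (c->snapshot == NULL)
            break;                      /* quit requested, nothing pending */
        void *buf = c->snapshot;
        size_t size = c->size;
        c->snapshot = NULL;
        c->busy = true;
        pthread_mutex_unlock(&c->lock); /* do the slow I/O without the lock */

        write_image(buf, size);
        free(buf);

        pthread_mutex_lock(&c->lock);
        c->busy = false;
        c->written++;
    }
    pthread_mutex_unlock(&c->lock);
    return NULL;
}

/* Create the thread once, at startup. Returns 0 on success. */
static int ckpt_start(ckpt_t *c, pthread_t *tid)
{
    memset(c, 0, sizeof *c);
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    return pthread_create(tid, NULL, ckpt_thread, c);
}

/* Called only by the main worker. Returns false if the previous checkpoint
 * has not finished yet (or the copy could not be allocated). */
static bool ckpt_request(ckpt_t *c, const void *state, size_t size)
{
    pthread_mutex_lock(&c->lock);
    bool idle = (c->snapshot == NULL && !c->busy);
    pthread_mutex_unlock(&c->lock);
    if (!idle)
        return false;

    void *copy = malloc(size);          /* copy made on the worker's thread */
    if (copy == NULL)
        return false;
    memcpy(copy, state, size);

    pthread_mutex_lock(&c->lock);
    c->snapshot = copy;
    c->size = size;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
    return true;
}

/* Let any pending checkpoint finish, then stop and join the thread. */
static void ckpt_stop(ckpt_t *c, pthread_t tid)
{
    pthread_mutex_lock(&c->lock);
    c->quit = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
    pthread_join(tid, NULL);
}
```

Because `ckpt_request` refuses a new snapshot while one is pending or being written, at most one checkpoint is in flight at a time, which is the "no overlapping work" property argued for above.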


#2


2  

I would argue that pthreads are a bad fit for your requirements:
Regardless of whether you spawn a new thread for each backup or use a thread pool, you need to make a deep copy of your working set, which is expensive. Also, you may need extensive synchronization if you go with the thread pool. Instead, there's a much easier way to do it:

fork().

The child process inherits the parent's entire memory space, but on modern operating systems the copy is lazy (copy-on-write). You also don't need to worry about cleaning up a thread you started, because the fork()ed child releases its resources when it terminates. If your original program is already multithreaded, make sure to use only async-signal-safe functions in the child; thankfully write() is async-signal-safe (as are open() and unlink()). To keep the child from turning into a zombie, call waitid(P_ALL, 0, infop, WEXITED | WNOHANG), where infop points to a siginfo_t, in a loop until it returns nonzero or the siginfo_t indicates that no child has exited yet. Because of WNOHANG, this does not stall the parent if the child has not finished the backup before the next backup point is reached.
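The fork() approach could be sketched roughly as follows. The helper names `checkpoint_fork` and `reap_children` are hypothetical, and the sketch assumes the state is one contiguous buffer and that treating any failed write() as a failed checkpoint is acceptable:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns the number of children reaped. */
static int reap_children(void)
{
    int reaped = 0;
    for (;;) {
        siginfo_t info;
        memset(&info, 0, sizeof info);  /* si_pid stays 0 if nothing exited */
        if (waitid(P_ALL, 0, &info, WEXITED | WNOHANG) != 0)
            break;                      /* error, typically ECHILD: no children */
        if (info.si_pid == 0)
            break;                      /* children exist, none has exited yet */
        reaped++;
    }
    return reaped;
}

/* Fork a child that writes `state` to `path`. The parent returns right away
 * with the child's pid, or -1 if fork() failed. */
static pid_t checkpoint_fork(const char *path, const void *state, size_t size)
{
    pid_t pid = fork();
    if (pid != 0)
        return pid;                     /* parent, or fork() error */

    /* Child: sees a copy-on-write snapshot of the parent's memory.
     * Only async-signal-safe calls (open, write, close, _exit) are used. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        _exit(1);
    const char *p = state;
    while (size > 0) {
        ssize_t n = write(fd, p, size);
        if (n <= 0)
            _exit(1);                   /* simplification: no EINTR retry */
        p += n;
        size -= (size_t)n;
    }
    close(fd);
    _exit(0);
}
```

The worker would call `reap_children()` just before each `checkpoint_fork()`, so finished children are collected at every backup point without the parent ever blocking on one that is still writing.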
