A self-restarting program on segfault under Linux

Date: 2023-01-26 15:16:00

Under Linux, what would be the best way for a program to restart itself after a crash by catching the fault in a crash handler (for example, on a segfault)?

7 answers

#1


6  

You can have a loop in which you essentially fork(), do the real work in the child, and just wait on the child and check its exit status in the parent. You can also use a system which monitors and restarts programs in a similar fashion, such as daemontools, runit, etc.

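The fork-and-wait loop described above could be sketched roughly as follows. This is a minimal illustration, not a full supervisor: `supervise`, `healthy_work`, `crashing_work` and the `max_restarts` limit are names made up for this example, and a real supervisor would probably also want a delay between restarts.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs work() in a child process and starts a new
 * child whenever the previous one is killed by a signal or exits non-zero.
 * Returns the number of restarts performed, or -1 if fork()/waitpid() fails.
 * Stops after a clean exit (status 0) or after max_restarts restarts. */
static int supervise(int (*work)(void), int max_restarts)
{
    int restarts = 0;

    for (;;) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(work());              /* child: do the real work */

        int status;
        while (waitpid(pid, &status, 0) < 0) {
            if (errno != EINTR)         /* retry only if interrupted */
                return -1;
        }

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;            /* clean exit: stop supervising */

        if (WIFSIGNALED(status))
            fprintf(stderr, "child killed by signal %d\n", WTERMSIG(status));

        if (restarts >= max_restarts)
            return restarts;            /* give up instead of looping forever */
        restarts++;
    }
}

/* Demo workers, for illustration only. */
static int healthy_work(void)  { return 0; }
static int crashing_work(void) { raise(SIGSEGV); return 1; }
```

Calling `supervise(healthy_work, 3)` returns after the first child exits cleanly, while `supervise(crashing_work, 2)` restarts the child twice and then gives up. Because the crash happens in the child, the parent's memory is never touched by the fault.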

#2


9  

The simplest is:

while [ 1 ]; do ./program && break; done

Basically, you run the program until it returns 0, then you break out of the loop.

#3


7  

SIGSEGV can be caught (see man 3 signal or man 2 sigaction), and the program can call one of the exec family of functions on itself in order to restart. The same applies to most runtime crashes (SIGFPE, SIGILL, SIGBUS, SIGSYS, ...).

I'd think a bit before doing this, though. It is a rather unusual strategy for a unix program, and you may surprise your users (not necessarily in a pleasant way, either).


In any case, be sure to not auto-restart on SIGTERM if there are any resources you want to clean up before dying, otherwise angry users will use SIGKILL and you'll leave a mess.


#4


3  

As a complement to what was proposed here:

Another option is to do it the way it is done for the getty daemon. See /etc/inittab and the inittab(5) man page. It seems to be the most system-wide means ;-).

It could look like the file fragment below. The obvious advantage is that this approach is pretty standard, and it lets you control your daemon through run levels.

# Run gettys in standard runlevels
1:2345:respawn:/sbin/mingetty tty1
2:2345:respawn:/sbin/mingetty tty2
3:2345:respawn:/sbin/mingetty tty3
4:2345:respawn:/sbin/mingetty tty4
5:2345:respawn:/sbin/mingetty tty5
6:2345:respawn:/sbin/mingetty tty6
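
Adapting the fragment to your own program might look like the line below. This is only a sketch: the `md` id and the `/usr/local/bin/mydaemon` path are made up for illustration, and it assumes a SysV-style init that still reads /etc/inittab.

```
# id:runlevels:action:process
md:2345:respawn:/usr/local/bin/mydaemon
```

With `respawn`, init starts the process again whenever it terminates, so the program should stay in the foreground rather than daemonize itself. After editing the file, `telinit q` asks init to re-read it.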

#5


0  

Processes can't restart themselves, but you could use a utility like crontab(1) to schedule a script to check if the process is still alive at regular intervals.

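For the "is it still alive" check that such a scheduled script performs, one common approach is to send signal 0 to a known PID. The helper below is a small sketch of that idea. `process_alive` is a name chosen for this example, and how the PID is obtained (for instance from a pid file) is left open.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a process with this PID exists,
 * 0 if it does not, and -1 on an unexpected error. */
static int process_alive(pid_t pid)
{
    if (kill(pid, 0) == 0)
        return 1;               /* signal 0 only checks for existence */
    if (errno == EPERM)
        return 1;               /* exists, but belongs to another user */
    if (errno == ESRCH)
        return 0;               /* no such process */
    return -1;
}
```

Note that PIDs are reused over time, so a stale PID may belong to an unrelated process. An unreaped zombie also still counts as existing.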

#6


0  

The program itself obviously shouldn't check whether it is running or not running :)


Most enterprise solutions are actually just fancy ways of grepping the output of ps(1) for a given string and performing an action when certain criteria are satisfied, i.e. if your process is not found, call the start script.

#7


0  

Try the following code if it's specific to segfault. It can be modified as required.

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <setjmp.h>
#include <poll.h>

static sigjmp_buf buf;

/* On SIGSEGV, jump back to the sigsetjmp() point in main(). */
static void handler(int sig) {
    (void)sig;
    siglongjmp(buf, 1);
}

int main(void) {
    /* Register the handler; repeat for other signals as needed. */
    struct sigaction new_action, old_action;
    new_action.sa_handler = handler;
    sigemptyset(&new_action.sa_mask);
    new_action.sa_flags = 0;

    /* Leave SIGSEGV alone if it was explicitly ignored. */
    sigaction(SIGSEGV, NULL, &old_action);
    if (old_action.sa_handler != SIG_IGN)
        sigaction(SIGSEGV, &new_action, NULL);

    /* Passing 1 saves the signal mask, so SIGSEGV is unblocked
       again after siglongjmp(). */
    if (!sigsetjmp(buf, 1)) {
        printf("starting\n");
        /* first-run code or function call here */
    } else {
        printf("restarting\n");
        /* recovery code or function call here */
    }
    while (1) {
        poll(NULL, 0, 100); /* 100 ms delay; ideally use nanosleep() instead */
        printf("processing...\n");
    }
    return EXIT_SUCCESS; /* not reached */
}
