How do I prevent a flood of Apache processes from spawning when I start Apache and killing my machine?

I have a highly trafficked application on one debian machine and apache has started acting strange.


Every time I start Apache, tons of Apache processes are spawned, the app doesn't load at all, and very quickly the whole machine freezes and must be power-cycled to reboot.


Here is what top shows immediately after starting Apache:


top -   20:14:44    up         1:16,      2 users,    load average: 0.48, 0.10, 0.03
Tasks:  330 total,  5 running, 325 sleeping,   0 stopped,   0 zombie
Cpu(s): 12.0%us,    21.4%sy,   0.0%ni,        65.7%id,   0.2%wa,  0.1%hi,  0.7%si,  0.0%st
Mem:    8179920k    total,     404984k used,  7774936k free,    60716k buffers
Swap:   2097136k    total,     0k used,       2097136k free,    43424k cached


10251 www-data  15   0  467m 8100 4016 S    6  0.1   0:00.04 apache2
10262 www-data  15   0  467m 8092 4012 S    6  0.1   0:00.05 apache2
10360 www-data  15   0  468m 8296 4016 S    6  0.1   0:00.05 apache2
10428 www-data  15   0  468m 8272 3992 S    6  0.1   0:00.05 apache2
10241 www-data  15   0  467m 8256 4012 S    4  0.1   0:00.03 apache2
10259 www-data  15   0  467m 8092 4012 S    4  0.1   0:00.04 apache2
10274 www-data  15   0  467m 8056 4012 S    4  0.1   0:00.03 apache2
10291 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.03 apache2
10293 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.03 apache2
10308 www-data  15   0  468m 8296 4016 S    4  0.1   0:00.02 apache2
10317 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.02 apache2
10320 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.04 apache2
10325 www-data  15   0  468m 8292 4012 S    4  0.1   0:00.04 apache2

And so forth, with more apache2 processes.


Less than a minute later, you can see below that the load has gone from 0.48 to 2.17. If I do not stop apache at this point, the load continues to rise over a few minutes or less until the machine dies.


top -    20:15:34 up 1:17,       2 users,  load average: 2.17, 0.62, 0.21
Tasks:   1850 total,  5 running, 1845 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,      2.1%sy,    0.0%ni, 96.4%id,  0.0%wa,  0.1%hi,  1.0%si,  0.0%st
Mem:     8179920k     total,     1938524k used,  6241396k free,    60860k buffers
Swap:    2097136k     total,     0k used,  2097136k free,    44196k cached

We have a firewall where we whitelist the addresses we know are allowed to hit our site.


Any ideas about what the problem might be are very welcome.


Thanks!


6 solutions

#1

You have probably made the error of configuring Apache to use far more than all of your ram. This is an easy mistake to make.


I am assuming you are using a Prefork Apache, and an in-process application server (such as PHP or mod_perl). In this model, you will end up with a maximum of (MaxClients * max memory usage of your application per process) memory used. If you don't have nearly that much, it's time to decrease one, the other or both.
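The worst-case figure above is simple arithmetic. As a rough sketch, assuming the stock MaxClients of 150 and a hypothetical peak of 60 MB per Apache child (an illustrative number, not one measured on the asker's machine):

```shell
# Worst-case memory for Prefork: MaxClients * per-process memory.
# Both inputs are illustrative assumptions, not measurements from the question.
worst_case_mb() {
    max_clients=$1      # value of the MaxClients directive
    per_process_mb=$2   # peak resident memory of one Apache child, in MB
    echo $(( max_clients * per_process_mb ))
}

worst_case_mb 150 60   # prints 9000
```

With those assumed inputs the result, 9000 MB, is already more than the roughly 8 GB of RAM shown in the question's top output.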


In the general case, this means decreasing MaxClients to the point where your server has enough ram to cope.


The default values typically used for MaxClients (150 is typical) are not suitable for running an in-process heavyweight application server on a modest machine if you are using the Prefork model (Most application servers either don't support, or discourage, the use of threaded models).


However, decreasing MaxClients will eventually cause the application to become unavailable, particularly if you have keepalives on and the keepalive timeout too long. Processes which are just keeping a connection alive (state K in server-status) still use a lot of RAM, and that may be a problem - try to minimise keepalive timeout, or turn it off altogether.


You need to keep an eye on server-status (as provided by mod_status).
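One way to watch this from the command line is to count scoreboard slots in the machine-readable status output. The helper below is a sketch; the function name is made up here, and it assumes mod_status is enabled and that its "?auto" output contains a "Scoreboard:" line:

```shell
# Count scoreboard slots in a given state (e.g. K = keepalive, W = sending reply)
# from mod_status "?auto" output read on stdin.
count_state() {
    state=$1
    awk -v s="$state" '/^Scoreboard:/ {
        n = 0
        for (i = 1; i <= length($2); i++)
            if (substr($2, i, 1) == s) n++
        print n
    }'
}

# Usage (assumes server-status is reachable from localhost):
#   curl -s 'http://localhost/server-status?auto' | count_state K
```

A large and steady count for K relative to the total number of slots would suggest that the keepalive timeout is tying up processes.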


Of course you should only make ANY of these changes if you understand the consequences. Think twice, change the config once. If you have ANY ability to test the changes with simulated load on a similar spec non-production machine, do so.


#2

Use ps aux | grep apache to list the processes that Apache is running. Look at the "RSS" column, which gives an estimate of the memory used by each process. Alternatively, you can use "top": press Shift+F and select the %MEM column to sort the processes by memory usage.
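To turn the RSS column into a few summary numbers, something like the following may help. It is a sketch: the function name is invented here, and it expects one RSS value in KB per line on stdin, such as the output of ps with a custom format:

```shell
# Summarise per-process RSS values (in KB, one per line) read from stdin.
rss_summary() {
    awk '{ sum += $1; if ($1 > max) max = $1 }
         END { if (NR > 0) printf "procs=%d max_kb=%d avg_kb=%d\n", NR, max, sum / NR }'
}

# Usage on Debian, where the process is named apache2:
#   ps -C apache2 -o rss= | rss_summary
```

The max_kb figure is the one to carry into any MaxClients calculation, since the worst case is what exhausts memory.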


The maximum number of processes is set by the "MaxClients" directive in your Apache configuration file. The way to arrive at this figure is as described by this page:


1. SSH into your server as root.
2. Run top.
3. Press Shift+M.
4. Note the highest RES memory used by httpd (apache2 on Debian).
5. Hit Q to exit top.
6. Execute: service httpd stop (on Debian: sudo service apache2 stop)
7. Once httpd is stopped, execute: free -m
8. Note the memory listed under "used".
9. Find the guaranteed memory for your VPS plan. Support can tell you how much you have guaranteed if you cannot find it.
10. Subtract the memory USED from the memory that your plan is GUARANTEED. This will give you your base FREE MEMORY POOL.
11. Multiply the value of your FREE MEMORY POOL by 0.8 to find your average AVAILABLE APACHE POOL (this will allow you a 20% memory reserve for burst periods).
12. Divide your AVAILABLE APACHE POOL by the highest RES memory used by httpd. This will give you the MaxClients value that should be set for your system. (If the result has a fractional part, round it down to the nearest integer.)
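
The arithmetic in the last three steps can be sketched as a small shell function. The function name and the sample inputs (8000 MB guaranteed, 400 MB used, 50 MB per process) are illustrative assumptions, not values from the question:

```shell
# MaxClients estimate: ((guaranteed - used) * 0.8) / per-process RES, rounded down.
max_clients() {
    guaranteed_mb=$1    # memory guaranteed to the machine or VPS plan, in MB
    used_mb=$2          # "used" from free -m with Apache stopped, in MB
    per_process_mb=$3   # highest RES of a single Apache process, in MB
    free_pool=$(( guaranteed_mb - used_mb ))
    apache_pool=$(( free_pool * 8 / 10 ))      # keep a 20% reserve
    echo $(( apache_pool / per_process_mb ))   # integer division rounds down
}

max_clients 8000 400 50   # prints 121
```

Here the free pool is 7600 MB, the Apache pool is 6080 MB, and 6080 / 50 = 121.6, which rounds down to 121.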

The right value for "MaxClients" will ensure the right memory allocation for your apache server. That's how I solved it.


In Debian, apache conf file is at /etc/apache2/apache2.conf


#3

Have you changed your configuration file recently? If yes, I trust you keep the old version for diffing?


If not, search for the "StartServers", "MaxSpareServers" and "MinSpareServers" directives. Generally you want to leave these at defaults, but it's possible that they were intentionally set high (bad idea) or accidentally set that way due to a bad config edit.
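A quick way to list the active values of these directives is a case-insensitive grep that skips commented-out lines. This is a sketch; the function name is invented here, and the path in the usage comment is the Debian location mentioned elsewhere on this page:

```shell
# Print active (uncommented) prefork sizing directives from a config file.
show_prefork_limits() {
    grep -nEi '^[[:space:]]*(StartServers|MinSpareServers|MaxSpareServers|MaxClients)[[:space:]]' "$1"
}

# Usage on Debian:
#   show_prefork_limits /etc/apache2/apache2.conf
```

If the directives appear more than once (for example inside per-MPM blocks), only those in the block for the MPM you are actually running apply.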


If this doesn't help, it's time to look outside Apache, for some process that's opening connections at a fast rate (could be that there's a testing process that's run amok).


First step is the access log. Second step is to run netstat, to see where the connections might be coming from. And if it's running on the same system, you can look in /proc/*/fd to find the two ends of the connection.
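For the netstat step, grouping connections by remote address makes a runaway client stand out. The helper below is a sketch with an invented name; it assumes the usual Linux netstat -tn column layout, where the foreign address is the fifth field:

```shell
# Count TCP connections per remote address from `netstat -tn` output on stdin,
# busiest address first.
conns_by_ip() {
    awk '$1 ~ /^tcp/ {
        addr = $5
        sub(/:[0-9]+$/, "", addr)   # strip the trailing :port
        count[addr]++
    }
    END { for (a in count) print count[a], a }' | sort -rn
}

# Usage:
#   netstat -tn | conns_by_ip | head
```

A single address holding hundreds of connections would point to a misbehaving client or test process rather than to ordinary traffic.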


#4

As has been said (assuming Prefork Apache) - MaxClients = max processes at once.


If you find you are getting hammered with real traffic (and not a mis-configured StartServers/Min/MaxSpareServers), there are some other things you can do:


Set up a separate, lightweight Apache process (or lighttpd) for your static content. That way all the small, static stuff doesn't "pollute" your heavyweight app process. This can be on the same server or a different one; it doesn't matter.

Put a reverse proxy like Squid in front of your Apache process. The reverse proxy will quickly suck down the content from Apache, store it in memory, and then parcel it back out to the client. This way AOL users on 14.4k modems don't hog one of your valuable Apache slots. As a bonus, such a setup can be configured to cache some of your content to reduce the load on your Apache processes.

#5

This question is ancient, but I feel compelled to add an answer here because all of the existing answers are overlooking a key piece of information from the OP: After the load has begun to rise for a few minutes, top reports that there are still ample CPU & memory resources available. There is usually one culprit remaining, and that's I/O.


Check if there is a full partition with df -h. If not, see if your application is thrashing the disk using vmstat 1 10 or iostat 1 10 (these are provided by the 'sysstat' package on Debian/Ubuntu). If you still don't see an issue there, perhaps you have device level I/O errors or network trouble for network-mounted storage. Check the system and daemon log files.
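The full-partition check can be scripted so it is easy to repeat. The sketch below uses an invented function name and an arbitrary default threshold of 90%; it reads POSIX-format df output, where capacity is the fifth column and the mount point the sixth:

```shell
# Print mount points at or above a usage threshold from `df -P` output on stdin.
full_partitions() {
    threshold=${1:-90}   # percent; 90 is an arbitrary default
    awk -v t="$threshold" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= t) print $6, $5
    }'
}

# Usage:
#   df -P | full_partitions 90
```

No output means nothing is near full, in which case vmstat and iostat are the next things to look at.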


#6

Your 'top' output shows that you have plenty of free memory, so I don't think that MaxClients is the issue (unless there is some problem with Apache allocating more than 2 GB of memory). Your error log should show errors if Apache is having problems creating more children.
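As a rough sketch of checking the error log for this, the helper below counts lines mentioning that the MaxClients limit was reached. The function name is invented, and the search string is an assumption about how your Apache version words that message, so adjust it to match what your log actually contains:

```shell
# Count error-log lines reporting that the MaxClients limit was hit.
# The pattern is an assumption; the exact wording varies by Apache version.
count_maxclients_hits() {
    grep -c 'reached MaxClients' "$1"
}

# Usage on Debian, where the default log is /var/log/apache2/error.log:
#   count_maxclients_hits /var/log/apache2/error.log
```

A count of zero, combined with the free memory shown by top, would support looking beyond MaxClients for the cause.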


Most likely, your Apache processes really are using a lot of resources. If you are running PHP apps, try installing eAccelerator which does a good job of optimizing and caching PHP code. Other things might include heavy MySQL queries, a slow DNS resolver, etc. Beyond that, it gets more into understanding what programs are being hit and what they are doing.

