我在管理自己的服务器方面还很陌生,并且存在一些Apache2进程占用100%CPU的问题。这个过程持续约半小时,几个小时后,它将重新开始。重新启动Apache或重新启动服务器后,它甚至会立即发生。
该服务器用于为两个流量极低的网站提供服务。它使用了一个名为Pimcore的CMS,它本身是基于Symfony开发的。
DigitalOcean Droplet 1GB RAM
Ubuntu 18.04
PHP 7.2 fpm
MySql 14.14
Pimcore CMS(Symfony)
我之前在Ubuntu 17.10上运行了完全相同的设置,并且一切正常。由于我将安装程序移至Ubuntu 18.04(所有内容的全新安装)中的新服务器,因此我开始看到这些问题。
我怀疑它与某些PHP脚本的执行有关,但是我可以找出它的确切来源。
有人知道会发生什么吗?
Apache2 Config:
<IfModule mpm_prefork_module>
StartServers 2
MinSpareServers 3
MaxSpareServers 5
MaxRequestWorkers 20
MaxConnectionsPerChild 3000
MaxClients 15
</IfModule>
Apache模块:
Loaded Modules:
core_module (static)
so_module (static)
watchdog_module (static)
http_module (static)
log_config_module (static)
logio_module (static)
version_module (static)
unixd_module (static)
access_compat_module (shared)
alias_module (shared)
auth_basic_module (shared)
authn_core_module (shared)
authn_file_module (shared)
authz_core_module (shared)
authz_host_module (shared)
authz_user_module (shared)
autoindex_module (shared)
deflate_module (shared)
dir_module (shared)
env_module (shared)
filter_module (shared)
mime_module (shared)
mpm_prefork_module (shared)
negotiation_module (shared)
php7_module (shared)
proxy_module (shared)
proxy_fcgi_module (shared)
reqtimeout_module (shared)
rewrite_module (shared)
setenvif_module (shared)
socache_shmcb_module (shared)
ssl_module (shared)
status_module (shared)
wsgi_module (shared)
error.log
Fri May 1 11:01:48 2020 (1309): Error Cannot kill process 1069: Success!
[Fri May 01 11:01:49.207718 2020] [core:notice] [pid 923] AH00051: child pid 1309 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.899518 2020] [core:notice] [pid 923] AH00051: child pid 1333 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.899963 2020] [core:notice] [pid 923] AH00051: child pid 1383 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.899975 2020] [core:notice] [pid 923] AH00051: child pid 1406 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.900004 2020] [core:notice] [pid 923] AH00051: child pid 1305 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:13:15.900059 2020] [mpm_prefork:notice] [pid 923] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:13:16.073253 2020] [mpm_prefork:notice] [pid 1605] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:13:16.073329 2020] [core:notice] [pid 1605] AH00094: Command line: '/usr/sbin/apache2'
[Fri May 01 11:14:17.466068 2020] [core:notice] [pid 1605] AH00051: child pid 1613 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:14:17.466137 2020] [core:notice] [pid 1605] AH00051: child pid 1636 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:14:17.466181 2020] [mpm_prefork:notice] [pid 1605] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:14:17.608696 2020] [mpm_prefork:notice] [pid 1685] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:14:17.608770 2020] [core:notice] [pid 1685] AH00094: Command line: '/usr/sbin/apache2'
[Fri May 01 11:16:49.360625 2020] [core:notice] [pid 1685] AH00052: child pid 1696 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360697 2020] [core:notice] [pid 1685] AH00052: child pid 1717 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360708 2020] [core:notice] [pid 1685] AH00052: child pid 1719 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360724 2020] [core:notice] [pid 1685] AH00052: child pid 1722 exit signal Segmentation fault (11)
[Fri May 01 11:16:49.360780 2020] [mpm_prefork:notice] [pid 1685] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:17:05.637473 2020] [mpm_prefork:notice] [pid 924] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:17:05.651236 2020] [core:notice] [pid 924] AH00094: Command line: '/usr/sbin/apache2'
[Fri May 01 11:25:37.817879 2020] [core:notice] [pid 924] AH00051: child pid 946 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817946 2020] [core:notice] [pid 924] AH00051: child pid 948 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817960 2020] [core:notice] [pid 924] AH00051: child pid 1022 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817972 2020] [core:notice] [pid 924] AH00051: child pid 1055 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.817984 2020] [core:notice] [pid 924] AH00051: child pid 1056 exit signal Segmentation fault (11), possible coredump in /etc/apache2
[Fri May 01 11:25:37.818020 2020] [mpm_prefork:notice] [pid 924] AH00169: caught SIGTERM, shutting down
[Fri May 01 11:25:37.957502 2020] [mpm_prefork:notice] [pid 1394] AH00163: Apache/2.4.29 (Ubuntu) OpenSSL/1.1.1 mod_wsgi/4.5.17 Python/3.6 configured -- resuming normal operations
[Fri May 01 11:25:37.957577 2020] [core:notice] [pid 1394] AH00094: Command line: '/usr/sbin/apache2'
顶部输出:
top - 11:34:04 up 17 min, 1 user, load average: 4.00, 3.92, 2.79
Tasks: 91 total, 5 running, 49 sleeping, 0 stopped, 0 zombie
%Cpu(s): 99.7 us, 0.3 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1008848 total, 112120 free, 483364 used, 413364 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 334772 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1405 www-data 20 0 561180 94488 43856 R 24.9 9.4 2:20.77 /usr/sbin/apache2 -k start
1427 www-data 20 0 550904 75304 36568 R 24.9 7.5 1:56.12 /usr/sbin/apache2 -k start
1429 www-data 20 0 552952 76684 36432 R 24.9 7.6 2:13.89 /usr/sbin/apache2 -k start
1437 www-data 20 0 550904 74748 36568 R 24.9 7.4 1:48.41 /usr/sbin/apache2 -k start
916 mysql 20 0 1410004 206444 17124 S 0.3 20.5 0:01.96 /usr/sbin/mysqld --daemonize --pid-file=/run/mysqld/mysqld.pid
1 root 20 0 159764 8756 6472 S 0.0 0.9 0:01.37 /sbin/init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kthreadd]
3 root 20 0 0 0 0 I 0.0 0.0 0:00.00 [kworker/0:0]
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [kworker/0:0H]
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [mm_percpu_wq]
7 root 20 0 0 0 0 S 0.0 0.0 0:00.07 [ksoftirqd/0]
8 root 20 0 0 0 0 I 0.0 0.0 0:00.08 [rcu_sched]
9 root 20 0 0 0 0 I 0.0 0.0 0:00.00 [rcu_bh]
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 [migration/0]
11 root rt 0 0 0 0 S 0.0 0.0 0:00.00 [watchdog/0]
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [cpuhp/0]
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kdevtmpfs]
14 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [netns]
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [rcu_tasks_kthre]
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kauditd]
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [khungtaskd]
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [oom_reaper]
19 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [writeback]
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kcompactd0]
当使用strace跟踪其中一个CPU使用率很高的PID时,似乎无休止的流:
stat("/var/www/XYZ/XYZ/web/static/css/print.css", {st_mode=S_IFREG|0664, st_size=250, ...}) = 0
sendto(24, "\264\0\0\0\3SELECT id, \n (CASE"..., 184, MSG_DONTWAIT, NULL, 0) = 184
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2/\0\0\2\3def\vXYZ\5cache\5"..., 3174, MSG_DONTWAIT, NULL, NULL) = 265
sendto(24, "<\0\0\0\3SELECT sourceId FROM docume"..., 64, MSG_DONTWAIT, NULL, 0) = 64
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\1]\0\0\2\3def\vXYZ\26docume"..., 2909, MSG_DONTWAIT, NULL, NULL) = 120
sendto(24, "\236\0\0\0\3SELECT id,language FROM doc"..., 162, MSG_DONTWAIT, NULL, 0) = 162
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2\32\0\0\2\3def\0\0\0\2id\2id\f?\0\v\0\0\0\3!\0"..., 2789, MSG_DONTWAIT, NULL, NULL) = 95
sendto(24, "\264\0\0\0\3SELECT id, \n (CASE"..., 184, MSG_DONTWAIT, NULL, 0) = 184
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2/\0\0\2\3def\vXYZ\5cache\5"..., 2694, MSG_DONTWAIT, NULL, NULL) = 265
sendto(24, "\271\0\0\0\3SELECT id, \n (CASE"..., 189, MSG_DONTWAIT, NULL, 0) = 189
poll([{fd=24, events=POLLIN|POLLERR|POLLHUP}], 1, 86400000) = 1 ([{fd=24, revents=POLLIN}])
recvfrom(24, "\1\0\0\1\2/\0\0\2\3def\vXYZ\5cache\5"..., 2429, MSG_DONTWAIT, NULL, NULL) = 104
access("/var/www/XYZ/XYZ/app/views/content/default.html.php", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/app/Resources/views/content/default.html.php", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/src/AppBundle/Resources/public/areas/blockquote/icon.png", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/src/AppBundle/Resources/public/areas/horizontal-line/icon.png", F_OK) = -1 ENOENT (No such file or directory)
access("/var/www/XYZ/XYZ/src/AppBundle/Resources/public/areas/gallery-single-images/icon.png", F_OK) = -1 ENOENT (No such file or directory)
...
感谢您发布数据。每秒速率= RPS
为您的Digital Ocean my.cnf [mysqld]部分考虑的建议
query_cache_type=OFF # from ON to conserve your 1G of RAM
query_cache_size=0 # to ensure QC is not in use
innodb_io_capacity=1900 # from 200 to enable more use of SSD capacity
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function
tmp_table_size=10M # from ~500M for 1% of RAM
max_heap_table_size=10M # from ~500M for 1% of RAM
innodb_log_file_size=64M # from 8M - size should always be GT innodb_log_buffer_size of 16M for you
innodb_thread_concurrency=0 # from 8 to allow auto calc of concurrency limit
通过这些更改,CPU使用率应稳定下来。请让我们知道您几天后的进度。