I have been experiencing this so many times and I can’t find a patter. The system boots normally but the few sites which I have on the local machine don’t load (404). Running ‘rcapache2 restart’ makes everything work but I can’t find why it sometimes happens and based on what factors.
Today I looked at /var/log/apache2/error_log:
[Sun Feb 19 16:28:30.000427 2017] :notice] [pid 1767] ModSecurity for Apache/2.9.0 (http://www.modsecurity.org/) configured.
[Sun Feb 19 16:28:30.469773 2017] :notice] [pid 1767] ModSecurity: APR compiled version="1.5.1"; loaded version="1.5.1"
[Sun Feb 19 16:28:30.469792 2017] :notice] [pid 1767] ModSecurity: PCRE compiled version="8.33 "; loaded version="8.39 2016-06-14"
[Sun Feb 19 16:28:30.469798 2017] :warn] [pid 1767] ModSecurity: Loaded PCRE do not match with compiled!
[Sun Feb 19 16:28:30.469802 2017] :notice] [pid 1767] ModSecurity: LUA compiled version="Lua 5.2"
[Sun Feb 19 16:28:30.469807 2017] :notice] [pid 1767] ModSecurity: LIBXML compiled version="2.9.1"
[Sun Feb 19 16:28:30.469811 2017] :notice] [pid 1767] ModSecurity: Status engine is currently disabled, enable it by set SecStatusEngine to On.
[Sun Feb 19 16:28:30.495914 2017] [core:error] [pid 1767] (EAI 2)Name or service not known: AH00547: Could not resolve host name site1.local -- ignoring!
[Sun Feb 19 16:28:30.495979 2017] [core:error] [pid 1767] (EAI 2)Name or service not known: AH00547: Could not resolve host name site2.local -- ignoring!
[Sun Feb 19 16:28:30.496037 2017] [core:error] [pid 1767] (EAI 2)Name or service not known: AH00547: Could not resolve host name site3.local -- ignoring!
[Sun Feb 19 16:28:31.018255 2017] [mpm_prefork:notice] [pid 1767] AH00163: Apache/2.4.16 (Linux/SUSE) OpenSSL/1.0.1i-fips configured -- resuming normal operations
[Sun Feb 19 16:28:31.018285 2017] [core:notice] [pid 1767] AH00094: Command line: '/usr/sbin/httpd-prefork -f /etc/apache2/httpd.conf -D SYSCONFIG -D SSL -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d/ -D SYSTEMD -D FOREGROUND'
Then I ran ‘rcapache2 restart’ and in the same log I see:
[Sun Feb 19 16:33:38.465418 2017] [mpm_prefork:notice] [pid 1767] AH00170: caught SIGWINCH, shutting down gracefully
[Sun Feb 19 16:33:39.000468 2017] :notice] [pid 3473] ModSecurity for Apache/2.9.0 (http://www.modsecurity.org/) configured.
[Sun Feb 19 16:33:39.000561 2017] :notice] [pid 3473] ModSecurity: APR compiled version="1.5.1"; loaded version="1.5.1"
[Sun Feb 19 16:33:39.000568 2017] :notice] [pid 3473] ModSecurity: PCRE compiled version="8.33 "; loaded version="8.39 2016-06-14"
[Sun Feb 19 16:33:39.000572 2017] :warn] [pid 3473] ModSecurity: Loaded PCRE do not match with compiled!
[Sun Feb 19 16:33:39.000575 2017] :notice] [pid 3473] ModSecurity: LUA compiled version="Lua 5.2"
[Sun Feb 19 16:33:39.000579 2017] :notice] [pid 3473] ModSecurity: LIBXML compiled version="2.9.1"
[Sun Feb 19 16:33:39.000582 2017] :notice] [pid 3473] ModSecurity: Status engine is currently disabled, enable it by set SecStatusEngine to On.
[Sun Feb 19 16:33:40.013885 2017] [mpm_prefork:notice] [pid 3473] AH00163: Apache/2.4.16 (Linux/SUSE) OpenSSL/1.0.1i-fips configured -- resuming normal operations
[Sun Feb 19 16:33:40.013934 2017] [core:notice] [pid 3473] AH00094: Command line: '/usr/sbin/httpd-prefork -f /etc/apache2/httpd.conf -D SYSCONFIG -D SSL -C PidFile /var/run/httpd.pid -C Include /etc/apache2/sysconfig.d/ -D SYSTEMD -D FOREGROUND'
[Sun Feb 19 16:33:45.022714 2017] [core:notice] [pid 3473] AH00052: child pid 3485 exit signal Segmentation fault (11)
[Sun Feb 19 16:33:47.024813 2017] [core:notice] [pid 3473] AH00052: child pid 3486 exit signal Segmentation fault (11)
[Sun Feb 19 16:33:52.029721 2017] [core:notice] [pid 3473] AH00052: child pid 3481 exit signal Segmentation fault (11)
[Sun Feb 19 16:34:16.054527 2017] [core:notice] [pid 3473] AH00052: child pid 3505 exit signal Segmentation fault (11)
[Sun Feb 19 16:34:17.056118 2017] [core:notice] [pid 3473] AH00052: child pid 3501 exit signal Segmentation fault (11)
[Sun Feb 19 16:34:17.056145 2017] [core:notice] [pid 3473] AH00052: child pid 3502 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:13.176170 2017] [core:notice] [pid 3473] AH00052: child pid 3508 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:28.191208 2017] [core:notice] [pid 3473] AH00052: child pid 3506 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:28.191241 2017] [core:notice] [pid 3473] AH00052: child pid 3507 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:38.201731 2017] [core:notice] [pid 3473] AH00052: child pid 3617 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:40.204770 2017] [core:notice] [pid 3473] AH00052: child pid 3620 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:41.207193 2017] [core:notice] [pid 3473] AH00052: child pid 3621 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:51.219528 2017] [core:notice] [pid 3473] AH00052: child pid 3555 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:54.222686 2017] [core:notice] [pid 3473] AH00052: child pid 3480 exit signal Segmentation fault (11)
[Sun Feb 19 16:36:54.222723 2017] [core:notice] [pid 3473] AH00052: child pid 3489 exit signal Segmentation fault (11)
[Sun Feb 19 16:37:13.241850 2017] [core:notice] [pid 3473] AH00052: child pid 3642 exit signal Segmentation fault (11)
and after that everything works as expected.
site1.local, site2.local and site3.local are properly configured in /etc/hosts and point to 127.0.0.1
Taking the case of this 13.2 system with an Apache Server version 2.4.10, /var/log/apache2/error_log contains some intermittent AH00052 Segmentation faults when the Server is shut down (gracefully):
once 25th April last year;
5 times (5 different child processes) during the shut down on the 24th of January this year.
So, yes there seems to be an intermittent child process Segmentation fault issue with Apache version 2.x when the server is being shut down (gracefully).
But, your machine dependent start up issue seems to be a AH00547 problem with resolving host names.
And, as you have described, not all machines and, only at boot time (restarting the Apache Server resolves the issue).
[HR][/HR]A shot in the dark:
Do the affected machines boot from a SSD or a hybrid (SSHD) disk?
If so, please check the SMART health: " # smartctl --health /dev/sda".
You could, also check the systemd Journal (the complete Journal) for increasing ECC and/or read error rates: " # journalctl | grep ‘SMART’ | grep -Ei ‘ECC|read’".
Even if the affected machines have a physically rotating drive, it may be worth checking the health of the thing.
Simple the Web server tries to start before the network is completed. No net then web server fails. Network start is variable since it depend not only on your hardware but the ISP.
But what ISP to do with Apache considering we are discussing sites hosted on the localhost? Are you saying that if I pull my cable Apache won’t start? I don’t quite understand what you mean.
@ heyjoe: As an alternative to the graphical systemd-analyze plot, you could try a search of the systemd journal:
# journalctl --this-boot | grep -iE 'network|apache'
Feb 23 08:18:49 xxx systemd[1]: Starting Host and Network Name Lookups.
Feb 23 08:18:49 xxx systemd[1]: Reached target Host and Network Name Lookups.
Feb 23 08:18:49 xxx avahi-daemon[891]: Network interface enumeration completed.
Feb 23 08:18:49 xxx systemd[1]: Starting wicked network management service daemon...
Feb 23 08:18:49 xxx systemd[1]: Started wicked network management service daemon.
Feb 23 08:18:49 xxx systemd[1]: Starting wicked network nanny service...
Feb 23 08:18:49 xxx systemd[1]: Started wicked network nanny service.
Feb 23 08:18:58 xxx systemd[1]: Started wicked managed network interfaces.
Feb 23 08:18:58 xxx systemd[1]: Starting Network.
Feb 23 08:18:58 xxx systemd[1]: Reached target Network.
Feb 23 08:18:58 xxx systemd[1]: Starting The Apache Webserver...
Feb 23 08:18:58 xxx systemd[1]: Starting Network is Online.
Feb 23 08:18:58 xxx systemd[1]: Reached target Network is Online.
Feb 23 08:19:00 xxx start-ntpd[1441]: Starting network time protocol daemon (NTPD)sntp 4.2.8p8@1.3265-o Thu Sep 22 14:57:28 UTC 2016 (1)
Feb 23 08:19:00 xxx systemd[1]: Started The Apache Webserver.
#
Note that, the Apache Webserver is started after the systemd target “Network” has been reached but, before the systemd step “Network is Online” is being started.
BTW, the dates where the AH00052 Segmentation faults occurred were the timestamps in “/var/log/apache2/error_log”.
Ok here is that output (including the manual restarting of Apache after boot):
# journalctl --this-boot | grep -iE 'network|apache'
Feb 23 09:28:29 xxx systemd-udevd[251]: Network interface NamePolicy= disabled on kernel commandline, ignoring.
Feb 23 09:28:29 xxx systemd-udevd[251]: Network interface NamePolicy= disabled on kernel commandline, ignoring.
Feb 23 09:28:38 xxx systemd-udevd[552]: Network interface NamePolicy= disabled on kernel commandline, ignoring.
Feb 23 09:28:38 xxx kernel: e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
Feb 23 09:28:38 xxx kernel: e1000e 0000:00:19.0 eth1: Intel(R) PRO/1000 Network Connection
Feb 23 09:28:39 xxx avahi-daemon[1109]: Network interface enumeration completed.
Feb 23 09:28:40 xxx systemd[1]: Starting wicked network management service daemon...
Feb 23 09:28:40 xxx systemd[1]: Started wicked network management service daemon.
Feb 23 09:28:40 xxx systemd[1]: Starting wicked network nanny service...
Feb 23 09:28:40 xxx systemd[1]: Started wicked network nanny service.
Feb 23 09:28:40 xxx systemd[1]: Starting wicked managed network interfaces...
Feb 23 09:28:43 xxx systemd[1]: Started wicked managed network interfaces.
Feb 23 09:28:43 xxx systemd[1]: Starting Network.
Feb 23 09:28:43 xxx systemd[1]: Reached target Network.
Feb 23 09:28:43 xxx systemd[1]: Starting Network UPS Tools - power device driver controller...
Feb 23 09:28:43 xxx systemd[1]: Starting Network is Online.
Feb 23 09:28:43 xxx systemd[1]: Reached target Network is Online.
Feb 23 09:28:43 xxx start-ntpd[1476]: Starting network time protocol daemon (NTPD)
Feb 23 09:28:43 xxx systemd[1]: Starting The Apache Webserver...
Feb 23 09:28:45 xxx systemd[1]: Started The Apache Webserver.
Feb 23 09:28:48 xxx upsdrvctl[1481]: Network UPS Tools - Generic HID driver 0.38 (2.7.1)
Feb 23 09:28:49 xxx systemd[1]: Started Network UPS Tools - power device driver controller.
Feb 23 09:28:49 xxx systemd[1]: Starting Network UPS Tools - power devices information server...
Feb 23 09:28:49 xxx systemd[1]: Started Network UPS Tools - power devices information server.
Feb 23 09:28:49 xxx systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Feb 23 09:28:49 xxx upsdrvctl[1481]: Network UPS Tools - UPS driver controller 2.7.1
Feb 23 09:28:49 xxx systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.
Feb 23 09:33:22 xxx systemd[1]: Stopping The Apache Webserver...
Feb 23 09:33:22 xxx systemd[1]: Starting The Apache Webserver...
Feb 23 09:33:24 xxx systemd[1]: Started The Apache Webserver.
It seems in my case network is online before starting Apache.
BTW bugzilla seems not helpful. Someone just replied to the ticket requesting need-info but refused to even explain the steps how to get this info with an explanation that he has no time for this. This is the third time in my recent bugzilla reports when someone replies with something in the style “STFW for how to debug” which is quite frustrating.
Hm. But the dates in my log are from this February and you were talking about 25.April last year. That’s why I wonder where this info comes from?
In my case, the AH00052 Segmentation faults are (quite) intermittent: initially 9 months between occurrences; and then since the 24th of January this year, nothing . . .
In your case, with a later version of Apache, it seems that the AH00052 Segmentation faults are occurring regularly – not intermittently . . .
More specific to your case, are the AH00547 errors being reported by the Apache core.
[HR][/HR]Why the concern related to the Segmentation faults?
A segmentation fault (often shortened to segfault) or access violation is a fault, or failure condition, raised by hardware with memory protection, notifying an operating system (OS) the software has attempted to access a restricted area of memory (a memory access violation).
They arise primarily due to errors in use of pointers for virtual memory addressing, particularly illegal access.
In other words, for folks concerned about software quality, Segmentation faults are bad news – they are an indication of a fundamental programming error – a correct implementation never suffers from Segmentation faults.
[HR][/HR]And the AH00547 errors?
It seems that the network Name Service cache is taking some time to initialise and there is something in the Apache start-up which needs to access a named network resource which the network Name Service cannot, at that point in time, resolve.
You could try: " # journalctl --this-boot | grep -iE ‘name service|network|apache’"
The systemd “nscd.service” is the “Name Service Cache Daemon” which is normally started before the Network is started.
The systemd “systemd-hostnamed.service” is the “Hostname Service” which is normally started after the Apache Webserver has started.
All these sound like serious issues. I wonder why the neglecting attitude I received in bugzilla. Someone should definitely look into this. I am not sure what to do.
Here is the output of today’s boot:
# journalctl --this-boot | grep -iE 'name service|network|apache'
Feb 24 08:59:35 xxx systemd-udevd[250]: Network interface NamePolicy= disabled on kernel commandline, ignoring.
Feb 24 08:59:35 xxx systemd-udevd[250]: Network interface NamePolicy= disabled on kernel commandline, ignoring.
Feb 24 08:59:43 xxx systemd-udevd[561]: Network interface NamePolicy= disabled on kernel commandline, ignoring.
Feb 24 08:59:44 xxx kernel: e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
Feb 24 08:59:44 xxx kernel: e1000e 0000:00:19.0 eth1: Intel(R) PRO/1000 Network Connection
Feb 24 08:59:44 xxx avahi-daemon[1128]: Network interface enumeration completed.
Feb 24 08:59:46 xxx systemd[1]: Starting wicked network management service daemon...
Feb 24 08:59:46 xxx systemd[1]: Started wicked network management service daemon.
Feb 24 08:59:46 xxx systemd[1]: Starting wicked network nanny service...
Feb 24 08:59:46 xxx systemd[1]: Started wicked network nanny service.
Feb 24 08:59:46 xxx systemd[1]: Starting wicked managed network interfaces...
Feb 24 08:59:48 xxx systemd[1]: Started wicked managed network interfaces.
Feb 24 08:59:48 xxx systemd[1]: Starting Network.
Feb 24 08:59:48 xxx systemd[1]: Reached target Network.
Feb 24 08:59:49 xxx systemd[1]: Starting Network UPS Tools - power device driver controller...
Feb 24 08:59:49 xxx systemd[1]: Starting Network is Online.
Feb 24 08:59:49 xxx systemd[1]: Reached target Network is Online.
Feb 24 08:59:49 xxx start-ntpd[1496]: Starting network time protocol daemon (NTPD)
Feb 24 08:59:49 xxx systemd[1]: Starting The Apache Webserver...
Feb 24 08:59:51 xxx systemd[1]: Started The Apache Webserver.
Feb 24 08:59:54 xxx upsdrvctl[1500]: Network UPS Tools - Generic HID driver 0.38 (2.7.1)
Feb 24 08:59:55 xxx systemd[1]: Started Network UPS Tools - power device driver controller.
Feb 24 08:59:55 xxx upsdrvctl[1500]: Network UPS Tools - UPS driver controller 2.7.1
Feb 24 08:59:55 xxx systemd[1]: Starting Network UPS Tools - power devices information server...
Feb 24 08:59:55 xxx systemd[1]: Started Network UPS Tools - power devices information server.
Feb 24 08:59:55 xxx systemd[1]: Starting Network UPS Tools - power device monitor and shutdown controller...
Feb 24 08:59:55 xxx systemd[1]: Started Network UPS Tools - power device monitor and shutdown controller.
Feb 24 09:05:37 xxx systemd[1]: Starting Hostname Service...
Feb 24 09:05:37 xxx systemd[1]: Started Hostname Service.
BTW today I noticed another weirdness with Apache. I wonder if it is related in any way. I asked about it in this thread.
On my Leap 42.2 AMD Laptop the systemd journal network and name service trace looks like this (shortened because of Network Manager and WLAN):
# journalctl --this-boot | grep -iE 'name service|network|apache'
Feb 27 09:41:35 xxx systemd-udevd[180]: Network interface NamePolicy= disabled by default.
Feb 27 09:41:35 xxx systemd-udevd[180]: Network interface NamePolicy= disabled by default.
Feb 27 09:41:45 xxx systemd-udevd[481]: Network interface NamePolicy= disabled by default.
Feb 27 09:41:48 xxx kernel: audit: type=1400 audit(1488184908.185:7): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2" pid=517 comm="apparmor_parser"
Feb 27 09:41:48 xxx kernel: audit: type=1400 audit(1488184908.185:8): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2//DEFAULT_URI" pid=517 comm="apparmor_parser"
Feb 27 09:41:48 xxx kernel: audit: type=1400 audit(1488184908.185:9): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2//HANDLING_UNTRUSTED_INPUT" pid=517 comm="apparmor_parser"
Feb 27 09:41:48 xxx kernel: audit: type=1400 audit(1488184908.185:10): apparmor="STATUS" operation="profile_load" name="/usr/lib/apache2/mpm-prefork/apache2//phpsysinfo" pid=517 comm="apparmor_parser"
Feb 27 09:41:48 xxx kernel: FUJITSU Extended Socket Network Device Driver - version 1.0 - Copyright (c) 2015 FUJITSU LIMITED
Feb 27 09:42:03 xxx systemd[1]: Starting Name Service Cache Daemon...
Feb 27 09:42:03 xxx systemd[1]: Started Name Service Cache Daemon.
Feb 27 09:42:03 xxx systemd[1]: Reached target Host and Network Name Lookups.
Feb 27 09:42:04 xxx avahi-daemon[1107]: Network interface enumeration completed.
Feb 27 09:42:05 xxx systemd[1]: Starting Network Manager...
Feb 27 09:42:06 xxx NetworkManager[1227]: <info> NetworkManager (version 1.0.12) is starting...
Feb 27 09:42:06 xxx NetworkManager[1227]: <info> Read config: /etc/NetworkManager/NetworkManager.conf
Feb 27 09:42:06 xxx systemd[1]: Started Network Manager.
Feb 27 09:42:06 xxx systemd[1]: Reached target Network
.
.
.
.
#
Is the Name Service Cache daemon (nscd) disabled on the troubled machines?
# systemctl status nscd.service
● nscd.service - Name Service Cache Daemon
Loaded: loaded (/usr/lib/systemd/system/nscd.service; enabled; vendor preset: enabled)
Active: active (running) since Mo 2017-02-27 09:42:03 CET; 10min ago
Process: 1105 ExecStart=/usr/sbin/nscd (code=exited, status=0/SUCCESS)
Main PID: 1116 (nscd)
Tasks: 11 (limit: 512)
CGroup: /system.slice/nscd.service
└─1116 /usr/sbin/nscd
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring directory `/etc` (2)
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring file `/etc/resolv.conf` (5)
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring directory `/etc` (2)
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring file `/etc/services` (6)
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring directory `/etc` (2)
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring file `/etc/netgroup` (7)
Feb 27 09:42:03 xxx nscd[1116]: 1116 monitoring directory `/etc` (2)
Feb 27 09:42:03 xxx systemd[1]: Started Name Service Cache Daemon.
Feb 27 09:43:37 xxx nscd[1116]: 1116 monitored file `/etc/resolv.conf` was written to
Feb 27 09:43:39 xxx nscd[1116]: 1116 monitored file `/etc/resolv.conf` was written to
#
BTW: it seems that the Leap 42.2 systems (and possibly the Leap 42.1 systems) disable the Network interface Name Policy by default – therefore no need to disable it on the kernel command line.
I really don’t know. I was hoping someone who does to help.
BTW are we sure about that? In my pasted log I don’t see “Name Service Cache Daemon”. I see only “Hostname Service” (is it the same?). So how can we know when nscd is started? And is this the actual reason for the main issue?