On behalf of our network manager, I report a problem trying to clone our current amavis milter (spamassassin and clamav) from an opensuse 11.4 (which we have been using, or its predecessors, happily since 10 years) to a new machine with OpenSuSE 42.2 and bundled sendmail and amavis.
We were used to the /etc/init.d mechanism to start services. Now Leap uses “rcamavis start” or “systemctl start amavis.service”. When we use it we see in /var/spool/amavis (dir) that both sockets are created (the one from sendmail to amavis and the one from amavis to spamassassin), though perhaps with funny permissions
srwxr-xr-x 1 vscan vscan 0 Feb 3 15:08 amavis-milter.sock
-rw-r–r-- 1 vscan vscan 6 Feb 3 15:08 amavisd-milter.pid
-rw-r----- 1 vscan vscan 0 Feb 3 15:08 amavisd.lock
-rw-r----- 1 vscan vscan 6 Feb 3 15:08 amavisd.pid
srwxr-x— 1 vscan vscan 0 Feb 3 15:08 amavisd.sock
however if we issue a “netstat -na” we see only one socket is in listen state:
(which is what the rcamavis should do) then both milter and amavis start, both sockets are created, both sockets remain in listen as shown by netstat-a, amavis-milter remains running and mail messages are processed normally.
What can be the cause of the sudden death of amavis-milter when started by rcamavis ?
Thanks
Can you post the contents of your amavis.service Unit file?
It’s likely at
/usr/lib/systemd/system/amavis.service
Also post the actual results of your command
systemctl status amavis.service
BTW - I wouldn’t automatically assume a “connection refused” error is caused by something “dying,” usually when I see that error it means that something exists but is blocking.
The “connection refused” was caused by /usr/sbin/amavisd-milter dying because /var/log/mail showed it started, and a ps later showed it was not running. Also netstat -na showed the socket (which existed) ***was NOT ***open for listen
Incidentally … amavis gives funny messages on the sockets. It said connection refused if the socket existed but was not open for listen. If the socket was deleted instead it gave “socket unsafe” errors ! @Miuku: all the permissions are OK for vscan:vscan … if was one of the first things we checked even before posting here !
This is the unit file as it stands now … we replaced the two red lines with the green line
# This file is part of package amavisd-new.
#
# Copyright (c) 2011 SuSE LINUX Products GmbH, Germany.
# Author: Werner Fink
# Please send feedback to http://www.suse.de/feedback
#
# Description:
#
# Used to start the amavisd
# Amavisd-new is a high-performance interface between mailer (MTA) and
# content checkers: virus scanners or SpamAssassin. It talks to the MTA
# via (E)SMTP, LMTP, or by using helper programs.
#
[Unit]
Description=Amavisd-new Virus Scanner interface
Requires=var-run.mount
Wants=nss-lookup.target network.target remote-fs.target time-sync.target
After=var-run.mount nss-lookup.target network.target remote-fs.target time-sync.target
Wants=named.service clamd.service
After=named.service clamd.service
Before=mail-transfer-agent.target
[Service]
Type=forking
PIDFile=/var/spool/amavis/amavisd.pid
ExecStartPre=-/bin/echo 'Starting virus-scanner (amavisd-new):'
EnvironmentFile=-/etc/sysconfig/amavis
### IFC ### ExecStartPre=/usr/sbin/amavisd-milter.sh start
### IFC ### ExecStart=/usr/sbin/amavisd start
ExecStart=/ifc/util/amavisd-ifc.patch
ExecStartPost=
ExecReload=/usr/sbin/amavisd reload
ExecStop=/usr/sbin/amavisd stop
ExecStopPost=/usr/sbin/amavisd-milter.sh stop
[Install]
WantedBy=multi-user.target
Essentially everything works all right if the two commands to start the milter and amavis are issued from the following script, not if they are issued by ExecStarPre and ExecStart
I do not think it has much sense to post the output of systemctl status now, that everything works, and we cannot downgrade to the situation when it was not working (the machine has been put to production), but anyhow here it is
I wonder if your problem is one of timing…
systemd is capable of running processes in parallel but your posting suggests to me that amavisd-milter may be trying to start before amavisd is fully started.
If someone simply copied what happens in a SysVinit system, because SysVinit has limited or no parallelism, then a timing issue wouldn’t be seen, but might be possible in systemd.
I have the same problem here and narrowed it down to systemd (pid 1) killing the amavisd-milter process. You can check/verify if that is the same in your case by stracing pid 1 while restarting amavis. I have not yet figured out a way to debug why it is doing that. This also explains why a wrapper script works fine: The wrapper script is killed, but the processes started by it are not.
My suspicion is that it has to do something with PID 1 becominig the parent of the process somehow.