problem starting amavis

On behalf of our network manager, I report a problem trying to clone our current amavis milter (spamassassin and clamav) from an opensuse 11.4 (which we have been using, or its predecessors, happily since 10 years) to a new machine with OpenSuSE 42.2 and bundled sendmail and amavis.

We were used to the /etc/init.d mechanism to start services. Now Leap uses “rcamavis start” or “systemctl start amavis.service”. When we use it we see in /var/spool/amavis (dir) that both sockets are created (the one from sendmail to amavis and the one from amavis to spamassassin), though perhaps with funny permissions

srwxr-xr-x 1 vscan vscan 0 Feb 3 15:08 amavis-milter.sock
-rw-r–r-- 1 vscan vscan 6 Feb 3 15:08 amavisd-milter.pid
-rw-r----- 1 vscan vscan 0 Feb 3 15:08 amavisd.lock
-rw-r----- 1 vscan vscan 6 Feb 3 15:08 amavisd.pid
srwxr-x— 1 vscan vscan 0 Feb 3 15:08 amavisd.sock

however if we issue a “netstat -na” we see only one socket is in listen state:

unix 2 [ACC] STREAM LISTENING 711994 /var/spool/amavis/amavisd.sock

and milter-amavis gives “error connecting to filter: Connection refused by /var/spool/amavis/amavis-milter.sock”

Apparently amavis-milter starts OK (as from log entry) but then immediately dies without notice !!!

If we do start MANUALLY as root invoking in sequence (as instructed in amavis.service):

/usr/sbin/amavisd-milter.sh start
/usr/sbin/amavisd start

(which is what the rcamavis should do) then both milter and amavis start, both sockets are created, both sockets remain in listen as shown by netstat-a, amavis-milter remains running and mail messages are processed normally.

What can be the cause of the sudden death of amavis-milter when started by rcamavis ?
Thanks

Can you post the contents of your amavis.service Unit file?
It’s likely at

/usr/lib/systemd/system/amavis.service

Also post the actual results of your command

systemctl status amavis.service

BTW - I wouldn’t automatically assume a “connection refused” error is caused by something “dying,” usually when I see that error it means that something exists but is blocking.

TSU

Check permissions for; /var/spool/amavis. The directory and EVERYTHING in it should be owned by vscan:vscan, including all hidden files.

The “connection refused” was caused by /usr/sbin/amavisd-milter dying because /var/log/mail showed it started, and a ps later showed it was not running. Also netstat -na showed the socket (which existed) ***was NOT ***open for listen

Incidentally … amavis gives funny messages on the sockets. It said connection refused if the socket existed but was not open for listen. If the socket was deleted instead it gave “socket unsafe” errors !
@Miuku: all the permissions are OK for vscan:vscan … if was one of the first things we checked even before posting here !:frowning:

This is the unit file as it stands now … we replaced the two red lines with the green line


# This file is part of package amavisd-new.
#
# Copyright (c) 2011 SuSE LINUX Products GmbH, Germany.
# Author: Werner Fink
# Please send feedback to http://www.suse.de/feedback
#
# Description:
#
#  Used to start the amavisd
#       Amavisd-new is a high-performance interface between mailer (MTA) and
#       content checkers: virus scanners or SpamAssassin. It talks to the MTA
#       via (E)SMTP, LMTP, or by using helper programs.
#
[Unit]
Description=Amavisd-new Virus Scanner interface
Requires=var-run.mount
Wants=nss-lookup.target network.target remote-fs.target time-sync.target
After=var-run.mount nss-lookup.target network.target remote-fs.target time-sync.target
Wants=named.service clamd.service
After=named.service clamd.service
Before=mail-transfer-agent.target

[Service]
Type=forking
PIDFile=/var/spool/amavis/amavisd.pid
ExecStartPre=-/bin/echo 'Starting virus-scanner (amavisd-new):'
EnvironmentFile=-/etc/sysconfig/amavis
### IFC ### ExecStartPre=/usr/sbin/amavisd-milter.sh start
### IFC ### ExecStart=/usr/sbin/amavisd start
ExecStart=/ifc/util/amavisd-ifc.patch
ExecStartPost=
ExecReload=/usr/sbin/amavisd reload
ExecStop=/usr/sbin/amavisd stop
ExecStopPost=/usr/sbin/amavisd-milter.sh stop

[Install]
WantedBy=multi-user.target

Essentially everything works all right if the two commands to start the milter and amavis are issued from the following script, not if they are issued by ExecStarPre and ExecStart

#!/bin/bash
/usr/sbin/amavisd-milter.sh start
/usr/sbin/amavisd start
exit

I do not think it has much sense to post the output of systemctl status now, that everything works, and we cannot downgrade to the situation when it was not working (the machine has been put to production), but anyhow here it is

● amavis.service - Amavisd-new Virus Scanner interface
   Loaded: loaded (/usr/lib/systemd/system/amavis.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2017-02-14 16:26:40 CET; 24h ago
  Process: 1967 ExecStart=/ifc/util/amavisd-ifc.patch (code=exited, status=0/SUCCESS)
  Process: 1963 ExecStartPre=/bin/echo Starting virus-scanner (amavisd-new): (code=exited, status=0/SUCCESS)
 Main PID: 2009 (/usr/sbin/amavi)
    Tasks: 5 (limit: 512)
   CGroup: /system.slice/amavis.service
           ├─1973 /usr/sbin/amavisd-milter -s local:/var/spool/amavis/amavisd-milter.sock -p /var/spool/amavis/amavisd-milter.pid
           ├─2009 /usr/sbin/amavisd (master)              
           ├─2049 /usr/sbin/amavisd (ch17-avail)          
           └─2050 /usr/sbin/amavisd (ch18-avail)          

Feb 14 16:26:27 oort systemd[1]: Starting Amavisd-new Virus Scanner interface...
Feb 14 16:26:27 oort echo[1963]: Starting virus-scanner (amavisd-new):
Feb 14 16:26:40 oort systemd[1]: Started Amavisd-new Virus Scanner interface.

I give also the netstat -na output (the milter-sock one was missing when it was not running.

unix  2       ACC ]     STREAM     LISTENING     22865  /var/spool/amavis/amavisd-milter.sock
unix  2       ACC ]     STREAM     LISTENING     23250  /var/spool/amavis/amavisd.sock

I wonder if your problem is one of timing…
systemd is capable of running processes in parallel but your posting suggests to me that amavisd-milter may be trying to start before amavisd is fully started.

If someone simply copied what happens in a SysVinit system, because SysVinit has limited or no parallelism, then a timing issue wouldn’t be seen, but might be possible in systemd.

IMO you should submit a bug to https://bugzilla.opensuse.org referencing this Forum thread.

TSU

I have the same problem here and narrowed it down to systemd (pid 1) killing the amavisd-milter process. You can check/verify if that is the same in your case by stracing pid 1 while restarting amavis. I have not yet figured out a way to debug why it is doing that. This also explains why a wrapper script works fine: The wrapper script is killed, but the processes started by it are not.

My suspicion is that it has to do something with PID 1 becominig the parent of the process somehow.