Asterisk loses remote port over VPN

I have a slightly strange problem which has been happening for some years. I have Asterisk servers running on two machines with a VPN tunnel between them. Both systems have IAX and SIP connections to each other. Every so often one or both services on one or both of the machines cannot reach the other end. However, other functionality using the same VPN between the same two machines continues without any problem. The only consistent way I have found to recover the connection is to reboot the system(s) running Asterisk. I have tried restarting Asterisk on both systems, rebooting routers and restarting the tunnel, but that does not fix the issue. It’s as though the networking in openSUSE has locked on to the failed route and won’t let it go, even though it will let others work OK.

If anyone has any ideas on how to diagnose this further I would really appreciate the information.

Can you enable logging on the Asterisk servers?

See “monitor 24/7 but retain only 15 minutes?” for how to make a rolling Wireshark capture. If you detect the connection going down within an hour, this will make sure you have captured the last traffic, and that might give a clue.
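A rough sketch of what such a rolling capture could look like, using dumpcap's ring buffer options (`-b duration:` and `-b files:`). The interface name `tun0` and the output path are placeholders I made up, and the filter assumes the default ports (UDP 5060 for SIP, UDP 4569 for IAX2), so adjust them to your setup:

```python
import shlex

SIP_PORT = 5060   # default SIP signalling port (assumed, UDP)
IAX2_PORT = 4569  # default IAX2 port (assumed, UDP)


def ring_capture_cmd(interface, out_file, seconds_per_file=300, file_count=3):
    """Build a dumpcap command that keeps only the newest `file_count` files.

    With the defaults this retains roughly 3 x 5 minutes = 15 minutes of traffic.
    """
    capture_filter = f"udp port {SIP_PORT} or udp port {IAX2_PORT}"
    return [
        "dumpcap",
        "-i", interface,                      # capture interface (hypothetical name)
        "-f", capture_filter,                 # only VoIP signalling, to keep files small
        "-b", f"duration:{seconds_per_file}", # start a new file after this many seconds
        "-b", f"files:{file_count}",          # ring buffer: overwrite the oldest file
        "-w", out_file,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it as root.
    print(shlex.join(ring_capture_cmd("tun0", "/var/tmp/voip.pcapng")))
```

Because only the signalling ports are captured and old files are overwritten, leaving this running for weeks should not fill the disk.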

The problem is that it might take several weeks before the failure occurs, and it is not immediately obvious when it does. Maybe logging would give some clue, but I am not sure the amount of data involved would be easy to go through. Thanks for the observation though.

If the problem is not immediately obvious it is maybe better to take care of that first.

Are the clients connected to the Asterisk servers? If so, can you ping a client connected to the Asterisk server from the other side of the VPN tunnel?

Sorry for delay in replying, did not mean to be so long before I checked.

The only thing that goes wrong is that one or other of IAX or SIP, rarely both, won’t connect over the tunnel and in most cases the other is still OK. All other functions I use over the tunnel continue to work without any problem at all.

One thing that is probably relevant is that I think this happens after the tunnel has failed and had to be restarted. As that process is automatic, I can’t tell for sure when the Asterisk issue arises in relation to that restart.

Without more information it is almost impossible to debug this, so my suggestion is to focus first on getting more logging.

It should be possible to log when the tunnel is restarted, and using that timestamp you can have a look at the Asterisk logging.
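If the VPN software itself does not log restarts, something like the following could produce the timestamps. It is only a sketch: it assumes the tunnel going down shows up as a failed ping to the far end, and the peer address and log path passed to `watch` are whatever you choose. The `ping -c 1 -W 2` flags are the Linux iputils ones.

```python
import subprocess
import time
from datetime import datetime


def transitions(samples):
    """Return (timestamp, state) for every sample whose state differs from the previous one.

    The first sample always counts as a change, so the initial state is recorded.
    """
    changes = []
    previous = None
    for stamp, is_up in samples:
        if is_up != previous:
            changes.append((stamp, "UP" if is_up else "DOWN"))
            previous = is_up
    return changes


def peer_reachable(address):
    """Send one ICMP echo with a 2 second timeout; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", address],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def watch(address, log_path, interval=30):
    """Poll the peer forever and append a line only when the state changes."""
    previous = None
    while True:
        is_up = peer_reachable(address)
        if is_up != previous:
            stamp = datetime.now().isoformat(timespec="seconds")
            with open(log_path, "a") as log:
                log.write(f"{stamp} tunnel {'UP' if is_up else 'DOWN'}\n")
            previous = is_up
        time.sleep(interval)
```

Since a line is written only on a change, the log stays tiny even over several weeks, and each DOWN/UP pair can be compared with the time Asterisk first reported the peer as unreachable.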

It has happened again, when one end had a power failure. The other end showed both IAX and SIP becoming unreachable. When the remote end came back, IAX connected without a problem but SIP seems to do nothing. I’ve tried reloading and restarting Asterisk and changing the nat value in the SIP config, but it still doesn’t show anything other than that it is retrying messages. The other end is quite happy with the connection. I also ran nmap, and that indicates that the port is open at the remote end. From previous experience, if I reboot the machine SIP will be OK. That still suggests that the problem is with Linux and not Asterisk, or at least between Linux and Asterisk.

Can you please tell me why you do not want to invest time in enabling logging?

It’s not a case of not wanting to enable logging. Can you suggest what logging I could turn on which might help to find the source of the problem? There is also the issue of the potential amount of data that I might need to collect if the problem doesn’t manifest for several weeks. Also, once the failure is established it remains, and I’ve tried using Wireshark and Linux network tools to investigate the packet flow. It is possibly my lack of detailed knowledge of the way the packets are routed that is holding me back, even though I’ve tried looking at packets which are working.

See the link on Asterisk logging I shared earlier; you can enable logging on the fly.

Another way I found here:

  1. use “sip show registry” inside of asterisk to display the outgoing registrations
  2. enable sip debugging: “sip set debug on” (shows the sip traffic within asterisk cli)
  3. force a register attempt: “sip reload” and monitor the cli for appearing sip messages

If step 2 only shows outgoing but not incoming packets, you might have a firewall issue.

As to:

Instead of rebooting the machine, can you bring the interface that connects to the other side hard down and up again, either by unplugging and replugging the cable or by using “ip link … up/down”?
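For the software route, a minimal sketch of the down/up cycle is below. The interface name you pass in is yours to fill in (I use `eth0` purely as an example), the commands need root, and it defaults to a dry run that only prints what it would do. Don't run it for real over an SSH session that goes through the same interface.

```python
import subprocess
import time


def bounce_commands(interface):
    """Return the two iproute2 commands that take a link down and bring it back up."""
    return [
        ["ip", "link", "set", "dev", interface, "down"],
        ["ip", "link", "set", "dev", interface, "up"],
    ]


def bounce(interface, pause=5.0, dry_run=True):
    """Run (or just print) the down/up cycle, waiting `pause` seconds in between."""
    down, up = bounce_commands(interface)
    for index, command in enumerate((down, up)):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # raises if ip reports an error
        if index == 0 and not dry_run:
            time.sleep(pause)  # give the link time to actually drop


if __name__ == "__main__":
    bounce("eth0")  # example interface name; dry run only
```

If SIP recovers after this without a reboot, that would point at interface or routing state rather than at Asterisk itself.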