Several web sites and service APIs won’t respond

Hello everyone,

I installed LEAP 42.1 several times this week with and without internet connection during install, terminating in a fully updated system at one point or another each time.
At first I thought this was my fault during installation or setup but it persists until now even after several installs.

Specific web sites and services, which need some sort of server <-> client communication just won’t work at all and stop loading any output.
This includes (the ones I’ve experienced):

Yahoo Mail
iCloud Web
Codepen
Spotify Web (and desktop application for that matter)
Encryptr (as well as desktop application)
The Account Registration API for SUSE!
(I had to register and am writing from my OSX laptop,…shame…)

Services that DO work:

Google (all services)
JSFiddle
Facebook

This is on a fresh, unmodified(!) install and true for ANY browser I used (Konqueror, Firefox, Chrome, Chromium, Vivaldi)
I’m connecting wirelessly to a modified OpenWRT router on all machines and not having troubles on ANY other device (Windows/OSX/Fedora/Debian)
I’m not using any special firewall settings, routing tables or whatever fancy networking stuff, just a fresh install of LEAP
I did deactivate the firewall offered by SUSE in the installation though (might that be a hint?)

Any ideas here? Any console logs I can provide? Where to start?

This reads like a DNS issue. Have you explicitly set the nameserver? Using DHCP? Does the router provide this function?

grep -i "name" /etc/resolv.conf

Demonstrate the issue with

host -ta yahoo.com
/usr/sbin/traceroute yahoo.com

Perhaps try restarting the router as part of the diagnostics too.
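The checks above can be bundled into a small helper. This is only a sketch: the function name `list_nameservers` is made up for illustration, and it merely reads resolv.conf-style text from standard input and prints the configured nameserver addresses, so that each one can then be queried separately with `host`.

```shell
# Hypothetical helper (name invented for this sketch):
# print every address found on a "nameserver" line of resolv.conf-style input.
# Commented-out entries are skipped because their first field is "#".
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Possible use on a real system (not run here), asking each server in turn:
#   for ns in $(list_nameservers < /etc/resolv.conf); do
#       host -t a yahoo.com "$ns"
#   done
```

If one server answers and another does not, that narrows the problem down to a specific resolver rather than to name resolution in general.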

Thanks for the pointers deano,

I rebooted my router from inside the OpenWRT GUI LuCI and issued your commands on the machine in question.


~> grep -i "name" /etc/resolv.conf
nameserver 192.168.1.1
nameserver fe80::ee08:6bff:fee8:28c5%wlan0

~> host -ta yahoo.com
yahoo.com has address 98.139.183.24
yahoo.com has address 206.190.36.45
yahoo.com has address 98.138.253.109

~> /usr/sbin/traceroute yahoo.com
traceroute to yahoo.com (98.138.253.109), 30 hops max, 60 byte packets
1 OpenWrt.lan (192.168.1.1) X ms ...
2 dsl-servicio-l200.uninet.net.mx (....) X ms...
3 another uninet ping
// then some telia pings and from #8 on yahoo pings until hop #13 where there's just asterisks until #30
.
8 ae-6.pat2.nez.yahoo.com (...) X ms
.
.
.
13 * * *
14 * * *
.
.
.
30 * * *

Is this of any help?
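When comparing traces from a working and a failing client, it can help to reduce each trace to the last hop that answered. The sketch below assumes the usual Linux traceroute layout (hop number first, round-trip times ending in "ms"); `last_responding_hop` is an illustrative name, not an existing tool. Rows of asterisks at the end of a trace often only mean that those hosts do not answer the probes, so on their own they are not proof of a broken path.

```shell
# Hypothetical helper: read traceroute output on stdin and print the number
# of the last hop that returned at least one timing ("... ms").
last_responding_hop() {
    awk '$1 ~ /^[0-9]+$/ && / ms/ { hop = $1 }
         END { if (hop != "") print hop }'
}

# Possible use (not run here):
#   /usr/sbin/traceroute yahoo.com | last_responding_hop
```

Running the same command on a client that works and on one that does not gives two numbers that are easy to compare.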

So, openSUSE is using the router for name resolution, (and it in turn will have at least one external DNS configured). The problem is external to openSUSE. The host command came back with IP addresses for yahoo.com, so that was a good sign. Assuming any associated web servers are up, you should be able to use a browser successfully as well. In general, you can do the same for any internet domain name. Since you restarted the router, are there still sites that you can’t reach?

Yes, the issue persists. As mentioned I have access to devices connecting to the very same network and there’s no sign of issues with either OSX, iOS or Windows (Not so sure about an android phone which I’ll test later). I never have played with the DNS settings of OpenWRT but if it works on said machines, why would it not on Linux? Actually I’m trying to get some more output and on a laptop with Fedora it happens to be the same…

iCloud web service at least throws a connection error with bug report after logging in with a valid account:


// www.icloud.com
ERROR
authDidNotConnect–KVSFailed
...
callback functions(){}

Also spotify when launched from cli throws:


E [ap_handler_impl.cpp:872          ] Connection error:    ap_ping_timeout

Is there a possibility that OpenWRT being a linux distro handles linux distros different than Mac OS, based just as much on unix?
By the way, I also get a sandbox error when launching apps from cli (“InitializeSandbox() called with multiple threads in process gpu-process”), but I guess that’s another story.
I guess I’ll also turn to OpenWRT forums for this.

Install and run httping for any http services (Websites, xml services, anything that uses an http header for connecting).

That will verify that the server is reachable by either name or IP address, and show any latencies.
It may or may not be a DNS error. Depending on when you launch the app that’s trying to connect, there may also be a timing issue, e.g. are you trying to connect on boot or when the user logs in, or are you launching the app manually?

TSU
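As a rough sketch of how such a probe might be invoked: httping sends plain HTTP unless asked to use SSL/TLS with `-l`, so an endpoint on port 443 probably needs that flag, and a service that does not speak HTTP at all may never return reply headers. The helper below is hypothetical (the name `httping_cmd` is made up); it only prints a suggested command line rather than running it, using httping's `-c` (count), `-l` (SSL) and `-g` (URL) options.

```shell
# Hypothetical helper: print (do not run) an httping command line for a host
# and port. Port 443 is assumed to be HTTPS, anything else plain HTTP.
httping_cmd() {
    host=$1
    port=${2:-80}
    if [ "$port" = "443" ]; then
        printf 'httping -c 3 -l -g https://%s/\n' "$host"
    else
        printf 'httping -c 3 -g http://%s:%s/\n' "$host" "$port"
    fi
}

# Possible use: print the suggestion, review it, then run it by hand.
#   httping_cmd www.icloud.com 443
```

Printing the command first keeps the sketch harmless on a machine without network access and makes the chosen flags easy to review.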

Thanks for chiming in. I’ll try to explain the situation further:

The explained issue is true for many (not any!) web interfaces that need some sort of interaction, like login/register forms, web applications stores etc. Mostly stuff that has some account managing or database read to do. It’s not about a specific desktop app trying to connect to the web.
Only those are nicer to debug since I can run them from cli (Spotify e.g.). So all of this is after booting the system and after nm connecting to an available network.

I’m using NetworkManager Service over Wicked. I tried with both though. I tried with automatic DHCP/DNS and manually setting everything.
In my case the router is at 192.168.1.1 and the DNS search domain is “lan”. Since this is the very basics for my network connection there’s not much to play around here. These numbers work fine on other operating systems. I do not have SUSE firewall activated since there’s one set up in the router.
I tried to deactivate that firewall in my router completely without any effect. I tried to allow port forwarding without any effect.

Now for Linux network issues I tried to search online about NetworkManager and general connection problems which led me to


/etc/sysconfig/network/config
#
// in specific the lines about DNS handling:
#
# To disable the execution of a module, don't remove it from the list
# but prepend it with a minus sign, "-ntp-runtime".
#
NETCONFIG_MODULES_ORDER="-dns-resolver -dns-bind -dns-dnsmasq -nis -ntp-runtime"
#
#
# Defines the DNS merge policy as documented in netconfig(8) manual page.
# Set to "" to disable DNS configuration.
#
NETCONFIG_DNS_POLICY=""

Changing these as described had no positive effect either.
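To double-check what netconfig is currently told to do, the policy line can be read back from the file. A sketch, assuming the `NAME="value"` syntax shown above; `dns_policy_state` is a made-up helper name. Per the file's own comments, an empty value means DNS configuration is disabled.

```shell
# Hypothetical helper: read sysconfig-style text on stdin and report the
# state of NETCONFIG_DNS_POLICY ("unset", "disabled", or the value itself).
dns_policy_state() {
    awk -F'"' '
        /^NETCONFIG_DNS_POLICY=/ { value = $2; found = 1 }
        END {
            if (!found)           print "unset"
            else if (value == "") print "disabled"
            else                  print value
        }'
}

# Possible use (not run here):
#   dns_policy_state < /etc/sysconfig/network/config
```

With the settings quoted above, this would report "disabled", which is worth keeping in mind when later tests depend on netconfig regenerating /etc/resolv.conf.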
Now for httping. When I “ping” any IP, like 8.8.8.8 or 208.67.222.222 or whatever, packets are transmitted and received fine.
But when I run “httping” against these I just get a timeout.

Let’s take the Spotify case again. When I run “~> spotify” the initialization routine starts up and then the service needs to connect to its servers to render the application’s layouts and fill them with content. So it starts to render stuff:


I [ap_connection_impl.cpp:901             ] Connecting to AP sjc1-accesspoint-a18.ap.spotify.com:4070
I [ap_connection_impl.cpp:530             ] Connected to AP: 194.68.28.130:4070
I [MainView.cpp:6828                      ] Load complete (0) url: sp://somehashnumber.feed/index.html

// From here it synchronizes the stuff loaded and then goes on to load more:
I [ap_connection_impl.cpp:901             ] Connecting to AP sjc1-accesspoint-a26.ap.spotify.com:443
I [ap_connection_impl.cpp:530             ] Connected to AP: 194.68.29.36:443
E [ap_handler_impl.cpp:872                ] Connection error:      ap_ping_timeout

// From there it tries to reach other addresses with the same problem thus not able to render any more layout/content

As mentioned I can “ping” those fine from cli but when using “httping” it’ll error out:


~> httping 194.68.29.36:443
PING 194.68.28.130:4070 (/):
timeout while receiving reply-headers from host

Is any of this of any help? I’m getting desperate. This has to be some one liner in some linux network config file…
Thanks for your time!

P.S.: Don’t get me wrong, I really couldn’t care less for Spotify but it demonstrates the issue in the command line, whereas really important services like mail providers and cloud stuff won’t.

As mentioned I can “ping” those fine from cli but when using “httping” it’ll error out:

~> httping 194.68.29.36:443
PING 194.68.28.130:4070 (/):
timeout while receiving reply-headers from host

FWIW, I get the same response with this particular address, but a basic nmap scan did yield

 # nmap 194.68.29.36

Starting Nmap 6.47 ( http://nmap.org ) at 2016-09-27 15:56 NZDT
Nmap scan report for sjc1-accesspoint-a26.sjc1.spotify.com (194.68.29.36)
Host is up (0.18s latency).
Not shown: 998 filtered ports
PORT    STATE SERVICE
80/tcp  open  http
443/tcp open  https

Nmap done: 1 IP address (1 host up) scanned in 16.77 seconds

so I’m not sure that the httping response means anything particularly significant here.

The next step after trying httping is to telnet, connecting to port 80 and read the http headers to see if it’s an authentication issue instead of a connectivity issue.

It takes some knowledge to understand what is returned by a telnet to port 80, so you may need to post the results here for others to see.
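For reference, a minimal sketch of a well-formed request that could be sent instead of typing it into telnet by hand. HTTP/1.1 requires a `Host:` header, written with a colon, and a blank line to end the request; a server will commonly answer 400 if that header is missing or malformed. The function name `http_head_request` and the use of `nc` (netcat) to send the request are assumptions of this example.

```shell
# Hypothetical helper: print a minimal HTTP/1.1 HEAD request for the given
# host name, with CRLF line endings and the terminating blank line.
http_head_request() {
    printf 'HEAD / HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n' "$1"
}

# Possible use (not run here; assumes netcat is installed):
#   http_head_request yahoo.com | nc yahoo.com 80
```

`Connection: close` asks the server to hang up after answering, so the command returns instead of waiting on a keep-alive connection.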

You’re describing a number of possible web application architectures, it might be helpful to try to identify anything in common with what works and what doesn’t… ie

  • Using a particular authentication service. A website (or web service) might be its own authenticator, but it’s also common to use various Internet authentication services like Yahoo, Google, Facebook, Linked-in, more.

  • A web service typically exchanges data in XML or JSON format (structured text and tags) and commonly uses the same ports as web pages, typically 80 or 443. A web service is also typically <data only> and is not the same as a “web application”, which is a full application like email (Yahoo mail, gmail, outlook mail, etc). The actual application is likely running locally: it can be installed locally or could be a locally running web page (which might be served from a remote server).

  • What kind of Internet Gateway are you running? Is it a simple router perhaps with a firewall based on IP tables or is it a Proxy Server? If you don’t know, then you need to describe if you’re connecting from home or a business network (particularly a very large business network). And, do you have control over your Internet Gateway or is it managed by your Provider or a business IT team?

  • Are you running any kind of web filtering software (It’s sometimes also built into anti-virus which might be running locally or on your Internet Gateway)

TSU

Thanks guys for further pushing me. I won’t be able to write this short as I made several observations.

I can read a lot that “/etc/resolv.conf” should actually be a symlink to “/run/resolvconf/resolv.conf” but after moving the file and making the link, after the next reboot the folder “/run/resolvconf/” was deleted leaving me with NetworkManager unable to resolve anything i.e. wifi connected but unable to resolve to anywhere. Now there is a “/run/netconfig/NetworkManager.netconfig” but when I establish the before mentioned symlink to that file, again NM is unable to resolve anything. So right now I’m back to the netconfig autogenerated file in /etc/.
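Whether /etc/resolv.conf is a symlink or a regular generated file differs between distributions and network stacks, so advice written for one setup may not transfer to another. A quick, hedged way to see what a given machine has (`describe_resolv` is an illustrative name):

```shell
# Hypothetical helper: report whether a path (default /etc/resolv.conf) is a
# symlink, a regular file, or missing. Changes nothing on the system.
describe_resolv() {
    f=${1:-/etc/resolv.conf}
    if [ -L "$f" ]; then
        echo "symlink -> $(readlink "$f")"
    elif [ -f "$f" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}

# Possible use:
#   describe_resolv
```

Checking this before and after a reboot shows whether something regenerated or replaced the file in between.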

Next I’ll try to respond to some of your questions. (Note that I was able to get around the login issue on forums.opensuse.com [which is handled by login.microfocus.com] by removing the dynamically generated link when clicking on “Log In” on this forums site and logging in with valid information on the site [login.microfocus.com] directly. So now I’m writing from my OpenSUSE machine. Rather funny fact that when I wanted to reply with quote, AGAIN I apparently got some timeout leaving me with a frozen page, so I had to write this manually)
Edit: I couldn’t send this either, thank god I’m used to forums and wrote this in an editor first. Had to send the .txt to my laptop and paste it into a new reply from there…how sick is that?

What kind of Internet Gateway are you running? Is it a simple router perhaps with a firewall based on IP tables or is it a Proxy Server? […]

It’s a TP-Link Archer C7 v2 which I myself flashed OpenWrt onto. It’s basically a small linux distro for routers. The default firewall (which I mentioned I tried to run activated and deactivated with no differences in the behaviour on my linux machines [Fedora/OpenSUSE]) is indeed based on IP tables and I do not have any Proxies set up. This is my router and I do have access to it. Note that it’s set up and working with OSX, iOS and Windows.

Are you running any kind of web filtering software […]

I used to have an AdBlock installed on the router firmware which is now disabled since I’m on this OS, although it had no effects on my connection before that on the mentioned operating systems. The OpenSUSE install has no additional software installed apart from the provided ISO and the firewall has been disabled by me. This excludes anti-virus, anti-anything. There’s this AppArmor thing in YaST which I disabled as well.

[…] it might be helpful to try to identify anything in common with what works and what doesn’t […]

This is indeed tricky even knowing a decent part of web development. If it was for some specific authentication server it would not be of any help to me but I could at least provide that information here. Take codepen for instance. It seems to be more of a rendering issue caused by the timeouts. I can access www.codepen.io just fine. I cannot see rendered output from the individual pens though (this is without any account logged in!). The pen gets stuck outputting “Loading…”. Even more: On the contrary I CAN see rendered full page views. Codepen is using a variety of Frameworks, Servers etc. including e.g. Rails, Sinatra, Express, Node, Preprocessor Servers, VPN, NAT and whatnot…

The next step after trying httping is to telnet […]

All right here goes. I tried this with several sites that do work and that do not work for me. Although I don’t know if it’s of any help if I get a header from the main address when usually the timeout occurs on dynamically generated sub domains like “login.yahoo.com?..” until when there’s some data send event triggering. It’s usually user logins that don’t work but that’s why I’m mentioning the codepen example.

TELNET WITH FAILING SITES

HEAD / HTTP/1.1
Host yahoo.com

HTTP/1.1 400 Host Header Required
Date: Tue, 27 Sep 2016 17:42:41 GMT
Connection: keep-alive
Via: http/1.1 ir32.fp.gq1.yahoo.com (ApacheTrafficServer)
Server: ATS
Cache-Control: no-store
Content-Type: text/html
Content-Language: en
Content-Length: 6487

HEAD / HTTP/1.1
Host spotify.com

HTTP/1.1 400 Bad Request
Server: nginx
Date: Tue, 27 Sep 2016 17:45:17 GMT
Content-Type: text/html
Content-Length: 166
Connection: close

HEAD / HTTP/1.1
Host codepen.io

HTTP/1.1 400 Bad Request
Date: Tue, 27 Sep 2016 18:24:33 GMT
Content-Type: text/html
Content-Length: 177
Connection: close
Server: -nginx
CF-RAY: -

GNUTLS-CLI WITH FAILING SITE

HEAD / HTTP/1.1
Host login.microfocus.com
// This is the site handling opensuse forum accounts…

HTTP/1.1 400 Bad Request
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
Date: Tue, 27 Sep 2016 18:31:28 GMT
Connection: close
Strict-Transport-Security:max-age=86400
Set-Cookie: IPCZQX03a36c6c0a=; Domain=attachmategroup.com; Expires=Tue, 27-Sep-2016 18:31:39 GMT; Path=/
Set-Cookie: lb_login=EIODEBIJ; Path=/

*** Fatal error: The TLS connection was non-properly terminated.
*** Server has terminated the connection abnormally.

TELNET WITH WORKING SITES

HEAD / HTTP/1.1
Host google.com

HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.com.mx/?gfe_rd=cr&ei=w6_qV6eJDpTD8geCz6fgBw
Content-Length: 262
Date: Tue, 27 Sep 2016 17:43:31 GMT

HEAD / HTTP/1.1
Host youtube.com

HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.com.mx/?gfe_rd=cr&ei=frTqV-fKOJDD8gewjJfQCQ
Content-Length: 262
Date: Tue, 27 Sep 2016 18:03:42 GMT

HEAD / HTTP/1.1
Host facebook.com

HTTP/1.1 400 Bad Request
Content-Type: text/html; charset=utf-8
Date: Tue, 27 Sep 2016 18:06:53 GMT
Connection: close
Content-Length: 2959

HEAD / HTTP/1.1
Host stackoverflow.com

HTTP/1.1 400 Bad Request

Well these are my investigations as reaction to your responses. I hope they may be of any help on the issue. Thanks for staying with me! I’m going to have lunch after so much command line action… ¡Buen provecho!

In general,
400 errors mean that you are able to successfully contact and connect to the Server, but the Server rejected the request as malformed (a request for something that doesn’t exist would return 404 instead). It’s not likely a permissions issue (deny access instead), although sometimes something can be made to be completely invisible to the requestor.

So, a 400 error means you should not be looking at name resolution issues.
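Put differently, receiving any HTTP status line at all implies that name resolution and the TCP connection both worked. A small illustrative lookup (the function name `explain_status` is invented here, and the descriptions are informal summaries of the standard status code classes):

```shell
# Hypothetical helper: map an HTTP status code to a short, informal meaning.
explain_status() {
    case "$1" in
        2??)     echo "success" ;;
        3??)     echo "redirect" ;;
        400)     echo "bad request (server reached, request rejected as malformed)" ;;
        401|403) echo "access denied" ;;
        404)     echo "not found" ;;
        4??)     echo "other client error" ;;
        5??)     echo "server error" ;;
        *)       echo "not an HTTP status code" ;;
    esac
}

# Possible use:
#   explain_status 400
```

None of these outcomes points at DNS, since a response could only arrive after the host name had already been resolved and a connection made.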

For several sites, you also seem to be saying that part of the page is rendering. If that is the case, then for most web browsers (Microsoft Internet Explorer is the exception) you can inspect the run time behavior of individual components on the page by opening the “Dev Tools” (or whatever else the browser calls it) with the following keystroke combination

Ctrl+Shift+J

Once the Dev Tools console is open, reload the page.
You should then see any errors, be able to inspect specific parts of the page, more.

I would probably consider your case to be some kind of filtering problem.
Are you sure you don’t have some kind of no-scripts or a content filtering add-on/extension installed in your browser?
Try installing a new browser and see if you get different results (A newly installed web browser with default settings won’t be filtering and blocking)

TSU

Thank you for your patience TSU. I understand that this is sort of abstract and tough to guide from distance. I mentioned that this behaviour is true for ANY browser I have installed on Fedora, OpenSuse and recently Mint. Without any extensions or plugins installed. Fresh factory ISO.

I actually already tried to see if I could get any hints from dev tools since I’m used to them, but it just leads me to the same idea: my requests get timed out. I cannot even post to forums on Linux.

Now here is what I think: I mentioned I flashed the firmware of my router to a linux based mini-distro called OpenWRT. Fact is that my router has exactly the same networking files set up as most linux distros including dnsmasq and whatnot. My wife asked me with a very sarcastic voice: “So Linux and Linux don’t work together?” Hahaha…
There has to be some conflict or something. But I’ll again head to the firmware forums since I guess this is very specific to it.
One more question, though: do you have an idea what could be problematic in setting up two Linux systems as follows? One acts as AP, set to PPPoE and serving as DHCP server, connected to a bridged modem and providing WIFI.
The other one (that would be my linux workstations…) is set up as a client. What could conflict on either machine that I could change?

Mew, sorry man, that’s all I can think of right now. It’s late and I’m tired. Thanks again for your patience!

Hi
FWIW, I have a couple of routers here using dd-wrt (instead of open-wrt): one router is an AP extender, the other the backhaul to the main router via wireless. I use dhcp on a reduced range for automatic ip addresses; the rest of my systems have a static ip address and manually configured gateway and DNS (I use openDNS servers). I also have an Airport Extreme (bridged) connected via ethernet cable to the router to use the Gb ethernet ports at my desk, then off the main router, again via ethernet, I have a remote 100 Mb/s switch for the systems with slower interfaces…

Could you explain your setup again as you say “one as AP set to PPPoE serving as DHCP server, connected to a bridged modem and sending WIFI.”

So your internet comes into the AP (via wireless?), the bridge (via ethernet to the AP?) is via another modem (openWRT?) and this one is your local wireless AP?

Filtering and blocking can be done anywhere.
It can be as a browser extension
It can be as a separate application on your local machine.
It can be done at your Internet Gateway or any other critical node in your network.
It can be done on a proxy if you connect through a remote proxy outside your network.

TSU

Hi Malcolm,

It’s way simpler than that, sorry for the confusion. I just said “one as AP” trying to make a point that the router is basically a Linux machine.
The setup is like this (pretty common I think):

Street cable
|
ISP Modem (bridged)
|
LEDE (OpenWrt) Router TP-Link Archer C7 v2 (WAN set to PPPoE)
(Self-compiled trunk image from GitHub, dated Sep 29, including QoS and AdBlock modules; I already tried without those, without a difference)
|
Several clients (Windows 7,8 & 10, Android, iOS, OSX, Fedora, Mint)
*All clients work without the described issues except the Linux ones

Now of that I am indeed pretty aware. Thing is I’m unable to analyze if or where my router software (linux distro) is filtering/blocking some specific traffic sent by my linux clients. I already mentioned that this issue is true for FRESH ISO installs without any software or extensions. So if this was on the client side it would have to do with system configs! (Won’t you agree…?)

Hi
If you’re not using IPv6, disable it via YaST Network Settings on the first tab, and in Firefox search for ipv6 in about:config and disable it. It requires a restart.

Configure your DNS on the router to use openDNS, 208.67.222.222 and 208.67.220.220 (check their website for the filtered content ones if required). Then restart the network on your machine to pick up the new ones (or set them in Network Manager).

If you connect via ethernet (if possible) do you have the same issue?

A long shot but if wireless, there may be interference in your location from other AP’s compared to other locations in your locale. I use a tablet with wifi-analyzer to check out channels and interference.

Then it will be down to the wireless device, it’s not a broadcom or intel AC device?

Also, is the wireless network a mixture of speeds, e.g. N & G, G only, etc.?

Disabled v6 and set openDNS servers in router config under “WAN” interface under “advanced” tab (unchecked “Use DNS servers advertised by peer” and set the addresses). After that nslookup would still show my router (192.168.1.1) as Server, but I guess that’s just normal? (BTW the OpenDNS website is one of those that break when I click on one of their menus or buttons)

If you connect via ethernet (if possible) do you have the same issue?

Yes.

A long shot but if wireless, there may be interference in your location from other AP’s compared to other locations in your locale. I use a tablet with wifi-analyzer to check out channels and interference.

I’m living in a quite separated neighbourhood. My WIFI is the only signal present in my place apart from a wireless phone and two almost never connected microwave ovens. Apart from that I would say this would affect each and any device regardless of the OS, wouldn’t it?

Then it will be down to the wireless device, it’s not a broadcom or intel AC device?

It’s an Intel 7260. I actually found some interesting articles about the generic driver versions and whatnot. But even after placing the corresponding version into /lib/firmware I just eliminated some dmesg messages about the next best driver being loaded due to the original not being found.

Also, is the wireless network a mixture of speeds, e.g. N & G, G only, etc.?

My 5GHz radio is unconfigured (just off…) and my 2.4GHz radio is set to N. As I said, if it was for radio wave interference/speed/overlays or whatever, wouldn’t it affect any client?

I did some more cli on my own part but cannot see anything strange:


#> ifconfig wlp4s0
wlp4s0     Link encap:Ethernet  HWaddr 11:22:33:44:55:66
           inet addr:192.168.1.244  Bcast:192.168.1.255  Mask:255.255.255.0
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           Packets without errors...

#> ip route
default via 192.168.1.1 dev wlp4s0  proto static  metric 600
169.254.0.0/16 dev wlp4s0  scope link  metric 1000
192.168.1.0/24 dev wlp4s0  proto kernel  scope link  src 192.168.1.244  metric 600

#> dig icloud.com // one of many failing sites
HEADER returns with "status: NOERROR" and ANSWER SECTION shows 4 perfectly resolved IPs

#> traceroute -q1 icloud.com
Hops to 9 and from there only asterisks. This is true on a working client as well though.

#> dmesg | grep iwlwifi
Shows the infamous "Unsupported splx structure" warning which I found many articles about but absolutely no resolution for.
// Printed "Direct firmware load for iwlwifi-7260-17.ucode failed with error -2" before I renamed and added the latest working driver directly to /lib/firmware

Intel(R) Wireless WiFi driver for Linux
Copyright(c) 2003 - 2015 Intel Corporation
iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
iwlwifi 0000:04:00.0: Unsupported splx structure
iwlwifi 0000:04:00.0: loaded firmware version 16.242414.0 op_mode iwlmvm
usbcore: registered new interface driver btusb
Bluetooth: hci0: read Intel version: 3707100180012d0d25
Bluetooth: hci0: Intel device is already patched. patch num: 25
iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144
iwlwifi 0000:04:00.0: L1 Disabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Disabled - LTR Enabled

Actually I found some lines about how the Bluetooth and the WiFi don’t get along well on these. Gonna try to deactivate Bluetooth next and see if does any good. Any more output I could provide…?
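To record which firmware actually got loaded after such changes, the version can be pulled out of the kernel log. A sketch, assuming the "loaded firmware version" message format shown above; the function name `iwlwifi_fw_version` is made up for this example.

```shell
# Hypothetical helper: read kernel log text on stdin and print the most
# recently reported iwlwifi firmware version, if any.
iwlwifi_fw_version() {
    sed -n 's/.*iwlwifi.*loaded firmware version \([^ ]*\).*/\1/p' | tail -n 1
}

# Possible use (not run here; reading dmesg may need root on some systems):
#   dmesg | iwlwifi_fw_version
```

Noting the value before and after swapping firmware files or disabling Bluetooth makes it easier to tell which change, if any, had an effect.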

Hi
Seems like you have done exhaustive tests… ;)

Delete the /etc/resolv.conf file, reboot and try again; you should get the openDNS ones. If that doesn’t work, set the nameservers manually in /etc/resolv.conf, remove the 192.168.1.1 one, and see how that goes.

Another check, if you create a test user, login, open Firefox and try browsing to the affected sites, how does that go?

Also check on a windows or other client, what DNS servers are they seeing?

Forgot to mention,
After <every> test,
You always have to purge your name resolution cache, else your succeeding tests are polluted.
You can purge the cache by restarting the nscd service

systemctl restart nscd.service

TSU

I had the same issue under openSUSE Tumbleweed. In my case, it manifested in the Spotify ap_ping_timeout error. Spotify wouldn’t work on either the Web version or the desktop app.

My “solution”? Installing Ubuntu, which doesn’t show this issue. After seeing the great lengths you people went into debugging this, with no solution or even a cause identified, I bailed out.

It’s a shame, since I really like Tumbleweed and moreover Yast, but there is only so much ******** (no offense intended) I am willing to work out in order to have a working rolling-release distro.

I hope you can solve this problem, I’ll be watching this thread.