Kernel 4.10.x - setting regulatory domain broken?

Now, this is actually not a request for help, at least not for help in the sense of “how to fix that?” here.

A few days ago, I opened a bug report on openSUSE’s Bugzilla; no reaction so far.

https://bugzilla.opensuse.org/show_bug.cgi?id=1026937

As you can see, after some extra research I am pretty sure “I am not hunting a ghost”, but still it might be a (rare?) corner case.

Fortunately, Tumbleweed now also uses kernel 4.10.x and on my local machines this kernel is affected by that problem.

So if you are running Tumbleweed with kernel 4.10.0 or higher, please try to reproduce the problem as shown in the bug report above.

No matter if your system is affected (I expect it is), please report back here.

If more than just a few systems are affected (by percentage), then voting up that bug or adding your own comment to the bug report might be a good option for later (not yet; wait until this thread has at least a few reports).

So now it is up to you, the community, please take a few minutes of your time and try to give feedback.

If you use kernel 4.10 or higher on another version of openSUSE, feel free to report here as well. If you don’t use kernel 4.10 or higher and have some wireless problem, report it somewhere else and NOT here.

AK

Hi
I get 249 using any COUNTRY=XX, I also see no crda errors in the logs…

I’m Using the GNOME DE and Network Manager

Tumbleweed Version 20170227
Kernel: 4.10.1-1-default


Cheers Malcolm °¿° SUSE Knowledge Partner (Linux Counter #276890)
openSUSE Leap 42.1|GNOME 3.16.2|4.1.36-44-default
If you find this post helpful and are logged into the web interface,
please show your appreciation and click on the star below… Thanks!

Hi Malcolm,

Hm, same kernel version but I get these errors on 42.2 and 42.1, so maybe some other component is the root cause (udev, systemd … who knows).

Can you test this on a 42.x with kernel from Tumbleweed?

AK

Hi
Have a repo in mind to use for 42.2?

Just tested with 42.2 and the 4.4.49-16-default kernel, same 249 error,
same system (multiboot), no crda log entries anywhere…


03:00.0 Network controller [0280]: Qualcomm Atheros QCA9565 / AR9565 Wireless Network Adapter [168c:0036] (rev 01)
Subsystem: Dell Device [1028:020e]
Kernel driver in use: ath9k

I have a funny feeling I may not be able to duplicate…


Just use the kernel directly from Tumbleweed or get the latest one from Kernel:stable. If you only want to test the kernel, a manual install with “rpm -i” is the easiest way.

No surprises there, only kernels >= 4.10 are affected here (on 3 different systems with 42.1 or 42.2, but none with TW).


03:00.0 Network controller [0280]: Qualcomm Atheros QCA9565 / AR9565 Wireless Network Adapter [168c:0036] (rev 01)
Subsystem: Dell Device [1028:020e]
Kernel driver in use: ath9k

Same card on one of my systems (which is affected, runs 42.2).

AK

Hi
Also nothing on my HP 255 G4, it’s running broadcom-wl and
Tumbleweed (single boot system)…


OK, so it looks like Tumbleweed itself is not affected. Unless I have been hit (3 times in a row) by a (rare?) corner case, the cause has to be somewhere in user space.

I’m just trying to rebuild (locally) all related components I am comfortable touching, using sources from TW (crda, libnl3, wireless-regdb). If it is related to udev/systemd or even glibc, then tough luck: I am not going to touch that and will just live with 4.10.x not working on my systems.

However, reports from users running 4.10.x on older systems (42.1 or 42.2) are still welcome.

Another important point:

Testing does not depend on which wireless card you have. Actually, you do not need a wireless card at all, as this regulatory stuff is a function of the module cfg80211. So even if you don’t have a wireless card, or have one that uses some external driver not relying on cfg80211/mac80211, you can still do testing.

The only thing you might have to do is to load the module “cfg80211” manually before running the test.
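
A minimal sketch of that check, for anyone who prefers not to eyeball lsmod. The function name module_loaded and its optional second argument (a file in /proc/modules format, handy for trying it out without root) are made up for illustration:

```shell
#!/bin/sh
# Sketch: report whether a kernel module appears in a /proc/modules-style list.
# module_loaded and its optional file argument are illustrative names only.
module_loaded() {
    mod="$1"
    list="${2:-/proc/modules}"
    # Each line of /proc/modules starts with the module name followed by a space.
    grep -q "^${mod} " "$list"
}
```

As root, something like `module_loaded cfg80211 || modprobe cfg80211` would then load the module only when it is missing.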

AK

Sorry, cannot confirm on 42.2 with kernel:stable


LT_B:~ # uname -r
4.10.1-2.g561cf31-default
LT_B:~ # crda
COUNTRY environment variable not set.
LT_B:~ # COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249
LT_B:~ #

But I see no visible action on regdomain even if I have set WIRELESS_REGULATORY_DOMAIN=‘IT’ in /etc/sysconfig/network/config, so maybe I’m looking in the wrong place?

Well, unfortunately, it really seems I have hit a corner case on my systems (like the guy on redhat’s bugzilla).

Just tried a fresh install of 42.2 in a VM and no problems there, d.a.m.n.

First idea “maybe one of your own packages (and there are quite a few) causes this, so install all of them in little groups and see if/when it breaks”.

But no dice here, all “homebrew stuff” is now installed and still everything is fine, so I don’t even have a clue where to look next.

  • It’s obviously not the kernel

  • my own packages do not trigger it on a fresh install

and

  • I have no idea which configuration file/setting could be a candidate

Remember, this must be something that works up until 4.9.11 (last 4.9.x which was in kernel:stable) and suddenly breaks on 3 different machines with different versions of Leap (42.1 and 42.2).

Looks like many more hours of headbanging …

AK

P.S. If you want to see a result when changing regulatory domains, use “iw reg set IT ; iw reg get”. If that does not work, stop network services completely and try again. If that still does not work, kill wpa_supplicant manually and try again. If that still does not work, unload all wireless modules, reload them, and try again.
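
To make the “did it actually change?” check less error-prone, here is a small sketch that pulls the country code out of “iw reg get” output. It assumes the output contains a line of the form “country XX: ...” as seen in this thread; the function name current_regdom is just an illustration:

```shell
#!/bin/sh
# Sketch: read "iw reg get" style output on stdin and print the first
# country code found. Assumes lines look like "country IT: DFS-ETSI".
current_regdom() {
    awk '$1 == "country" { sub(/:$/, "", $2); print $2; exit }'
}
```

Typical use would be `iw reg set IT; iw reg get | current_regdom`, which should print IT if the change took effect and 00 if the domain is still unset.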

Please note: a different card on my system, so maybe it’s HW dependent:


LT_B:~ # lspci -nnk -s 04:00
04:00.0 Network controller [0280]: Qualcomm Atheros AR9485 Wireless Network Adapter [168c:0032] (rev 01)
    Subsystem: AzureWave Device [1a3b:2126]
    Kernel driver in use: ath9k
    Kernel modules: ath9k
LT_B:~ #

“iw reg set IT ; iw reg get” apparently works, but at boot the system is set to:


LT_B:~ # iw reg get
global
country 00: DFS-UNSET
...

instead of the “IT” setting I would expect with WIRELESS_REGULATORY_DOMAIN=‘IT’:


LT_B:~ # iw reg get
global
country IT: DFS-ETSI
...

No big deal, since this is a 2.4 GHz only card, but anyway something is not in good shape here too.

Unlikely, as written before this is a function of the “cfg80211” module.

Just a guess, but if you set this in /etc/sysconfig/network/config and use NetworkManager, this is most likely ignored.

Edit:

The really annoying fact is the error message I get.

“nl80211 not found”

WTH?

So it looks like crda “asks” the kernel “do you have nl80211 support?” and whatever answer it gets, the interpretation is “nope, no support in the kernel, too bad”.

If there were no nl80211 support in the kernel, there wouldn’t be any output by iw reg get (and no wireless depending on cfg80211/mac80211 at all), so the message itself is pretty ridiculous.

AK

Update:

For people trying to test this, could you please do the following when running the test:


COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249

This (as seen in this thread) is the expected behaviour.

“And now for something completely different”


modprobe l2tp_netlink

and now


 COUNTRY=DE crda ; echo $?
nl80211 not found.
251

My “corner case” error.

Now remove the module again.


modprobe l2tp_netlink -v -r
rmmod l2tp_netlink
rmmod l2tp_core
rmmod udp_tunnel
rmmod ip6_udp_tunnel

And finally


 COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249

back to normal again.
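
A side note on the two numbers: a shell exit status only keeps the low 8 bits of whatever the program returned, so 249 is what a return value of -7 looks like (matching the “-7” in the message), and 251 would correspond to -5. Whether crda deliberately returns -5 in the “nl80211 not found” case is my assumption from the arithmetic, not something shown in this thread. A small sketch of the conversion, with a made-up function name:

```shell
#!/bin/sh
# Sketch: map a shell exit status (0..255) back to the signed value a
# program most likely returned. signed_status is an illustrative name.
signed_status() {
    s="$1"
    if [ "$s" -gt 127 ]; then
        # Values above 127 are negative numbers truncated to 8 bits.
        echo $((s - 256))
    else
        echo "$s"
    fi
}
```

So `COUNTRY=DE crda; signed_status $?` would print -7 in the expected case and -5 in my corner case.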

AK

Here is mine, but will not help you:

linux-977p:~ # COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249
linux-977p:~ # modprobe l2tp_netlink
linux-977p:~ # COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249
linux-977p:~ # modprobe l2tp_netlink -v -r
rmmod l2tp_netlink
rmmod l2tp_core
rmmod udp_tunnel
rmmod ip6_udp_tunnel
linux-977p:~ # COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249
linux-977p:~ # 

It’s even a little more complicated.

Loading order of modules seems to be important.

I was hit by this because of a dependency of a NWM-package which pulled in “xl2tpd”. That package contains a file in “/usr/lib/modules-load.d” which autoloads the respective l2tp modules and even pulls them into the initial ramdisk.

As a result, those modules were loaded before cfg80211 and all the other modules needed by my wireless devices, and this triggers the strange problem.
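
If you want to check whether something similar is set up on your system, a sketch like the following lists the autoload files that mention a given module name. The function name find_autoload_entries is made up, and I am only assuming the files end in “.conf”; the exact file name shipped by xl2tpd is not something I have noted here:

```shell
#!/bin/sh
# Sketch: list *.conf files under a modules-load.d style directory that
# mention a pattern. find_autoload_entries is an illustrative name.
find_autoload_entries() {
    dir="$1"
    pattern="$2"
    [ -d "$dir" ] || return 0
    # grep -l prints only the names of files with at least one match.
    grep -l -- "$pattern" "$dir"/*.conf 2>/dev/null || true
}
```

For example, `find_autoload_entries /usr/lib/modules-load.d l2tp` (and the same for /etc/modules-load.d) should show which file, if any, pulls in the l2tp modules at boot.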

So for a real test, you have to make sure all of your wireless modules are unloaded before loading “l2tp_netlink”.

I have an Intel Wireless on this machine, so I have to unload “iwldvm”, please adjust this to your hardware.


modprobe -rv iwldvm
rmmod iwldvm
rmmod mac80211
rmmod iwlwifi
rmmod cfg80211

lsmod | grep -E '80211|l2tp'

When loading cfg80211 first and then l2tp_netlink, everything is fine.


modprobe -v cfg80211
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/wireless/cfg80211.ko

modprobe -v l2tp_netlink
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/ipv4/udp_tunnel.ko 
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/ipv6/ip6_udp_tunnel.ko 
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/l2tp/l2tp_core.ko 
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/l2tp/l2tp_netlink.ko 

COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249

and you can change your regulatory domain via “iw reg set $COUNTRYCODE”.


iw reg set US
iw reg get
global
country US: DFS-FCC
        (2402 - 2472 @ 40), (N/A, 30), (N/A)
        (5170 - 5250 @ 80), (N/A, 23), (N/A), AUTO-BW
        (5250 - 5330 @ 80), (N/A, 23), (0 ms), DFS, AUTO-BW
        (5490 - 5730 @ 160), (N/A, 23), (0 ms), DFS
        (5735 - 5835 @ 80), (N/A, 30), (N/A)
        (57240 - 63720 @ 2160), (N/A, 40), (N/A)

And now let’s reverse the order, but first unload all modules again:

modprobe -rv l2tp_netlink
rmmod l2tp_netlink
rmmod l2tp_core
rmmod udp_tunnel
rmmod ip6_udp_tunnel

modprobe -rv cfg80211
rmmod cfg80211

So now, first load l2tp_netlink and then cfg80211:


modprobe -v l2tp_netlink
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/ipv4/udp_tunnel.ko 
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/ipv6/ip6_udp_tunnel.ko 
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/l2tp/l2tp_core.ko 
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/l2tp/l2tp_netlink.ko 
modprobe -v cfg80211
insmod /lib/modules/4.10.0-1.g81ace5a-default/kernel/net/wireless/cfg80211.ko

And finally, repeat the test:

COUNTRY=DE crda ; echo $?
nl80211 not found.
251

Now remove the l2tp_netlink module, load your respective wireless driver and you should be up and running again.
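
For anyone who wants to check the order on a running system rather than reproduce it by hand, here is a rough sketch. It assumes /proc/modules lists the most recently loaded module first (I believe it does, but treat that as an assumption), so a module that appears further down the list was loaded earlier. The function name loaded_before is made up for illustration:

```shell
#!/bin/sh
# Sketch: succeed if module $1 was loaded before module $2.
# ASSUMPTION: the list (default /proc/modules) is ordered newest first.
loaded_before() {
    first="$1"
    second="$2"
    list="${3:-/proc/modules}"
    # Record the line number on which each module name appears.
    a=$(awk -v m="$first" '$1 == m { print NR; exit }' "$list")
    b=$(awk -v m="$second" '$1 == m { print NR; exit }' "$list")
    # Both must be present, and the earlier-loaded one sits lower in the list.
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -gt "$b" ]
}
```

Something like `loaded_before l2tp_netlink cfg80211 && echo "bad order"` would then flag the situation described above.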

AK

P.S. Sorry, my friend from the Sauerland, I was a little too slow there.

linux-977p:~ # lsmod | grep -i iw
iwlmvm                393216  0 
mac80211              835584  1 iwlmvm
iwlwifi               184320  1 iwlmvm
cfg80211              651264  3 iwlmvm,iwlwifi,mac80211
linux-977p:~ # modprobe -rv iwlmvm
rmmod iwlmvm
rmmod mac80211
rmmod iwlwifi
rmmod cfg80211
linux-977p:~ # lsmod | grep -E '80211|l2tp'
linux-977p:~ # modprobe -v cfg80211
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/wireless/cfg80211.ko 
linux-977p:~ # 
linux-977p:~ # modprobe -v l2tp_netlink
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/ipv4/udp_tunnel.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/ipv6/ip6_udp_tunnel.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/l2tp/l2tp_core.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/l2tp/l2tp_netlink.ko 
linux-977p:~ # COUNTRY=DE crda ; echo $?
Failed to set regulatory domain: -7
249
linux-977p:~ # iw reg set US
linux-977p:~ # iw reg get
global
country US: DFS-FCC
        (2402 - 2472 @ 40), (N/A, 30), (N/A)
        (5170 - 5250 @ 80), (N/A, 17), (N/A), AUTO-BW
        (5250 - 5330 @ 80), (N/A, 23), (0 ms), DFS, AUTO-BW
        (5490 - 5730 @ 160), (N/A, 23), (0 ms), DFS
        (5735 - 5835 @ 80), (N/A, 30), (N/A)
        (57240 - 63720 @ 2160), (N/A, 40), (N/A)

linux-977p:~ # 

And here, reverse as you say:

linux-977p:~ # modprobe -v l2tp_netlink
linux-977p:~ # modprobe -rv l2tp_netlink
rmmod l2tp_netlink
rmmod l2tp_core
rmmod udp_tunnel
rmmod ip6_udp_tunnel
linux-977p:~ # modprobe -rv cfg80211
rmmod cfg80211
linux-977p:~ # modprobe -v l2tp_netlink
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/ipv4/udp_tunnel.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/ipv6/ip6_udp_tunnel.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/l2tp/l2tp_core.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/l2tp/l2tp_netlink.ko 
linux-977p:~ # modprobe -v cfg80211
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/wireless/cfg80211.ko 
linux-977p:~ # COUNTRY=DE crda ; echo $?
nl80211 not found.
251

QED…

Thanks man, you just proved that I am not insane … (or at least not THAT insane or better not THAT insane YET…yeah, whatever :-)).

So after all, this really seems to be a bug in the kernel.

But before changing my bug report yet another time, I am calling for more people to test.

AK

I forgot to add my system:

linux-977p:~ # uname -a
Linux linux-977p 4.10.1-2.g561cf31-default #1 SMP PREEMPT Mon Feb 27 13:06:34 UTC 2017 (561cf31) x86_64 x86_64 x86_64 GNU/Linux
linux-977p:~ # lsb-release -a
LSB Version:    n/a
Distributor ID: openSUSE project
Description:    openSUSE Leap 42.2
Release:        42.2
Codename:       n/a
linux-977p:~ # /sbin/lspci -nnk | grep -iA3 net
01:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller [10ec:8168] (rev 15)
        Subsystem: Lenovo Device [17aa:3833]
        Kernel driver in use: r8169
        Kernel modules: r8169
02:00.0 Network controller [0280]: Intel Corporation Intel Dual Band Wireless-AC 3165 Plus Bluetooth [8086:3166] (rev 99)
        Subsystem: Intel Corporation Device [8086:4210]
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi
linux-977p:~ # 

Confirming your finding, Axel. This is with an Atheros card, confirming that it doesn’t seem HW dependent.
Skipping the annoying part, here is what matters:


LT_B:~ # modprobe -v l2tp_netlink
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/ipv4/udp_tunnel.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/ipv6/ip6_udp_tunnel.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/l2tp/l2tp_core.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/l2tp/l2tp_netlink.ko 
LT_B:~ # modprobe -v cfg80211
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/wireless/cfg80211.ko 
LT_B:~ # COUNTRY=DE crda ; echo $?
nl80211 not found.
251
LT_B:~ # iw reg set IT
LT_B:~ # iw reg get
global
country 00: DFS-UNSET
...
LT_B:~ #
LT_B:~ # modprobe -rv l2tp_netlink
rmmod l2tp_netlink
rmmod l2tp_core
rmmod udp_tunnel
rmmod ip6_udp_tunnel
LT_B:~ # modprobe -v ath9k
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/net/mac80211/mac80211.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/drivers/net/wireless/ath/ath.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/drivers/net/wireless/ath/ath9k/ath9k_hw.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/drivers/net/wireless/ath/ath9k/ath9k_common.ko 
insmod /lib/modules/4.10.1-2.g561cf31-default/kernel/drivers/net/wireless/ath/ath9k/ath9k.ko btcoex_enable=1 bt_ant_diversity=1 ps_enable=1
LT_B:~ # iw reg set IT
LT_B:~ # iw reg get
global
country IT: DFS-ETSI
...
LT_B:~ #

FYI, with this Atheros chip the only practical way to permanently set the regdomain at boot seems to be a boot option like “cfg80211.ieee80211_regdom=IT” or equivalent.
All other sysconfigs I tried seem to be ignored by NetworkManager, regdomain being read from the chip EEPROM and changed only by the “iw reg set” command.
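
To confirm that such a boot option actually reached the kernel, a sketch along these lines could be used. It just scans a kernel command line for the parameter mentioned above; the function name regdom_from_cmdline is hypothetical, and passing the command line as an argument is only there so it can be tried without rebooting:

```shell
#!/bin/sh
# Sketch: print the value of cfg80211.ieee80211_regdom from a kernel
# command line (argument 1, or /proc/cmdline when no argument is given).
regdom_from_cmdline() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    for arg in $cmdline; do
        case "$arg" in
            cfg80211.ieee80211_regdom=*)
                # Strip everything up to and including the first "=".
                echo "${arg#*=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

After a reboot, `regdom_from_cmdline` should print IT, and “iw reg get” can then be compared against it.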

Yup, this correlates with my observations.

NWM seems to ignore most of the settings in /etc/sysconfig/network/config, some NETCONFIG variables are honoured but that’s about it (and again, I don’t think this is hardware specific).

AK