Firewall-cmd unable to populate large ipset

I had a large ipset that blocked most of the internet from knocking on my host. It does not work any more.

Creating the ipset by adding IP nets to it country by country used to work nicely. The last time it worked nicely was in August (kernel 5.13.2-1-default, Tumbleweed distribution update done 2021-07-24).

I did a Tumbleweed distribution upgrade on 2021-08-21 (kernel 5.13.8-1-default). Since then, adding IP nets to the ipset country by country slows down. In the beginning, entries are inserted into the ipset from the by-country net files at a reasonable speed, but every additional country file makes adding entries dramatically slower. It is so slow that it does not finish within a whole week!

Now I tried to import all nets from a single file. In practice that fails, too.

firewall-cmd --permanent --new-ipset=deny_public --type=hash:net --option=family=inet --option=hashsize=16384 --option=maxelem=208516
firewall-cmd --permanent --ipset=deny_public --add-entries-from-file=./blacklist_nets_sorted

The latter command sits there for hours and hours, solidly eating 100% of a single CPU core.

Firewalld used to use the iptables backend. I decided to give the libnftables backend a try, but to no avail. Is there a way to see what firewall-cmd is doing while it consumes CPU?
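One general way to see where a Python program spends its userland CPU time is the standard-library profiler. The sketch below only illustrates the technique: `busy_work` is a hypothetical stand-in workload, not firewalld code, and `profile_call` is a helper name made up for this example.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical stand-in workload (NOT firewalld code)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, report.getvalue()


if __name__ == "__main__":
    _, text = profile_call(busy_work, 200_000)
    print(text)
```

The same profiler can also be started from the command line, for example `python3 -m cProfile -s cumulative /usr/bin/firewall-cmd --permanent --ipset=deny_public --add-entries-from-file=./blacklist_nets_sorted`. This assumes firewall-cmd is installed as a Python script at that path.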

$ zypper info firewalld libnftables1 python3-firewall ipset
Loading repository data...
Reading installed packages...


Information for package firewalld:
----------------------------------
Repository     : Main Repository (OSS)
Name           : firewalld
Version        : 1.0.1-1.1
Arch           : noarch
Vendor         : openSUSE
Installed Size : 453.7 KiB
Installed      : Yes
Status         : up-to-date
Source package : firewalld-1.0.1-1.1.src
Summary        : A firewall daemon with D-Bus interface providing a dynamic firewall
Description    :
    firewalld is a firewall service daemon that provides a dynamic customizable
    firewall with a D-Bus interface.


Information for package libnftables1:
-------------------------------------
Repository     : Main Repository (OSS)
Name           : libnftables1
Version        : 1.0.0-1.2
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 811.9 KiB
Installed      : Yes (automatically)
Status         : up-to-date
Source package : nftables-1.0.0-1.2.src
Summary        : nftables firewalling command interface
Description    :
    libnftables is the nftables command line interface placed into a
    library.


Information for package python3-firewall:
-----------------------------------------
Repository     : Main Repository (OSS)
Name           : python3-firewall
Version        : 1.0.1-1.1
Arch           : noarch
Vendor         : openSUSE
Installed Size : 2.0 MiB
Installed      : Yes (automatically)
Status         : up-to-date
Source package : firewalld-1.0.1-1.1.src
Summary        : Python3 bindings for FirewallD
Description    :
    The python3 bindings for firewalld.


Information for package ipset:
------------------------------
Repository     : Main Repository (OSS)
Name           : ipset
Version        : 7.15-1.2
Arch           : x86_64
Vendor         : openSUSE
Installed Size : 28.3 KiB
Installed      : Yes (automatically)
Status         : up-to-date
Source package : ipset-7.15-1.2.src
Summary        : Netfilter ipset administration utility
Description    :
    IP sets are a framework inside the Linux kernel, which can be
    administered by the ipset utility. Depending on the type, currently
    an IP set may store IP addresses, (TCP/UDP) port numbers or IP
    addresses with MAC addresses in a way, which ensures lightning speed
    when matching an entry against a set.

    ipset can:
    * store multiple IP addresses or port numbers and match against the
      collection by iptables in one swoop;
    * dynamically update iptables rules against IP addresses or ports
      without performance penalty;
    * express complex IP address and ports based rulesets with one single
      iptables rule and benefit from the speed of IP sets

Maybe this Red Hat Bug Report – <https://bugzilla.redhat.com/show_bug.cgi?id=1908127#c11>

  • Please check the auditd logging.

Also, this GitHub firewalld issue – <https://github.com/firewalld/firewalld/issues/738>

Bottom line: either set firewalld to use the iptables backend, or create the large dataset by means of nft.
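As a rough illustration of the nft route, the sketch below renders a list of IPv4 networks as an nftables set definition. The table name `blocklist`, the set name `deny_public` and the helper `render_nft_set` are assumptions made up for this example, and this is not how firewalld itself builds its sets.

```python
def render_nft_set(networks, table="blocklist", set_name="deny_public"):
    """Return nftables ruleset text defining an interval set of IPv4 networks.

    The table and set names are hypothetical placeholders.
    """
    networks = list(networks)
    if not networks:
        raise ValueError("at least one network is required")
    elements = ",\n".join(f"            {net}" for net in networks)
    return (
        f"table inet {table} {{\n"
        f"    set {set_name} {{\n"
        "        type ipv4_addr\n"
        # 'flags interval' is what allows CIDR ranges as set elements.
        "        flags interval\n"
        f"        elements = {{\n{elements}\n        }}\n"
        "    }\n"
        "}\n"
    )


if __name__ == "__main__":
    print(render_nft_set(["10.0.0.0/8", "192.0.2.0/24"]), end="")
```

The rendered text could be saved to a file and loaded with `nft -f`. A rule that references the set (for example `ip saddr @deny_public drop` in an input chain) would still be needed to block anything. Neither step is shown or tested here.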

I checked audit.log. There are not many NETFILTER_CFG lines at all.

I had the iptables backend in use before. Now it is nftables. Both behave similarly, i.e. creating a large ipset is slow.

For the IPv4 ipset I created a mitigation. I merged all blocked networks together so that 208518 entries were reduced to 19062 entries. firewall-cmd takes about 78 minutes to add those entries to the ipset. While firewall-cmd runs, almost all of its CPU cycles are spent in userland. No other processes are using a significant amount of CPU at that time.

I wrote this perl script to mitigate the slow creation of large ipsets. It merges IP nets into larger blocks. The Net::CIDR package is used for input validation and Net::CIDR::Lite does the merging. It reads nets in CIDR format from stdin and prints the results to stdout.

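#!/usr/bin/perl -w
# Merge CIDR networks read from stdin into the smallest covering set.
use strict;
use Net::CIDR::Lite;
use Net::CIDR ':all';

my $cidr = Net::CIDR::Lite->new;
while (<>) {
        chomp;
        # cidrvalidate returns the validated CIDR, or undef if it is invalid
        my $net = Net::CIDR::cidrvalidate ($_);
        if ($net) {
                $cidr->add ($net);
        }
        else {
                print STDERR "$_: Invalid CIDR\n";
                exit 1;
        }
}
# list() returns the merged networks in CIDR notation
print join ("\n", $cidr->list()) . "\n";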
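For readers without the Perl modules at hand, the same kind of merging can be sketched with Python's standard `ipaddress` module. This is an illustrative alternative, not the script used above. The function name `merge_networks` is made up for this example, and the addresses in the demo are documentation and private ranges, not real blocklist data.

```python
import ipaddress


def merge_networks(lines):
    """Parse CIDR strings and return the merged networks as strings.

    Raises ValueError on invalid input (including set host bits).
    """
    nets = [ipaddress.ip_network(line.strip()) for line in lines if line.strip()]
    merged = []
    # collapse_addresses() cannot mix IPv4 and IPv6, so handle each family separately.
    for version in (4, 6):
        family = [n for n in nets if n.version == version]
        merged.extend(ipaddress.collapse_addresses(family))
    return [str(n) for n in merged]


if __name__ == "__main__":
    sample = ["10.0.0.0/25", "10.0.0.128/25", "10.0.1.0/24", "192.0.2.0/24"]
    for net in merge_networks(sample):
        print(net)
```

Passing `sys.stdin` instead of the sample list would turn this into a stdin-to-stdout filter like the Perl version.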

Could it matter for CPU usage which python3 version is used as the firewall-cmd interpreter? Mine is now python3.8.

I did some more research to see how much time adding different numbers of entries to a new ipset takes. Here are the results. They show firewall-cmd execution time growing much faster than linearly: above about 1000 entries, doubling the number of entries roughly quadruples the run time, which is quadratic rather than truly exponential growth. Almost all of the execution time is spent in userland. The ipset file itself (under /etc/firewalld/ipsets) is created quickly at the end of the entry-adding process.

|Number of entries added|Execution time in seconds|
|---|---|
|1|0.288749|
|2|0.338041|
|4|0.329535|
|8|0.334876|
|16|0.350141|
|32|0.380960|
|64|0.373696|
|128|0.592653|
|256|1.24076|
|512|3.82732|
|1024|13.8303|
|2048|58.0955|
|4096|219.704|
|8192|865.891|
|16384|3604.74|
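The growth rate in these measurements can be checked with a little arithmetic. If time scales as n^k, then k is about log(t2/t1) / log(n2/n1) for consecutive rows. The sketch below applies that to the numbers copied from the table. The helper name `local_exponents` is made up for this example.

```python
import math

# (entries, seconds) pairs copied from the table above.
MEASUREMENTS = [
    (1, 0.288749), (2, 0.338041), (4, 0.329535), (8, 0.334876),
    (16, 0.350141), (32, 0.380960), (64, 0.373696), (128, 0.592653),
    (256, 1.24076), (512, 3.82732), (1024, 13.8303), (2048, 58.0955),
    (4096, 219.704), (8192, 865.891), (16384, 3604.74),
]


def local_exponents(points):
    """Return [(n2, k)] where k estimates the exponent in t ~ n**k
    between each pair of consecutive measurements."""
    result = []
    for (n1, t1), (n2, t2) in zip(points, points[1:]):
        k = math.log(t2 / t1) / math.log(n2 / n1)
        result.append((n2, k))
    return result


if __name__ == "__main__":
    for n, k in local_exponents(MEASUREMENTS):
        print(f"{n:>6} entries: exponent ~ {k:.2f}")
```

For the small sets the fixed start-up cost dominates, so the estimate is only meaningful for the larger rows. An exponent near 2 there would point to work that grows with the square of the number of entries.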

How do I investigate or debug this issue further?

  1. Code inspection.
  2. Module testing.
  3. Code analysis and performance testing.
  4. Code architecture analysis and modelling.
  5. Unit testing.
  6. Based on the results of the investigations, redesign …

Looking at the issue at hand, may I suggest that, a team of 10 “normal” programmers be assigned to the task – if “top performers” are available, possibly only 5 will be needed …

I reported this issue upstream as https://github.com/firewalld/firewalld/issues/881.

To compare how fast 209500 IPv4 routes can be aggregated and an ipset xml file created, I wrote a perl filter script that reads routes from stdin and writes xml to stdout. It takes just 17 seconds to create that xml file with the aggregated routes (ca. 19000 routes). I estimate firewall-cmd needs 10 to 11 days for the same task without pre-aggregating the routes.

#!/usr/bin/perl -w
#
# Script for testing how fast an ipset xml file can be created
# Reads CIDR IPv4 or IPv6 routes from stdin
# Writes aggregated routes as an ipset xml file to stdout

use strict;
use XML::Writer;
use Net::CIDR::Lite;
use Net::CIDR ':all';

(our $prog = $0) =~ s,.*/,,;

my $cidr = Net::CIDR::Lite->new;
while (<>) {
        chomp;
        my $t = Net::CIDR::cidrvalidate ($_);
        if ($t) {
                $cidr->add ($t);
        }
        else {
                print STDERR "$prog: $_: Invalid CIDR\n";
                exit 1;
        }
}
# Aggregate once and reuse the list for both maxelem and the entries
my @nets = $cidr->list();
my $xml = XML::Writer->new (OUTPUT => "self", DATA_MODE => 1, DATA_INDENT => 2, NAMESPACES => 1);
$xml->xmlDecl ("utf-8");
$xml->startTag ("ipset", "type" => "hash:net");
$xml->emptyTag ("option", "name" => "family", "value" => "inet");
$xml->emptyTag ("option", "name" => "hashsize", "value" => "16384");
# maxelem is the entry count; $#{...} would give the last index, one too few
$xml->emptyTag ("option", "name" => "maxelem", "value" => scalar (@nets));
foreach my $i (@nets) {
        $xml->dataElement ("entry", $i);
}
$xml->endTag ("ipset");
$xml->end;

print $xml;

#EOF

This ipset creation has now been improved a lot with a fix done upstream. It has also flowed down to Tumbleweed.

For me, firewall-cmd now takes 330 seconds to create an ipset from 19062 IPv4 networks and 1985 seconds to create an ipset from 46562 IPv6 networks.