Nmap Development mailing list archives
Improving address exclusion matching
From: Daniel Miller <bonsaiviking () gmail com>
Date: Thu, 1 Nov 2018 00:35:30 -0500
Back in 2012, David posted an interesting message to this list: https://seclists.org/nmap-dev/2012/q4/420 He suggested that our existing method for checking a candidate address against a list of excluded addresses was slower than it should be, and that it was noticeably affecting some users' scan times. I've been thinking about this off and on since then (6 years doesn't seem all that long at this age :)), and I recently had a breakthrough that I was able to code up in just a couple of days. The previous system works by turning each exclusion specification provided via --exclude or --excludefile into an array of bit vectors, using 256 bits to represent all possible values for one octet of an address. Then each candidate address is checked against each specification linearly until it matches (the candidate is excluded) or all specifications have been checked. The new system uses a radix tree, or "trie," data structure, where each node represents an address prefix. Comparison is done using bitwise operations to mask off the less-significant portion of the address and compare only the relevant portion. When a prefix matches, the next bit of the candidate address after the prefix determines which branch of the tree to descend. Some branches are dead-ends: the candidate doesn't match and can be scanned. Others are always-match nodes, caused when an exclusion specification had a CIDR-style netmask like 192.168.0.0/24. In all, a worst-case maximum of 128 "comparisons" is done for any given IPv6 address (32 for IPv4), no matter how many excluded addresses are provided. This sounds great, but here are the real numbers on performance improvement: If you're excluding fewer than 1000 addresses or netmasks, you probably won't notice. I ran some tests using an exclude list of 1000 /24 networks and a target list of 65536 random IP addresses. The new matching algorithm was about 40% faster, but the old one didn't take much more than 1 second to read and check those addresses. $ /usr/bin/time ./nmap-new -iL 65k.ips -n -sL --excludefile exclude.txt | tail 0.53user 0.07system 0:00.61elapsed 99%CPU (0avgtext+0avgdata 12168maxresident)k 0inputs+0outputs (0major+23145minor)pagefaults 0swaps <snip> Nmap done: 65533 IP addresses (0 hosts up) scanned in 0.62 seconds $ /usr/bin/time ./nmap-old -iL 65k.ips -n -sL --excludefile exclude.txt | tail 0.91user 0.11system 0:01.03elapsed 99%CPU (0avgtext+0avgdata 12220maxresident)k 0inputs+0outputs (0major+23150minor)pagefaults 0swaps <snip> Nmap done: 65533 IP addresses (0 hosts up) scanned in 1.03 seconds I did another more exhaustive scan, using 10000 individual random IP addresses as an exclude list and "scanning" a /16 network (again, 65536 targets). This time the difference was far more noticeable: the new trie matching still only took about 0.6 seconds, even with only 77% CPU usage, but the old one took nearly 6 seconds! $ /usr/bin/time ./nmap-new 199.141.0.0/16 -n -sL --excludefile exclude.txt | tail 0.30user 0.16system 0:00.60elapsed 77%CPU (0avgtext+0avgdata 13372maxresident)k 0inputs+0outputs (0major+23406minor)pagefaults 0swaps <snip> Nmap done: 65531 IP addresses (0 hosts up) scanned in 0.61 seconds $ /usr/bin/time ./nmap-old 199.141.0.0/16 -n -sL --excludefile exclude.txt | tail 5.60user 0.16system 0:05.84elapsed 98%CPU (0avgtext+0avgdata 13584maxresident)k 0inputs+0outputs (0major+23485minor)pagefaults 0swaps <snip> Nmap done: 65531 IP addresses (0 hosts up) scanned in 5.84 seconds Memory use is pretty close to the same for both, with a slight benefit to the new method. One compatibility note: the new trie matching is not used with IPv4 octet ranges or wildcards. To maintain compatibility, those exclude specifications are still matched using the old system, but I imagine most users doing large scans with many exclusions are using CIDR notation instead. The changes were made in r37496 - 37498, with additional cleanup of headers and interfaces following in r37499 - 37501. Happy scanning! Dan
_______________________________________________ Sent through the dev mailing list https://nmap.org/mailman/listinfo/dev Archived at http://seclists.org/nmap-dev/
Current thread:
- Improving address exclusion matching Daniel Miller (Oct 31)
- Re: Improving address exclusion matching David Fifield (Nov 03)