Making WiFi fast

By Jonathan Corbet
November 8, 2016
Linux Plumbers Conference
Dave Täht has been working to save the Internet for the last six years (at least). Recently, his focus has been on improving the performance of networking over WiFi — performance that has been disappointing for as long as anybody can remember. The good news, as related in his 2016 Linux Plumbers Conference talk, is that WiFi can be fixed, and the fixes aren't even all that hard to do. Users with the right hardware and a willingness to run experimental software can have fast WiFi now, and it should be available for the rest of us before too long.

Networking, Täht said, has been going wrong for over a decade; it turns out that queuing theory has not properly addressed the problem of matching data rates to the bandwidth that the hardware can provide. Developers have tended to optimize for the fastest rates possible, but those rates are rarely seen in the real world when WiFi is involved. The "make WiFi fast" effort, involving a number of developers, seeks to change the focus and to optimize both throughput and latency at all data rates.

He has been working on the bufferbloat problem for the last six years. Hundreds of people have been involved in this effort, which was spearheaded by the Linux networking stack. Many changes were merged, starting with byte queue limits in 3.3 and culminating (so far) with the BBR congestion-control algorithm, which was merged for 4.9. At this point, all network protocols can be debloated, with the exception of WiFi and LTE. But, he said, a big dent has just been made in the WiFi problem.

For the rest of the talk, Täht enlisted the aid of Ham the mechanical monkey. Ham, it seems, works in the marketing department. He only cares about benchmarks; if the numbers are big, they will help to sell products. Ham has been the nemesis for years, driving the focus in the wrong direction. The right place to focus is on use cases, where the costs of bufferbloat are felt. That means paying much more attention to latency, and focusing less on the throughput numbers that make Ham happy.

As an example, he noted that the Slashdot home page can, when latency is near zero, be loaded in about eight seconds (the LWN page, he said, was too small to make an interesting example). If the Flent tool is used to add one second of latency to the link, that load takes nearly four minutes. We have all been in that painful place at one point or another. The point is that latency and round-trip times matter more than absolute throughput.
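The two measurements above are consistent with a simple linear model in which page-load time grows by one added delay per serialized round trip. The sketch below is only that model, not anything presented in the talk; it takes "nearly four minutes" as 240 seconds, and the function names are made up for illustration.

```python
def implied_round_trips(base_s, loaded_s, added_latency_s):
    """Serialized round trips implied by two page-load measurements."""
    return (loaded_s - base_s) / added_latency_s


def estimated_load_time(base_s, round_trips, added_latency_s):
    """Linear estimate: base time plus one added delay per round trip."""
    return base_s + round_trips * added_latency_s


# 8 s with near-zero latency, roughly 240 s with one second added.
trips = implied_round_trips(8.0, 240.0, 1.0)

# Under the same (assumed) model, 100 ms of added latency would cost
# about 23 extra seconds on top of the 8-second baseline.
with_100ms = estimated_load_time(8.0, trips, 0.1)
```

Under this reading, a page that needs a couple of hundred dependent round trips is dominated by latency long before link bandwidth becomes the limit.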

Unfortunately, the worst latency-causing bufferbloat is often found on high-rate connections deep within the Internet service provider's infrastructure. That, he said, should be fixed first, and WiFi will start to get better for free. But that is only the start. WiFi need not always be slow; its problems are mostly to be found in its queuing, not in external factors like radio interference. The key is eliminating bufferbloat from the WiFi subsystem.

To get there, Täht and his collaborators had to start by developing a better set of benchmarks to show what is going on in real-world situations. The most useful tool, he said, is Flent, which is able to do repeatable tests under network load and show the results in graphical form. Single-number benchmark results are not particularly helpful; one needs to look at performance over time to see what is really going on. It is also necessary to get out of the testing lab and test in the field, in situations with lots of stations on the net.

What they found was that the multiple-station case is where things fall down in the WiFi stack. If you have a single device on a WiFi network, things will work reasonably well. But as soon as there is contention for air time, the problems show up.

How to improve WiFi

The WiFi stack in current kernels has four major layers of interest, when it comes to queuing:

  • At the top, the queuing discipline accepts packets and feeds them into the driver layer. The amount of buffering there is huge; it can hold ten seconds of WiFi data.

  • The mac80211 layer does high-level WiFi work, and adds some queuing and latency of its own.

  • The driver for the WiFi adapter maintains several queues of its own, perhaps holding several seconds of data. This level is where aggregation is done; aggregation groups a set of packets into a single transmitted frame to improve throughput — at the cost of increased latency.

  • The firmware in the adapter itself can hold another ten seconds of data in its queues.

That adds up to a lot of queuing in the WiFi subsystem, with all of the associated problems. The good news is that fixing it required no changes to the WiFi protocols at all. So those fixes can be applied to existing networks and existing adapters.
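To see why a fixed amount of buffering hurts so much more at low rates, it helps to convert a queue depth into the time needed to drain it. The following is a back-of-the-envelope sketch; the 1000-packet queue and the two link rates are illustrative assumptions, not figures from the talk.

```python
def queue_delay_s(queued_bytes, rate_bits_per_s):
    """Time needed to drain a full queue at a given link rate."""
    return queued_bytes * 8 / rate_bits_per_s


# Assumed example: 1000 packets of 1500 bytes each.
backlog = 1000 * 1500

slow_link = queue_delay_s(backlog, 1_000_000)      # 1 Mbit/s
fast_link = queue_delay_s(backlog, 100_000_000)    # 100 Mbit/s
```

The same backlog that drains in a fraction of a second on a fast link takes many seconds on a slow one, which is why queues sized for the peak rate behave so badly once the actual WiFi rate drops.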

The first step was to add a "mac80211 intermediate queue" that handles all packets for a given device, reducing the amount of queuing overall, especially since the size of this queue is strictly limited. It is meant to hold no more data than can be sent in two "transmission opportunities" (slots in which an aggregate of packets can be transmitted). The fq_codel queue management algorithm was generalized to work well in this setting.
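A limit expressed in transmission opportunities translates into a byte budget that scales with the link rate. The sketch below shows that arithmetic only; the 4 ms opportunity length and the function name are assumptions chosen for illustration, and the real code may size the queue differently.

```python
def txop_budget_bytes(rate_bps, txop_us, txops=2):
    """Bytes that fit into `txops` transmission opportunities at a given rate.

    Integer arithmetic is used so the result is exact.
    """
    return rate_bps * txop_us * txops // (8 * 1_000_000)


# Hypothetical 4 ms transmission opportunity at two link rates.
slow = txop_budget_bytes(6_000_000, 4_000)
fast = txop_budget_bytes(300_000_000, 4_000)
```

Because the budget is tied to time rather than to a packet count, a slow station is allowed far less queued data than a fast one, which keeps the queuing delay roughly constant across rates.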

The queuing discipline layer was removed entirely, eliminating a massive amount of buffering. Instead, there is a simple per-station queue, and round-robin fair queuing between the stations. The goal is to have one aggregated frame in the hardware for transmission, and another one queued, ready to go as soon as the hardware gets to it. Only having two packets queued at this layer may not scale to the very highest data rates, he said, but, in the real world, nobody ever sees those rates anyway.
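The per-station queues with round-robin service described above can be sketched in a few lines. This is a toy model under stated assumptions: the class and method names are hypothetical, an "aggregate" is simply a capped list of packets, and nothing here reflects the actual mac80211 data structures.

```python
from collections import OrderedDict, deque


class StationScheduler:
    """Toy per-station queues served in round-robin order."""

    def __init__(self):
        self.queues = OrderedDict()  # station name -> deque of packets

    def enqueue(self, station, packet):
        self.queues.setdefault(station, deque()).append(packet)

    def next_aggregate(self, max_packets):
        """Return (station, packets) for the next station with data, or None."""
        for station in list(self.queues):
            queue = self.queues[station]
            # Each station visited goes to the back of the rotation.
            self.queues.move_to_end(station)
            if queue:
                count = min(max_packets, len(queue))
                return station, [queue.popleft() for _ in range(count)]
        return None


sched = StationScheduler()
for pkt in ("a1", "a2", "a3"):
    sched.enqueue("A", pkt)
sched.enqueue("B", "b1")
first = sched.next_aggregate(2)
second = sched.next_aggregate(2)
```

Station B gets its turn after a single aggregate from station A, even though A still has data waiting; with one shared FIFO, B's packet would sit behind everything A had queued.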

There should be a single aggregate under preparation in the mac80211 layer; all other packets should be managed in the (short) per-station queues. In current kernels, mac80211 pushes packets into the low-level driver, where they may accumulate. In the new model, instead, the driver calls back into the mac80211 layer when it needs another packet; that gives mac80211 a better view into when transmission actually happens. The total latency imposed by buffering in this scheme is, he said, limited to 2-12ms, and there is no need for intelligence in the network hardware.
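The difference between the push and pull models can be illustrated with a small simulation in which the driver only asks the upper layer for a packet when a hardware slot is free. All names here are invented for the sketch, and the two-slot limit is an assumption mirroring the "one in flight, one queued" goal; this is not the kernel's driver API.

```python
from collections import deque


class UpperLayer:
    """Stands in for the layer that owns the queues."""

    def __init__(self, packets):
        self.queue = deque(packets)

    def dequeue(self):
        return self.queue.popleft() if self.queue else None


class PullDriver:
    """Hypothetical driver that pulls a packet only when a slot is free."""

    def __init__(self, upper, hw_slots=2):
        self.upper = upper
        self.hw_slots = hw_slots
        self.in_hardware = deque()

    def fill(self):
        while len(self.in_hardware) < self.hw_slots:
            packet = self.upper.dequeue()
            if packet is None:
                break
            self.in_hardware.append(packet)

    def tx_complete(self):
        """Called when the hardware finishes a transmission."""
        sent = self.in_hardware.popleft()
        self.fill()  # pull a replacement only now
        return sent


upper = UpperLayer(range(5))
driver = PullDriver(upper)
driver.fill()
waiting_upstairs = len(upper.queue)
```

Since packets stay in the upper layer until the last moment, that layer can still reorder, drop, or reschedule them, and it learns when transmissions actually complete.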

Results and future directions

The result of all this work is WiFi latencies that are less than 40ms, down from a peak of 1-2 seconds before they started, and much better handling of multiple stations running at full rate. Before the changes, a test involving 100 flows all starting together collapsed entirely, with at most five flows getting going; all the rest failed due to TCP timeouts caused by excessive buffering latency. Afterward, all 100 could start and run with reasonable latency and bandwidth. All this work, in the end, comes down to a patch that removes a net 200 lines of code.

There are some open issues, of course. The elimination of the queuing discipline layer took away a number of useful network statistics. Some of these have been replaced with information in the debugfs filesystem. There is, he said, some sort of unfortunate interaction with TCP small queues; Eric Dumazet has some ideas for fixing this problem, which only arises in single-station tests. There is an opportunity to add better air-time fairness to keep slow stations from using too much transmission time. Some future improvements, he said, might come at a cost: latency improvements might reduce the peak bandwidth slightly. But latency is what almost all users actually care about, so that bandwidth will not be missed — except by Ham the monkey.

At this point, the ath9k WiFi driver fully supports these changes; the code can be found in the LEDE repository and daily snapshots. Work is progressing on the ath10k driver; it is nearly done. Other drivers have not yet been changed. Expanding the work may well require some more thought on the driver API within the kernel but, for the most part, the changes are not huge.

WiFi is, Täht said, the only wireless technology that is fully under our control. We should be taking more advantage of that control to make it work as well as it possibly can; he wishes that there were more developers working in this area. Even a relatively small group has been able to make some significant progress in making WiFi work as it should, though; we will all be the beneficiaries of this work in the coming years.

[Your editor thanks LWN subscribers for supporting his travel to LPC.]

Index entries for this article
Kernel: Networking/Wireless
Conference: Linux Plumbers Conference/2016



Making WiFi fast

Posted Nov 8, 2016 22:38 UTC (Tue) by Sesse (subscriber, #53779) [Link] (5 responses)

All of this would be a lot nicer if just all drivers supported mac80211… In particular, the Intel drivers (iwlwifi) don't, and there doesn't seem to be a lot of movement towards it. Sure, many Linux-based APs run ath9k and/or ath10k, but we need this both ways.

Making WiFi fast

Posted Nov 8, 2016 23:45 UTC (Tue) by ay (guest, #79347) [Link] (4 responses)

iwlwifi is and has always been a mac80211 driver...

Making WiFi fast

Posted Nov 8, 2016 23:50 UTC (Tue) by Sesse (subscriber, #53779) [Link] (3 responses)

Well, for one, I can't use minstrel on it?

Making WiFi fast

Posted Nov 10, 2016 3:01 UTC (Thu) by drag (guest, #31333) [Link]

Maybe it's a hardware or firmware limitation.

Making WiFi fast

Posted Nov 11, 2016 0:01 UTC (Fri) by tohojo (subscriber, #86756) [Link] (1 responses)

A driver can opt in to using minstrel, or it can do its own rate control. Iwl does the latter (not sure if it's in the driver or in firmware). Not sure what it would take to get the Intel drivers to use these changes, but I don't think it's trivial, unfortunately (my laptop also has an Intel card in it).

Making WiFi fast

Posted Nov 16, 2016 20:06 UTC (Wed) by mtaht (guest, #11087) [Link]

Let me point out a link to a talk that describes the enormous technical debt that needs to be paid down in just one (other) wifi driver set in order to make forward progress:

https://www.linuxplumbersconf.org/2016/ocw//system/presen...

These sorts of issues are almost universal in wifi, in every chipset and OS.

Making WiFi fast

Posted Nov 9, 2016 1:08 UTC (Wed) by mtaht (guest, #11087) [Link] (3 responses)

A really excellent summary, Jon. Thanks. I sent a couple nits via email.

Most of the patches are queued up for 4.9 and 4.10 already, and more testing of all that is very desirable.

The latest 2 patchsets (for airtime fairness) for lede - not yet submitted there - are here: https://kau.toke.dk/git/lede/

An unofficial patchset that applies on top of net-next (for ath9k and ath10k), and .deb files for ubuntu are here: http://www.taht.net/~d/airtime-8/

Once those are straightened out and submitted...

Further work on top of these *will* compromise some measures of bandwidth, in favor of latency. In this talk, I felt it was very important to get more wifi developers focused on better benchmarks, before we ran the gamut of getting them upstream.

Work has paused for a bit - Toke is working on an interim thesis defense, and I'm out banging a drum for more funding - and addressing the TSQ issue is next on the agenda - but hopefully we'll be done with these soon, and be able to start exploring possibilities in other wifi chipsets with hopefully those driver experts getting excited about this, also.

Making WiFi fast

Posted Nov 9, 2016 1:09 UTC (Wed) by mtaht (guest, #11087) [Link]

Making WiFi fast

Posted Nov 17, 2016 9:55 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

To try and keep Ham happy ... how about a "contended bandwidth benchmark"?

Pick a standard number of devices typical in the home - let's say 5. And let's say our wifi router has a max bandwidth of 100Mb/s.

Now fire up 5 devices, clobber the access point, and see the total throughput. If each device manages 10Mb/s, that means the router is achieving 50% of theoretical maximum throughput.

I know it's pretty much the same benchmark as they currently use, except they use one device so they achieve 100% of theoretical max, and they quote the throughput.

If we can go to Ham and say "in a typical home, with 5 devices, half your throughput is unusable", that's a benchmark he won't like becoming public ... :-)

Cheers,
Wol
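The "contended bandwidth benchmark" proposed in the comment above boils down to one ratio: aggregate throughput under contention divided by the advertised maximum. A minimal sketch, using the numbers from the comment (the function name is made up for illustration):

```python
def contended_efficiency(per_device_mbps, max_mbps):
    """Fraction of the advertised maximum achieved under contention."""
    return sum(per_device_mbps) / max_mbps


# Five devices at 10 Mb/s each against a 100 Mb/s access point.
efficiency = contended_efficiency([10, 10, 10, 10, 10], 100)
```

A single number like this still hides latency, so it would complement rather than replace the over-time graphs that the talk argues for.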

Making WiFi fast

Posted Dec 28, 2016 17:48 UTC (Wed) by mtaht (guest, #11087) [Link]

This is overlong, but does do a "typical family" evaluation, as you suggest.

http://caia.swin.edu.au/reports/161107A/CAIA-TR-161107A.pdf

Making WiFi fast

Posted Nov 9, 2016 8:10 UTC (Wed) by Felix.Braun (guest, #3032) [Link] (16 responses)

Maybe WiFi can work in American houses built mainly out of wood with tons of space around each one. But in my experience a set of concrete walls will dampen the WiFi signal significantly even within a relatively small apartment. And if you live in a house with 15 other families then the 2.4 GHz band will become pretty crowded really fast. So it's Ethernet for me.

Making WiFi fast

Posted Nov 9, 2016 9:08 UTC (Wed) by Sesse (subscriber, #53779) [Link] (8 responses)

Uhm, the solution to 2.4 GHz getting crowded is to… use 5 GHz. There's hardly any reason to use 2.4 GHz anymore, and 802.11ac doesn't even support it.

Making WiFi fast

Posted Nov 9, 2016 17:55 UTC (Wed) by spaetz (guest, #32870) [Link] (2 responses)

> There's hardly any reason to use 2.4 GHz anymore, and 802.11ac doesn't even support it.

Except that the HP laptop I just bought only supports 2.4GHz. Looking for 802.11n and Linux compatibility, I failed to notice that a 500€ machine does not do 5GHz nowadays. It is a shame, really.

Making WiFi fast

Posted Nov 10, 2016 17:23 UTC (Thu) by kamil (subscriber, #3802) [Link]

Check if the WiFi in your laptop is upgradable. It's often on a separate mini-PCIe card that is trivial to replace with little more than a screwdriver, and the cards can often be found for under $/€20 on eBay and such.

Making WiFi fast

Posted Nov 11, 2016 0:08 UTC (Fri) by cesarb (subscriber, #6266) [Link]

> Looking for 802.11n and Linux compatibility, I failed to notice that a 500€ machine does not do 5GHz nowadays.

There's a solution for that now: look for 802.11ac. Since 802.11ac is 5 GHz only, its presence means that the WiFi adapter can do 5 GHz.

Once 802.11ac becomes more popular, it should reduce the annoying tendency of offering professional-grade laptops with only 2.4 GHz WiFi.

Making WiFi fast

Posted Nov 10, 2016 3:04 UTC (Thu) by drag (guest, #31333) [Link] (4 responses)

2.4ghz is nice if you need longer distance. Different frequencies have different strengths.

Making WiFi fast

Posted Nov 10, 2016 8:33 UTC (Thu) by Sesse (subscriber, #53779) [Link] (3 responses)

They really don't. In empty space, they fade _exactly_ the same way (it's physics). And like I said in another comment, for most obstructions, they fade very similarly, too. The effects of “the band is seven times as wide” (really!) and “ambient noise tends to be about 3 dB lower on 5 GHz” drown out these considerations in practice.

Making WiFi fast

Posted Nov 16, 2016 18:50 UTC (Wed) by mb (subscriber, #50428) [Link]

>They really don't. In empty space, they fade _exactly_ the same way (it's physics).

Except that my house consists of a little bit more matter than empty space and that the 5GHz signal certainly is a lot weaker than the 2.4 GHz signal after it passed a few walls.

Making WiFi fast

Posted Nov 18, 2016 7:11 UTC (Fri) by Sertorius (guest, #47862) [Link] (1 responses)

If you're going to make comments like that, please make sure you actually *know* the physics. The relevant equation in this case, the Friis path loss equation, has a lambda squared on the top, or if you prefer f squared on the bottom. So yes, path loss is significantly lower at lower frequencies; this is the reason that satellites use the lower of a pair of frequencies to transmit (because they are power-constrained); likewise frequency-division duplex phones will use the lower frequency channel for the uplink (again, power-constrained). This is also the best-case scenario; usually the path loss exponent is higher than 2 due to multipath fading (due to reflections).

Here's a graph if you aren't convinced.

5 GHz is severely attenuated by relatively mild obstructions (such as gyprock/drywall or timber) that 2.4 penetrates very easily. If you have concrete or brick walls, you'll want an AP in every room.

The main benefits of 5 GHz are that you have a lot more non-overlapping channels, so it is easier to avoid interference - it is also good if you have a lot of users to support and want to have a LOT of short-range APs.
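The frequency dependence mentioned in the comment above can be checked numerically. The sketch below uses the standard free-space path loss form of the Friis equation for isotropic antennas, 20·log10(4πdf/c); it ignores walls, antenna gain, and multipath, so it is the best-case figure only. The function name and the 10 m distance are illustrative choices.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def fspl_db(distance_m, freq_hz):
    """Free-space path loss in dB between isotropic antennas."""
    return 20 * math.log10(
        4 * math.pi * distance_m * freq_hz / SPEED_OF_LIGHT_M_S
    )


loss_24 = fspl_db(10, 2.4e9)
loss_50 = fspl_db(10, 5.0e9)

# The distance cancels out of the difference, leaving 20*log10(5.0/2.4).
extra_loss_db = loss_50 - loss_24
```

The difference works out to a little over 6 dB at any distance, before any obstruction is taken into account.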

Making WiFi fast

Posted Nov 18, 2016 8:34 UTC (Fri) by zlynx (guest, #2285) [Link]

The 5GHz fade is also a great thing in apartment buildings. All of your neighbors have a WiFi router that they picked up somewhere and they managed to make it through the configuration wizard. So every single apartment is at 100% transmission power, all of the time. Having the walls cut that down (and having all of the extra channels) really helps.

Making WiFi fast

Posted Nov 9, 2016 9:36 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

Rapid 5GHz signal fading is a GOOD thing. You won't get a lot of noise from neighbors. So just install a couple of WiFi repeaters in rooms with bad reception.

Making WiFi fast

Posted Nov 9, 2016 12:26 UTC (Wed) by Sesse (subscriber, #53779) [Link] (5 responses)

5 GHz actually fades pretty similarly to 2.4 GHz for most materials (red brick being a notable counterexample). But in many jurisdictions, it starts out with a small (3 dB) penalty in allowed transmission strength.

Making WiFi fast

Posted Nov 9, 2016 19:18 UTC (Wed) by jonth (subscriber, #4008) [Link] (4 responses)

A small nitpick: fading!=attenuation.

One other observation I'd like to share: 20 years of experience in cellular comms has taught me that you can't beat a wire.

Making WiFi fast

Posted Nov 10, 2016 1:39 UTC (Thu) by samroberts (subscriber, #46749) [Link] (3 responses)

Unless your goal is to decrease wiring!

Making WiFi fast

Posted Nov 11, 2016 0:13 UTC (Fri) by cesarb (subscriber, #6266) [Link]

Or your wire is broken. Wireless is much harder to cut.

(A coworker just found out that the Ethernet wire to one of the WiFi APs at work was broken, which explains network issues they were having.)

Making WiFi fast

Posted Nov 14, 2016 6:13 UTC (Mon) by eduard.munteanu (guest, #66641) [Link] (1 responses)

In practice, though, WiFi is often added on to the premises as an afterthought. Whoever set up the space didn't plan properly for networking, so WiFi gets used as a stop-gap measure. As with all last resort measures, it kinda sucks, not necessarily because there's something inherently wrong with WiFi.

Making WiFi fast

Posted Mar 9, 2019 1:29 UTC (Sat) by gdt (subscriber, #6284) [Link]

A reminder that wired connections also have downsides, mostly inconvenience and cost.

A good RJ-45 jack is rated for 2,500 cycles. So wireless is a much better fit for high-traffic areas such as cafes and libraries. Patch leads are a small but ongoing expense, and staff and students don't like "BYO patch lead".

A wired port costs around $200 per wallport to cable. But this can blow out when a custom solution is required. Wiring a cafe table will cost more than the table.

Wireless networks work without any further action by the user. Once set up (which is far too hard) Eduroam connects your laptop or phone to the campus network the moment you go to use the device. No searching for a jack and patch lead. Wireless is so convenient that it's common to see a person sitting next to a wall port but using wireless.

Wired from modern devices is difficult. Using wired ethernet from a phone or tablet requires special cabling (an OTG cable) to the ethernet dongle. The dongle itself is an optional purchase. Cheaper dongles meant for laptops might not have driver support in a phone. Using wired ethernet from a recent laptop requires a USB-C/ethernet dongle, which means the laptop can't be powered whilst using the wired network. To have both power and wired networking requires a bulky and expensive "docking station".

We should be telling people who need network performance to use wired. But that may not end up being the bulk of the connections on a campus network.

Making WiFi fast

Posted Nov 9, 2016 14:13 UTC (Wed) by sourcejedi (guest, #45153) [Link] (12 responses)

> there is no need for intelligence in the network hardware

and the wheel turns again. (It's making me think of one or two great posts about this point in general, cpu v.s. offloads, which I can't find rn).

> It is meant to hold no more data than can be sent in two "transmission opportunities" (slots in which an aggregate of packets can be transmitted). The fq_codel queue management algorithm was generalized to work well in this setting.

> The goal is to have one aggregated frame in the hardware for transmission, and another one queued, ready to go as soon as the hardware gets to it. Only having two packets queued at this layer may not scale to the very highest data rates, he said, but, in the real world, nobody ever sees those rates anyway.

> There should be a single aggregate under preparation in the mac80211 layer; all other packets should be managed in the (short) per-station queues.

This doesn't seem quite clear.

What controls the length of the per-station queues? You could read this as saying the per-station queues are limited to an aggregate's worth overall, but I'm not sure that's right.

I assume fq_codel is being applied to the per-station queues, that's the only way I can understand, but this doesn't read like that to me.

Ah, merged code says codel applies a (currently hardcoded) 20ms target. So I assume that's what sizes the per-station queues.

Maybe "The fq_codel queue management algorithm was generalized to work well in this setting." would be better put after "all other packets should be managed in the (short) per-station queues".

(also the next step in this effort is to go from round-robin of the station queues, to airtime fairness. yet more awesome)
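The gap between round-robin and airtime fairness shows up with simple arithmetic. The sketch below assumes each station sends the same number of bytes per turn and ignores aggregation, retries, and protocol overhead; the rates and the function name are illustrative, not measurements from the talk.

```python
def airtime_shares(rates_mbps, bytes_per_turn=1500):
    """Fraction of air time each station uses when every turn carries
    the same number of bytes."""
    times = [bytes_per_turn * 8 / (rate * 1e6) for rate in rates_mbps]
    total = sum(times)
    return [t / total for t in times]


# One slow station (6 Mbit/s) sharing the air with a fast one (300 Mbit/s).
slow_share, fast_share = airtime_shares([6, 300])
```

Under these assumptions the slow station occupies about 98% of the air time, which is the behavior an airtime-fair scheduler is meant to correct by equalizing transmission time instead of turns.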

Making WiFi fast

Posted Nov 9, 2016 18:04 UTC (Wed) by zlynx (guest, #2285) [Link] (2 responses)

I think you'd want to test quite a bit before trying for airtime fairness. I think the interaction of codel with the two frame limit will result in fairness even with round-robin. Because the slow devices will be getting a much more limited packet queue in software, because they take longer to transmit, the faster devices will have more packets in queue.

I am *guessing* that devices will get airtime fairness "for free."

Making WiFi fast

Posted Nov 10, 2016 2:46 UTC (Thu) by mtaht (guest, #11087) [Link] (1 responses)

while we are working on various modifications to fq_codel to make it more robust across a wide range of rates and numbers of stations, the airtime fairness patches we have currently do indeed behave better than what we call the fq-mac version without an explicit modification to codel.

I certainly welcome more testers (see the links to patches I posted earlier), ideas, and data. In addition to the data on the slides there, we have a large paper on the airtime fairness stuff pending academic review, which I can provide privately if you would like to see it.

Making WiFi fast

Posted Sep 6, 2019 21:58 UTC (Fri) by mtaht (guest, #11087) [Link]

That paper was ultimately published as "Ending the anomaly" - the capstone to solving a 16+ year old problem in wifi that nobody, until us, had figured out how to solve.

https://www.usenix.org/system/files/conference/atc17/atc1...

Making WiFi fast

Posted Nov 10, 2016 0:18 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (5 responses)

> there is no need for intelligence in the network hardware

and the wheel turns again. (It's making me think of one or two great posts about this point in general, cpu v.s. offloads, which I can't find rn).

It sounds, though, as if this is almost the opposite of the traditional wheel. The traditional wheel is driven by increased complexity requiring an offload processor to take load off the CPU, followed by bringing that same level of complexity back into the CPU to save money when processing power gets cheaper. In this case, though, the process reduces complexity to the point the offload processor is redundant.

Making WiFi fast

Posted Nov 10, 2016 3:15 UTC (Thu) by drag (guest, #31333) [Link] (4 responses)

The trend has always been towards sucking as much functionality out of the computer and into the processor die as possible and out of hardware logic and into the software as much as possible.

It's a cost performance thing.. as in cost and performance and reliability improves the dumber the hardware gets and the faster the cpu gets. This is generally speaking, of course. The deal here is software is much more flexible, much cheaper, and is much easier to patch to correct bugs.

That's one of the really big take-home points about Moore's law.

It's true for everything in computing, not just networking. Phone modems had their guts ripped out and became winmodems. I hated winmodems until I learned how to change the software drivers in Linux to get different algorithms and bump up my connection speeds. Then it moved to sound cards and into network and into harddrive controllers, and now things like software raid are superior for most purposes over hardware raid. Even now there isn't really any such thing as '3D acceleration' anymore, instead you just have different types of processor cores that are optimized to graphics workloads with most of the logic in the 'drivers'.

The problem with networking is that we deal with such small MTU sizes that _sometimes_ you can get better performance by offloading some of it. But for most server purposes turning off all the 'offload features' on network cards isn't a bad idea.

Making WiFi fast

Posted Nov 10, 2016 8:38 UTC (Thu) by Sesse (subscriber, #53779) [Link] (3 responses)

While I agree with most of your point, there really is something as 3D acceleration. Even the most modern of GPUs will have a triangle rasterizer, a texture mapper and a framebuffer blend unit, all of them large fixed-function blocks (well, instantiated lots of times). This is _not_ done by the more CPU-like units (the shader cores), even though they certainly are flexible these days.

You can imagine moving all of these functions up into software, but it doesn't seem to work all that well in practice (witness e.g. Larrabee).

Fixed-function hardware

Posted Nov 10, 2016 10:04 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

It's worth noting that the original Larrabee design was intended to be a pure software system on a massively parallel chip; by the time they got as far as cancelling Larrabee the GPU in favour of Knights Ferry, they'd had to add traditional fixed-function samplers to ensure that the GPU design would be competitive. Similar applies to CPUs - in some senses, the Cell Broadband Engine SPUs are what you get if you replace a fixed-function L2 cache controller with a software-controlled L2 cache, while weak memory models are what you get if you make software responsible for cache coherency only.

In general, it looks like there's a (movable) happy medium between hardware and software; where the hardware's function is well-understood, and unlikely to change in the next decade (texture samplers, cache controllers, Ethernet checksum handling etc), then it's best as fixed-function hardware. Where there's still debate about what the function should be (not just how fast you can make it), then it's best as programmable hardware (TCP offloads, graphics shaders etc) under the control of software.

Fixed-function hardware

Posted Nov 16, 2016 19:47 UTC (Wed) by mtaht (guest, #11087) [Link] (1 responses)

The core reason for offloading into the hardware firmware is that wifi needs to do certain things under very hard realtime constraints that the Linux kernel cannot meet. In other words, it's latency, once again, driving the need for intelligence "down there". From a signal processing perspective, we care about nanoseconds - and there are like 400+ DSPs on a modern 802.11ac chip. Up from there, the core wifi standards need sub-10us response times for many operations.

What we showed was that at the higher levels of the wifi stack - at the txop level - linux is more than responsive enough to fare well at the 500+us latency range, and we can put a lot more intelligence there, that can make a huge difference in actual network behavior.

I have outlined on my blog multiple ways in which even smarter firmware could do better than what we do today, which is shifting more stuff back into the core processor instead of onto the onboard firmware.

As well as multiple ways to do more smart things in the core linux networking layer, building on top of this work. If made more universal, we can also make a dent in several other nagging problems in wifi, like better routing metrics.

Many of the trials, travails, missteps, and other bugs we've had to fix along the way are in my blog and/or discussed on the make-wifi-fast list.

http://blog.cerowrt.org/post/

There is so much more that can be done to improve wifi! The best document we have on all that, is here:

https://docs.google.com/document/d/1Se36svYE1Uzpppe1HWnEy...

Fixed-function hardware

Posted Nov 16, 2016 20:09 UTC (Wed) by mtaht (guest, #11087) [Link]

As a counter-example of how better onboard firmware could cut observed latencies down below what we can achieve by moving more ops into the kernel, see:

http://blog.cerowrt.org/post/a_look_back_at_cerowrt_wifi/

Some chipsets already expose a per-station concept, in particular.

Making WiFi fast

Posted Nov 10, 2016 2:53 UTC (Thu) by mtaht (guest, #11087) [Link] (1 responses)

I'd sent a few nits regarding this section of the article to jon earlier, as the description is unclear. Let me get to that in another post.

If I said "there is no need for intelligence in the network hardware" I did not mean that. There is plenty of room for more intelligence there, just no need (for non-mu-mimo) for more than 2 txops of queuing in the onboard memory, or firmware, or driver.

It is kind of my hope actually that by reducing the max queuing in the wifi chip that we can fit more code into the firmware, like keeping better statistics or putting in better rate control information, or doing saner things with interrupts, or presenting a better API, etc.

Making WiFi fast

Posted Nov 10, 2016 8:58 UTC (Thu) by sourcejedi (guest, #45153) [Link]

I could have quoted more

> The goal is to have one aggregated frame in the hardware for transmission

- that's all I was really thinking about. If you tell this to original hardware designers, one would expect them to be somewhat surprised. (Or maybe I'm wrong: they'd tell you how ill-suited the network stacks they targeted were for wireless, and it's about time).

Making WiFi fast

Posted Nov 10, 2016 3:11 UTC (Thu) by mtaht (guest, #11087) [Link]

The current 20ms target in the mainline merged wifi-fq_codel code is an artifact of a number of other performance problems we were having at the time - notably powersave was broken by the patch set for a long while.

So... we backed off from the more aggressive 5ms default. Most of our recent testing has been against the original 5ms target with pretty good results. We also reverted back to the quantum 1514 default, rather than 300, as that hurt us on cpu on small platforms.

So, after more stuff lands it is my hope that we will revert these two changes back to the theoretically more correct 5% of 100ms that the target represents. Also, codel's taking place 2-10ms behind the actual packet delivery, presently.

In the talk I suggested shrinking txops more explicitly under contention. There is also the idea of turning codel's drop scheduler off at either a "good sized" aggregate, or an aggregate close to the ideal size for a given station, to move the knee of the curve closer to full utilization, where currently it turns off at a single large packet outstanding. We've also discussed dynamically modifying the target via ewma based on the workload and other common delays in the system (contention and interference).
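The EWMA idea above can be sketched roughly as follows. This is only an illustration, not the merged code: the function names, the 1/8 smoothing weight, and the policy of adding the smoothed delay to the base target and clamping between the 5ms and 20ms values mentioned above are all assumptions.

```python
def ewma(prev, sample, weight=0.125):
    """Exponentially weighted moving average: nudge `prev` toward `sample`."""
    return (1.0 - weight) * prev + weight * sample


def adapt_target(base_target_ms, smoothed_delay_ms, floor_ms=5.0, ceil_ms=20.0):
    """Hypothetical policy: base target plus smoothed extra delay, clamped."""
    return max(floor_ms, min(ceil_ms, base_target_ms + smoothed_delay_ms))


# The base target is 5% of the 100ms interval, i.e. 5ms.
base_target_ms = 0.05 * 100.0

# Made-up samples of extra delay from contention/interference, in ms.
smoothed = 0.0
for sample_ms in [0.0, 8.0, 8.0, 8.0]:
    smoothed = ewma(smoothed, sample_ms)

target_ms = adapt_target(base_target_ms, smoothed)
```

A small weight keeps a single noisy sample from moving the target much, and the clamp keeps the result within the range already discussed.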

Testing this stuff is HARD! Nobody's ever applied AQM technology to wifi or aggregating macs before, so far as I know. We will never get a perfect result, the goal is merely to get one that is reasonably good across most rates, and across most common numbers of stations.

I struggle to rewrite the description - one thing that is not obvious is that the fq_codel implementation for wifi is one very large set of queues for the entire device, with the per-station pointer within it disambiguating things. This was Michal Kazior's innovation - prior to that I'd been stuck on the idea of a full fq_codel instance created per station, with perhaps 64 queues each (and possibly derived from cake's set-associative version of fq_codel). Now there are tons of queues, and if there is a station collision on a given queue, it gets sorted out. Much better than what I'd had in mind!

It was my first talk on the work, mea culpa!
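A toy model of that structure may make it easier to picture. Everything here is hypothetical: the class and field names are invented, and diverting a colliding packet to a per-station fallback queue is only one assumed way a collision could get "sorted out". There is no codel and no fair scheduling in this sketch, just the shared pool and the station tag.

```python
import zlib
from collections import deque


class SharedFlowQueues:
    """One device-wide pool of flow queues; each busy queue is tagged with a station."""

    def __init__(self, n_queues=4096):
        self.queues = [deque() for _ in range(n_queues)]
        self.owner = [None] * n_queues   # station currently using each queue
        self.fallback = {}               # station -> deque used on collisions (assumed)
        self.collisions = 0

    def enqueue(self, station, flow_key, packet):
        # Hash the flow into the shared pool, regardless of station.
        i = zlib.crc32(flow_key.encode()) % len(self.queues)
        if self.owner[i] is None:
            self.owner[i] = station
        if self.owner[i] == station:
            self.queues[i].append(packet)
        else:
            # Another station already holds this queue: count the collision
            # and divert the packet to this station's fallback queue.
            self.collisions += 1
            self.fallback.setdefault(station, deque()).append(packet)

    def dequeue(self, station):
        """Return the next packet for `station`, or None if nothing is queued."""
        # Linear scan for simplicity; a real scheduler would track active flows.
        for i, q in enumerate(self.queues):
            if self.owner[i] == station and q:
                packet = q.popleft()
                if not q:
                    self.owner[i] = None   # free the queue for any station
                return packet
        fb = self.fallback.get(station)
        return fb.popleft() if fb else None


# Force a collision by using a single queue for two stations.
fq = SharedFlowQueues(n_queues=1)
fq.enqueue("sta-A", "flow-1", "a1")
fq.enqueue("sta-B", "flow-2", "b1")
```

With thousands of queues such collisions should be rare, which is why one big pool can stand in for a separate set of queues per station.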

Still working on rephrasing the troublesome bit in the article, give me a few hours.

Making WiFi fast

Posted Nov 9, 2016 19:31 UTC (Wed) by fratti (guest, #105722) [Link] (2 responses)

Are the WiFi drivers for popular Android phones upstreamed? I see these changes potentially making a big impact on public wifis where phones are very often used, unless I'm misunderstanding this and this is strictly concerning the AP side of things, not the client.

Making WiFi fast

Posted Nov 9, 2016 20:29 UTC (Wed) by pizza (subscriber, #46) [Link]

> Are the WiFi drivers for popular Android phones upstreamed?

Not only are they generally not upstreamed, but many of them still aren't available in source form.

Making WiFi fast

Posted Nov 9, 2016 21:35 UTC (Wed) by sourcejedi (guest, #45153) [Link]

you are misunderstanding this :)

at least in the sense that, I don't think phones with too-large transmit buffers would degrade the experience of other stations on the same network. And such phones will still benefit from improved behaviour of the AP, in the packets they _receive_.

Making WiFi fast

Posted Nov 10, 2016 3:23 UTC (Thu) by Otus (subscriber, #67685) [Link] (1 responses)

Has power use been a consideration and been measured?

I can imagine that pushing more logic to software and handling packets in smaller batches could be detrimental to power consumption.

Making WiFi fast

Posted Nov 10, 2016 23:58 UTC (Thu) by tohojo (subscriber, #86756) [Link]

Not explicitly. However, while we are theoretically doing more work per packet, we have not seen anything that even registers on the CPU usage monitor. Crypto is still offloaded on most hardware, and most of the power save features of WiFi work by letting idle stations sleep for a while to conserve power. That is unchanged; these changes only really kick in when a device is busy.

Making WiFi fast

Posted Nov 10, 2016 16:31 UTC (Thu) by fredex (subscriber, #11727) [Link] (1 responses)

This article seems to be speaking from the point of view of the Linux workstation/laptop/whatever. But what about the wifi router/AP itself? Wouldn't there be similar benefits there? (I'm aware of the firmware that implements codel for certain Netgear routers, but this article sounds as if it has more tweaks in mind than just the ones there.)

Making WiFi fast

Posted Nov 10, 2016 23:54 UTC (Thu) by tohojo (subscriber, #86756) [Link]

Actually, we have been doing most of the testing of this on APs. And the companion airtime fairness work is explicitly AP-centric. But it benefits clients as well; basically, it needs to go where the queues build, and that becomes the WiFi link as soon as the upstream internet speed increases above the effective WiFi rate...

Making WiFi fast

Posted Nov 10, 2016 23:54 UTC (Thu) by lkraav (subscriber, #76113) [Link] (9 responses)

You guys are all missing the most important question: why does Dave have an Estonian word Täht ("star") for his last name?

Making WiFi fast

Posted Nov 11, 2016 1:13 UTC (Fri) by mtaht (guest, #11087) [Link] (7 responses)

Heh. I have varied explanations for my last name.

1) My family's spaceship crashed in Estonia in the late 1880s. Since the tradition there was that serfs' last names were their job (e.g. "Joe Plumber"), we named ourselves after a "star or planet", and have been trying to get off earth ever since.

My hobby is asteroid exploration, as it happens.

2) Ever type in just "Taht" into nearly any word processor, and watch it respell it to "That"? Yep, well over 30% of the documents I have to sign have screwed up my name, and I've watched people really struggle to get past the autocorrect to get it right. I filed bugs with every spell checker maker (in the 80s and 90s) with an algorithm to fix this, none deployed it (Anything without a period in front of it and a capital Taht, don't respell). The bastards!

Thus the umlaut.

3) It's a good test of i18n capability and interop with various protocols and tools.

4) I used to be a death metal fan.

I am glad I could clear this mystery up. Back to WiFi!

Making WiFi fast

Posted Nov 11, 2016 6:55 UTC (Fri) by lkraav (subscriber, #76113) [Link] (1 responses)

ACKACKACKA

(because min. 10 char ACK needed here)

Making WiFi fast

Posted Nov 11, 2016 16:44 UTC (Fri) by mtaht (guest, #11087) [Link]

Of possible interest is the pathological piece I wrote while wrestling with who I am today. This epiphany is why I'm now "dave", and not "mike" taht - and it's where Ham the analogy came from: http://the-edge.blogspot.com/2003_06_08_the-edge_archive....

I've cultivated out of the box thinking ever since, in myself, and everyone.

Making WiFi fast

Posted Nov 17, 2016 10:06 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

> I filed bugs with every spell checker maker (in the 80s and 90s) with an algorithm to fix this, none deployed it (Anything without a period in front of it and a capital Taht, don't respell). The bastards!

WordPerfect had an incredibly simple fix for this, in response to all sorts of problems.

1) Let the user dictionary correct a word to itself, eg "MSc == MSc".

2) The user dictionary is the first one consulted.

3) As soon as any dictionary or rule fired, terminate all further checking.

Fixes ANY and ALL spellcheck problems :-) (Oh, and this was back in WP5.1 for DOS, if not earlier.)

Cheers,
Wol
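The three rules above amount to a first-match lookup. A minimal sketch, with hypothetical dictionary contents and function name (this is not WordPerfect's actual implementation):

```python
def autocorrect(word, user_dict, main_dict):
    """Consult the user dictionary first; the first dictionary that matches wins."""
    for dictionary in (user_dict, main_dict):
        if word in dictionary:
            return dictionary[word]   # stop checking as soon as a rule fires
    return word                       # no rule fired: leave the word alone


# A self-mapping entry in the user dictionary shields the word from the main one.
user_dict = {"Taht": "Taht", "MSc": "MSc"}
main_dict = {"taht": "that", "Taht": "That", "MSc": "Msc"}

protected = autocorrect("Taht", user_dict, main_dict)
corrected = autocorrect("taht", user_dict, main_dict)
```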

Making WiFi fast

Posted Nov 17, 2016 21:19 UTC (Thu) by mtaht (guest, #11087) [Link] (3 responses)

In the vast majority of cases you want the spellchecker to automagically correct taht to that.

Making WiFi fast

Posted Nov 17, 2016 23:44 UTC (Thu) by zlynx (guest, #2285) [Link]

Yes but the vast majority of words which are capitalized and not at the beginning of the sentence are proper nouns which are often spelled differently from regular words.

Examples: Kaycee, Maryanne, Marianne, Marian, Britney, Britnee, Destynie.

Making WiFi fast

Posted Nov 18, 2016 4:47 UTC (Fri) by neilbrown (subscriber, #359) [Link]

> In the vast majority of cases you want the spellchecker to automagically correct taht to that.

Speak for yourself :-)

In every single imaginable case I do *not* want any spellchecker to automagically correct anything.
I'm very happy for possible errors to be highlighted and for probable corrections to be only a gesture away. But if I ever make an error (whch I do), I want it to be *my* error, not a machine's.

Making WiFi fast

Posted Nov 18, 2016 20:49 UTC (Fri) by Wol (subscriber, #4433) [Link]

So, you have "Taht == Taht".

Okay, it won't pick up a mis-spelt "that" at the start of a sentence, but when was that ever "proper" English? :-)

Iirc the complete rule, WordPerfect ignored case if the dictionary entry was all lower case. As soon as you had mixed or upper case, it had to be a perfect match.

Cheers,
Wol

Making WiFi fast

Posted Nov 11, 2016 12:20 UTC (Fri) by osma (subscriber, #6912) [Link]

I'm glad you asked it, since I was wondering the same!

Making WiFi fast

Posted Mar 30, 2020 19:11 UTC (Mon) by mtaht (guest, #11087) [Link]

I just wanted to note that AQL finally landed for the ath10k in openwrt head yesterday, and the results are *wonderful*.


Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds