Bufferbloat: Sacrificing Latency for Throughput
The solution to slow networks has been singular in its approach: Add bandwidth. Got a slow network? Add more pipe, that'll take care of the problem, right? One network engineer says no -- and that solution may be making the problem worse.
There are few things worse than being on a highway somewhere, cruising along at the posted speed limit, then seeing the tell-tale flashes of red brake lights: There's traffic congestion ahead.
Suddenly, you're trapped with hundreds of other cars and trucks, on a highway with no nearby exits, crawling along at walking speed (if you're lucky), and probably with little idea of what's causing the delay. In the city, this problem is mitigated somewhat by the fact that there are more exits enabling you to get off the road and onto city streets that can route you around the problem. Larger cities will have traffic reports on the radio that will alert you to potential problems, so you may be able to avoid the route altogether--if the radio report is timely.
Things are getting better. If I am driving to Chicago, for instance, a passenger can look up a Google Map of the road ahead and tell me if it's green, yellow, or red in time for me to shift to an alternate route. Provided I have the presence of mind to have them look--usually I don't start caring about the traffic ahead until there's congestion. In the absence of a problem, I will happily toodle along only caring about my immediate surroundings until the next problem.
This attitude about vehicular traffic is often mirrored in our approach to network traffic. When things are humming along, we often ignore the health of the network, reacting only to problems that, for whatever reason, we didn't anticipate. In other words, we cruise along without complaint on our high-bandwidth Internet superhighways until there's a traffic jam.
And, like concrete roads, our solution to the problem has been fairly singular in its approach: we add more lanes in the form of bandwidth. Got a slow network? Add more pipe, that'll take care of the problem, right?
One network engineer says otherwise. Bandwidth is only part of the solution, according to Jim Gettys, one of the creators of the X Window System: Reducing network latency is the other half. And, Gettys says, because we have thrown so much technology at improving throughput and bandwidth to keep up with the explosive growth of consumer Internet traffic, we have overridden the basic congestion avoidance protocols that could reduce latency and prevent Internet traffic jams in the first place.
Gettys has coined a term for this problem: bufferbloat.
What is bufferbloat?
You may have heard quite a bit about bufferbloat in recent months, as Gettys' writings about the problem have gotten a lot of people worked up about the continued health of the Internet.
The idea behind bufferbloat is this: our operating systems and routers have large, scaling TCP network buffers that, by design, "trap" large numbers of packets in order to maintain the maximum possible throughput (the actual number of packets that get from point A to point B in a given time). Throughput, it has been reasoned, is the best value for a network to have: The fewer lost packets, the better the integrity of the data coming in. And, as files have gotten larger -- and larger files more numerous and traffic far busier -- more buffers have been added to hardware and software in order to handle the flow in a smoother manner.
But in the obsessive quest for reducing packet loss and smoothing out traffic, Gettys argues, another bad situation has been made worse.
Even though there are a lot of buffers on a given network, there is a chance that one (or more) of those buffers will become full. If that full buffer happens to be near a known (or unknown) network bottleneck and that bottleneck gets saturated, suddenly you have packets running smack dab into a queue, waiting for the buffer in question to empty out and deliver said packets to the next hop in the network. Packets are delayed, and some are lost. In the middle of the network, routers may be able to send traffic by another route, and TCP reacts to lost packets by slowing down (UDP has no such built-in response). But if that full buffer is on the last network hop or two before its destination (or just outside the source of the packets), then there is very little the packets can do but sit through the buffer queue and wait to get passed on.
Buffers in situations like this actually defeat congestion avoidance protocols, because they're impossible to get around.
Like running into a traffic jam on a highway with no exits around.
It's important to note that the problem of bufferbloat is not just that there are more buffers around. It's that there are so many buffers on the network that the odds of one being near a network bottleneck have risen significantly.
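To put rough numbers on the wait behind a full buffer, here is a minimal Python sketch. It assumes a simple first-in, first-out buffer that drains at the speed of the bottleneck link; the function name, the 1 Mbit/s uplink, and the buffer sizes are hypothetical figures chosen for illustration, not measurements from Gettys' work.

```python
def queue_delay_seconds(buffer_bytes: int, link_bits_per_second: float) -> float:
    """Time a packet waits behind a full FIFO buffer draining at link speed."""
    return buffer_bytes * 8 / link_bits_per_second


# Hypothetical bottleneck: a 1 Mbit/s uplink (an assumption for illustration).
UPLINK_BPS = 1_000_000

if __name__ == "__main__":
    # Hypothetical buffer sizes; real devices vary widely.
    for buffer_kib in (16, 64, 256, 1024):
        delay = queue_delay_seconds(buffer_kib * 1024, UPLINK_BPS)
        print(f"{buffer_kib:>5} KiB buffer -> {delay * 1000:8.1f} ms of added latency when full")
```

Under these assumptions the wait grows in direct proportion to the buffer size: the link delivers the same throughput either way, but a packet arriving behind a larger full buffer sits in the queue longer.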