Networking 101: Understanding BGP Routing
Updated by Paul Rubens.
The Border Gateway Protocol (BGP) is the routing protocol of the Internet, used to route traffic across the Internet. For that reason, it's a pretty important protocol, and it can also be the hardest one to understand.
From our overview of Internet routing, you should realize that routing in the Internet is comprised of two parts: the internal fine-grained portions managed by an IGP such as OSPF, and the interconnections of those autonomous systems (AS) via BGP.
Who needs to understand BGP?
BGP is relevant to network administrators of large organizations which connect to two or more ISPs, as well as to Internet Service Providers (ISPs) who connect to other network providers. If you are the administrator of a small corporate network, or an end user, then you probably don't need to know about BGP.
- The current version of BGP is BGP version 4, based on RFC4271.
- BGP is the path-vector protocol that provides routing information for autonomous systems on the Internet via its AS-Path attribute.
- BGP is a Layer 4 protocol that sits on top of TCP. It is much simpler than OSPF, because it doesn’t have to worry about the things TCP will handle.
- Peers that have been manually configured to exchange routing information will form a TCP connection and begin speaking BGP. There is no discovery in BGP.
- Medium-sized businesses usually get into BGP for the purpose of true multi-homing for their entire network.
- An important aspect of BGP is that the AS-Path itself is an anti-loop mechanism. Routers will not import any routes that contain themselves in the AS-Path.
Why do you need to understand BGP?
When BGP is configured incorrectly, it can cause massive availability and security problems, as Google discovered in 2008 when its YouTube service became unreachable to large portions of the Internet. What happened was that, in an effort to ban YouTube in its home country, Pakistan Telecom used BGP to route YouTube's address block into a black hole. But, in what is believed to have been an accident, this routing information somehow got transmitted to Pakistan Telecom's Hong Kong ISP and from there got propagated to the rest of the world. The end result was that most of YouTube's traffic ended up in a black hole in Pakistan.
More sinisterly, 2003 saw a number of BGP hijack attacks, where modified BGP route information allowed unknown attackers to redirect large blocks of traffic so that it travelled via routers in Belarus or Iceland before it was transmitted on to its intended destination.
Clearly, BGP is significant. Here we'll provide a short overview of how BGP works, along with the problems it solves and causes.
First a little terminology. In the world of BGP, each routing domain is known as an autonomous system, or AS. What BGP does is help choose a path through the Internet, usually by selecting a route that traverses the least number of autonomous systems: the shortest AS path.
You might need BGP, for example, if your corporate network is connected to two large ISPs. To use BGP you would need an AS number, which you can get from the American Registry of Internet Numbers (ARIN).
Once BGP is enabled, your router will pull a list of Internet routes from your BGP neighbors, who in this case will be your two ISPS. It will then scrutinize them to find the routes with the shortest AS paths. These will be put into the router's routing table. (If you only connect to a single ISP then you don't need BGP. That's because there's only one path to the Internet, so there's no need for a routing protocol to select the best path.)
Generally, but not always, routers will choose the shortest path to an AS. BGP only knows about these paths based on updates it receives.
Unlike Routing Information Protocol (RIP), a distance-vector routing protocol which employs the hop count as a routing metric, BGP does not broadcast its entire routing table. At boot, your peer will hand over its entire table. After that, everything relies on updates received.
Route updates are stored in a Routing Information Base (RIB). A routing table will only store one route per destination, but the RIB usually contains multiple paths to a destination. It is up to the router to decide which routes will make it into the routing table, and therefore which paths will actually be used. In the event that a route is withdrawn, another route to the same place can be taken from the RIB.
The RIB is only used to keep track of routes that could possibly be used. If a route withdrawal is received and it only existed in the RIB, it is silently deleted from the RIB. No update is sent to peers. RIB entries never time out. They continue to exist until it is assumed that the route is no longer valid.
BGP path attributes
In many cases, there will be multiple routes to the same destination. BGP therefore uses path attributes to decide how to route traffic to specific networks.
The easiest of these to understand is Shortest AS_Path. What this means is the path which traverses the least number of AS "wins."
Another important attribute is Multi_Exit_Disc (Multi-exit discriminator, or MED). This makes it possible to tell a remote AS that if there are multiple exit points on to your network, a specific exit point is preferred.
The Origin attribute specifies the origin of a routing update. If BGP has multiple routes, then origin is one of the factors in determining the preferred route.
To get a true sense of how BGP works, it's important to spend some time talking about the issues that plague the Internet.
First, we have a very big problem with routing table growth. If someone decides to deaggregate a network that used to be a single /16 network, they could potentially start advertising hundreds of new routes. Every router on the Internet will get every new route when this happens. People are constantly pressured to aggregate, or combine multiple routes into a single advertisement. Aggregation isn't always possible, especially if you want to break up a /19 into two geographically separate /20s. Routing tables are approaching 200,000 routes now, and for a time they were appearing to grow exponentially.
Second, there is always a concern that someone will "advertise the Internet." If some large ISP's customer suddenly decides to advertise everything, and the ISP accepts the routes, all of the Internet's traffic will be sent to the small customer's AS. There's a simple solution to this. It's called route filtering. It's quite simple to set up filters so that your routers won't accept routes from customers that you aren't expecting, but many large ISPs will still accept the equivalent of "default" from peers that have no likelihood of being able to provide transit.
Finally, we come to flapping. BGP has a mechanism to "hold down" routes that appear to be flaky. Routes that flap, or come and go, usually aren't reliable enough to send traffic to. If routes flap frequently, the load on all Internet routes will increase due to the processing of updates every time someone disappears and reappears. Dampening will prevent BGP peers from listening to all routing updates from flapping peers. The amount of time one is in hold-down increases exponentially with every flap. It's annoying when you have a faulty link, since it can be more than an hour before you can get to many Internet sites, but it is very necessary.
This quick discussion of BGP should be enough to get you thinking the right way about the protocol but is by no means comprehensive. Spend some time reading the RFCs if you're tasked with operating a BGP router. Your peers will appreciate it.
Photo courtesy of Shutterstock.