Sunday, 13 March 2011

BGP part 2

The second post today is also about BGP. I was asked why, in a transit AS, we have to run both an IGP and iBGP. Why can't we just pick one routing protocol? Let's find the answer!
In the following picture we can see the transit area, which is connected to two other ASes. Now let's focus on three different situations:
  • routers in the transit area are running only IGP
  • routers in the transit area are running only iBGP
  • routers in the transit area are running both IGP and iBGP

In the first option, routers in the transit area are running only IGP. Routers R1 and R5 have established BGP sessions to routers R6 and R7, respectively. If you are running only IGP between all the other routers within the transit area, you have to redistribute all received BGP prefixes into the IGP on R1 and R5. Easy? Maybe it is easy, but today the BGP routing table has about 380 000 prefixes. That number of prefixes will kill the IGP, and the routers will run short of CPU power. So in practice this solution is out of the question! You can of course establish an iBGP session between R1 and R5. Then R1 and R5 will have all BGP prefixes, even without redistributing BGP into the IGP. However, when R1 wants to send a packet to a destination behind R7 via R5, R1 will send the packet to either R2, R3 or R4. And because none of these routers has a route to that destination (again, we don't redistribute BGP into the IGP), the packet will be dropped.
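
To make the dropped-packet case concrete, here is a minimal Python sketch of hop-by-hop forwarding. The picture is not reproduced here, so the chain R1 - R2 - R3 - R4 - R5, the prefix name `R7-net` and the `forward` helper are all assumptions made for illustration, not the real topology or any router's real API.

```python
def forward(tables, start, prefix, max_hops=16):
    """Follow per-router forwarding tables hop by hop.

    Returns (path, delivered). A router with no entry for the prefix
    drops the packet, exactly like R2/R3/R4 in the text.
    """
    path = [start]
    current = start
    for _ in range(max_hops):
        next_hop = tables.get(current, {}).get(prefix)
        if next_hop is None:
            return path, False      # no route: packet is dropped here
        if next_hop == "local":
            return path, True       # destination network is attached here
        path.append(next_hop)
        current = next_hop
    return path, False              # safety stop in case of a loop


# Assumed topology (hypothetical): R1 - R2 - R3 - R4 - R5 - R7.
# "R7-net" stands for any external prefix learned from R7.

# iBGP only between R1 and R5, no redistribution into the IGP:
# R1 knows the prefix, but the core routers do not.
edge_only_bgp = {
    "R1": {"R7-net": "R2"},   # BGP next hop R5, reached via R2 thanks to the IGP
    "R5": {"R7-net": "R7"},
    "R7": {"R7-net": "local"},
}

# Every router in the transit area also knows the prefix via iBGP.
bgp_everywhere = {
    "R1": {"R7-net": "R2"},
    "R2": {"R7-net": "R3"},
    "R3": {"R7-net": "R4"},
    "R4": {"R7-net": "R5"},
    "R5": {"R7-net": "R7"},
    "R7": {"R7-net": "local"},
}

print(forward(edge_only_bgp, "R1", "R7-net"))   # (['R1', 'R2'], False)
print(forward(bgp_everywhere, "R1", "R7-net"))  # (['R1', 'R2', 'R3', 'R4', 'R5', 'R7'], True)
```

In the first case the packet dies on R2, the first router that has never heard of the prefix. In the second case every hop has an entry, so the packet reaches R7.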

In the second option we are using only iBGP. We redistribute connected routes into BGP, so every router can ping whatever it wants. But BGP is designed to carry a huge number of prefixes, not to find the best path inside an AS. In its path selection, BGP doesn't take link speed or link delay into account, which is the strength of link-state IGPs. Moreover, BGP takes a relatively long time to converge after a failure in the network. So this solution may work, but it is not optimal.

Finally, we can use both IGP and iBGP in our core network. And this solution is almost perfect! Almost, because we have to deploy BGP sessions between every single router (we can use MPLS to avoid this, however this technology is out of scope of this post). Earlier we saw what happens without full-mesh BGP sessions (R2, R3 or R4 drop all packets for unknown destinations). In this solution R1 advertises all BGP prefixes that it learned from R6 to the transit network (so to R2, R3 and R4) using BGP, and R5 does the same with the prefixes it learned from R7. So all other routers that are running BGP know all these prefixes, and thanks to the IGP all other routers in the transit area know exactly how to reach the exit point (R1 or R5). In case of any failure in the core network, each router within the transit area is able to update its routing table (and find the new route to the exit point) using the IGP, and this is much faster than BGP convergence.
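
The division of labour described above (BGP says which exit router to use, the IGP says how to get there) can be sketched in a few lines of Python. This is only an illustration under assumptions: the links and costs, the prefix name `R6-net` and the helpers `igp_first_hops` and `resolve` are hypothetical, and the IGP is modelled as a plain Dijkstra shortest-path computation.

```python
import heapq


def igp_first_hops(links, source):
    """Dijkstra over IGP link costs.

    links maps (router_a, router_b) -> cost, links are bidirectional.
    Returns {destination: first hop on the cheapest path from source}.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    best = {source: 0}
    first_hop = {}
    queue = [(0, source, None)]
    while queue:
        dist, node, hop = heapq.heappop(queue)
        if dist > best.get(node, float("inf")):
            continue                      # stale queue entry, skip it
        if hop is not None:
            first_hop.setdefault(node, hop)
        for neighbor, cost in graph.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                # Leaving the source, the neighbor itself is the first hop.
                next_first = neighbor if hop is None else hop
                heapq.heappush(queue, (new_dist, neighbor, next_first))
    return first_hop


def resolve(bgp_table, links, router, prefix):
    """Recursive lookup: BGP gives the exit router, the IGP gives the next hop."""
    exit_router = bgp_table[prefix]
    return igp_first_hops(links, router).get(exit_router)


# Assumed core topology and costs (hypothetical, the picture is not available).
links = {
    ("R1", "R2"): 10, ("R2", "R5"): 10,
    ("R1", "R3"): 10, ("R3", "R5"): 20,
    ("R1", "R4"): 20, ("R4", "R5"): 20,
}

# What R5 learned over iBGP: the prefix behind R6 exits through R1.
bgp_table = {"R6-net": "R1"}

before = resolve(bgp_table, links, "R5", "R6-net")

# The R2-R5 link fails. Only the IGP view changes, the BGP table is untouched.
links_after_failure = {k: v for k, v in links.items() if k != ("R2", "R5")}
after = resolve(bgp_table, links_after_failure, "R5", "R6-net")

print(before, after)   # R2 R3
```

Note that `bgp_table` is identical before and after the failure. Only the IGP part is recomputed, which is why the core can reroute without waiting for BGP.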

2 comments:

  1. "And this solution is almost perfect! Almost, because we have to deploy BGP sessions between every single routers (we can use MPLS to avoid this, however this technology is out of scope of this post)"

    You can use route reflectors to avoid creating a BGP session between every pair of routers in iBGP.

  2. Yes, you are right. Maybe it would be better to say that you have to run the BGP process on every single router within your core network.
