One of the best tools I know of when beginning to troubleshoot a possible connectivity problem is the ARP table. Every device that speaks IP over ethernet has an ARP table, and it is dead useful. By looking at the ARP table one can divide and conquer at the transition point between ethernet and IP.
ARP requests are broadcasts generated by all ethernet/IP speaking devices (DTEs). Because of the layered nature of the TCP/IP protocol stack, a device needs a way to determine the MAC address (Layer 2 or Link Layer) that belongs to a given IP address (Layer 3 or Network Layer). A device asks for that MAC address by sending a broadcast over Layer 2 to all devices on the LAN saying, "Whoever has this IP address, send me your MAC address." The device with that IP address responds to the sender with an ARP reply containing its own MAC address and IP address. From the information learned in the ARP replies, the device builds a table of IP to MAC address bindings. Once a binding is in the table, frames can be forwarded to that IP without the need to ARP for it again.
ARP gives us the essential MAC address to IP address bindings that make IP routing (and other protocols) work. If there is no entry in the ARP table for a device, there is no IP connectivity to it. Ethernet and IP connectivity can be validated immediately at the router or the computer's gateway. This gives an engineer a point to begin troubleshooting either the IP side or the ethernet side of the network. If ethernet is the issue, we can begin to troubleshoot trunks, VLANs, STP, other link layer protocols, and physical connectivity. If it's determined to be an IP problem, we can begin looking at routing/forwarding tables, ACLs, and other higher level protocols.
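To make the divide and conquer idea concrete, here is a minimal Python sketch. It is not how any operating system stores its ARP cache; `arp_table`, `learn`, and `where_to_start` are hypothetical names, and the addresses are documentation examples.

```python
# Hypothetical, simplified ARP cache: maps an IP address to a MAC address.
arp_table = {}

def learn(ip, mac):
    """Record an IP-to-MAC binding, as if learned from an ARP reply."""
    arp_table[ip] = mac.lower()

def where_to_start(ip):
    """Suggest which side of the ethernet/IP boundary to troubleshoot first."""
    if ip in arp_table:
        # Layer 2 resolved the neighbor, so look at routing, ACLs, etc.
        return "layer3"
    # No binding: start with trunks, VLANs, STP, and physical connectivity.
    return "layer2"

learn("192.0.2.10", "00:1B:54:AA:BB:CC")
print(where_to_start("192.0.2.10"))  # layer3
print(where_to_start("192.0.2.20"))  # layer2
```

On a real device you would read the table with the platform's own tooling rather than build it yourself; the point is only the decision that follows from the lookup.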
Default Networking
Saturday, June 2, 2012
Wednesday, May 30, 2012
BGP's Origin Attribute and What It Means to You
In BGP all prefixes are advertised via Network Layer Reachability Information (NLRI) and are assigned an Origin Attribute, one of the many BGP attributes. It can be confusing, though, as to what the Origin Attribute means and where it comes from.
The Origin Attribute gives a clue into how the route was generated by the BGP speaker originating the NLRI for a prefix. Hence the name 'Origin Attribute'. It is a well-known mandatory attribute that should not be changed by any other BGP speaking router. The Origin Attribute has three possible values: IGP, EGP, or Incomplete. In Cisco's BGP RIB, these are represented by 'i', 'e', and '?' respectively, and RFC 4271 assigns them the values 0, 1, and 2. But you may wonder where or how the router decides which origin value to advertise in the NLRI.
When the origin attribute of IGP is advertised by a BGP speaker, it means the prefix originates directly from within the AS. The BGP speaker advertises the NLRI through a network statement that has an exact match with a prefix that already exists in its IP routing table. EGP is exactly what it sounds like: the prefix was learned from the Exterior Gateway Protocol (EGP), the predecessor to BGP. And finally, the Incomplete origin is advertised by a BGP speaker for routes redistributed from another source such as OSPF, EIGRP, static, etc. The point of an Incomplete origin value is to let you know that BGP does not know how the prefix was originated.
Now when we are talking about aggregating many prefixes together in BGP, some additional rules apply. If at least one contributing route has an Incomplete origin, the entire aggregate's NLRI will also be marked with an Incomplete origin value. The same rule applies if EGP and IGP prefixes are aggregated: if one prefix has an EGP origin, the entire aggregate will be advertised with an EGP origin value. And finally, an aggregate whose prefixes all have IGP origins will be advertised with the IGP value.
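The aggregation rules above boil down to "the worst origin wins". Here is a small Python sketch of that rule, using the 0/1/2 values mentioned earlier; the function and dictionary names are my own and not taken from any router's software.

```python
# Origin codes as described above: IGP = 0, EGP = 1, Incomplete = 2.
ORIGIN_CODE = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}
CODE_ORIGIN = {code: name for name, code in ORIGIN_CODE.items()}

def aggregate_origin(origins):
    """Return the origin of an aggregate from its contributing routes' origins."""
    if not origins:
        raise ValueError("an aggregate needs at least one contributing route")
    # The aggregate inherits the highest (least preferred) origin code.
    return CODE_ORIGIN[max(ORIGIN_CODE[o] for o in origins)]

print(aggregate_origin(["IGP", "IGP"]))                # IGP
print(aggregate_origin(["IGP", "EGP"]))                # EGP
print(aggregate_origin(["IGP", "EGP", "INCOMPLETE"]))  # INCOMPLETE
```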
This is important in the BGP decision making process, because when the AS Path length to a destination prefix is the same, the path with the lowest origin attribute value is preferred. As already stated, you should not change the Origin Attribute of learned prefixes (unless you do some things ISPs do that are against the RFC); so if you need to adjust the BGP RIB, do it through an input policy that changes a different attribute such as AS Path or local preference.
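Here is a deliberately stripped down Python sketch of that tie-break. It models only two steps (AS Path length, then origin) and ignores everything else in the real decision process such as weight, local preference, and MED; the `Path` class and `prefer` function are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Path:
    next_hop: str
    as_path: tuple   # sequence of AS numbers
    origin: int      # 0 = IGP, 1 = EGP, 2 = Incomplete

def prefer(a, b):
    """Pick between two paths using AS Path length first, then lowest origin."""
    return min((a, b), key=lambda p: (len(p.as_path), p.origin))

via_redistribution = Path("10.0.0.1", (65001, 65002), 2)
via_network_stmt = Path("10.0.0.2", (65003, 65004), 0)

# Same AS Path length, so the lower origin value (IGP) wins.
print(prefer(via_redistribution, via_network_stmt).next_hop)  # 10.0.0.2
```

A shorter AS Path still beats a better origin in this model, which is why origin only matters once the AS Path lengths are equal.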
Thursday, May 24, 2012
Future of STP, and Layer2 Multipathing?
Many changes are happening in data center architectures because of various Layer2 problems. Layer2 Multipathing seems to be the new buzzword that people are talking about to solve these problems. So I have been doing a bit of research to familiarize myself with terms and the technology in theory.
Layer2 Multipathing is essentially a replacement for the STP 802.1D protocol that has ruled the Ethernet switch market for a very long time. Essentially, what L2 Multipathing does is turn switching into link state routing. This is needed because of the scalability problems, suboptimal path selection, and inefficient use of network links with STP as we know it today. That's right, no more BPDUs or MAC flooding for switches to learn their place in the network.
There are two flavors of Layer2 Multipathing that are gaining traction: TRILL and SPB. TRILL, which stands for Transparent Interconnection of Lots of Links, was developed by the IETF. SPB, or Shortest Path Bridging, was developed by the IEEE and is designated 802.1aq.
TRILL uses the routing protocol IS-IS to calculate a layer 2 path through the LAN. Any Layer2 frame that a switch running TRILL (known as an RBridge) receives at its edge is encapsulated within a TRILL frame. A lookup of the destination RBridge is performed and the frame is sent across the TRILL domain to that RBridge. Once the frame is received, it is de-encapsulated and passed to the end host. RBridges learn where each other are in the network via IS-IS hellos, and IS-IS calculates the shortest path between each pair of RBridges.
SPB again utilizes IS-IS to determine a shortest path through the LAN, and a switch that runs SPB builds a Shortest Path Tree or SPT to determine the optimal forwarding path. Essentially this is a MAC-in-MAC (like PBB) or a Q-in-Q like solution. A designated MAC or VLAN (I think of it as a Provider VLAN or the outer tag) is assigned for each switch and received frames are encapsulated within it. The SPT is looked up for the destination, and the frame is switched (or forwarded) to the edge switch, which de-encapsulates it and sends it to the end device.
As can be seen, both protocols are very similar in the way they work, but their encapsulations are very different in operation. As with routed protocols, we can now load balance frames across multiple equal cost links, perform PBR type functions, or influence forwarding decisions just as if it were routing. The TRILL and SPB frames also keep track of hop counts to implement an IP like loop prevention/broadcast storm control. I am very excited to see how these protocols develop among the vendors, and how they will be implemented within both data center and service provider environments.
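To show what a link state view buys us over a single spanning tree, here is a rough Python sketch of a shortest path calculation that keeps every equal cost first hop instead of just one. It is a generic Dijkstra variant, not the actual IS-IS, TRILL, or SPB computation; the four-switch topology, the link costs, and the function name are made up for illustration, and link costs are assumed to be positive.

```python
import heapq

def equal_cost_next_hops(graph, source):
    """Return {node: (cost, first hops)} over all shortest paths from source."""
    best = {source: (0, set())}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, link_cost in graph[node].items():
            new_cost = cost + link_cost
            # Leaving the source, the first hop is the neighbor itself;
            # otherwise inherit the first hops used to reach this node.
            hops = {neighbor} if node == source else best[node][1]
            if neighbor not in best or new_cost < best[neighbor][0]:
                best[neighbor] = (new_cost, set(hops))
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == best[neighbor][0]:
                best[neighbor][1].update(hops)  # another equal cost path
    return best

# Hypothetical square topology: A reaches D through either B or C.
topology = {
    "A": {"B": 1, "C": 1},
    "B": {"A": 1, "D": 1},
    "C": {"A": 1, "D": 1},
    "D": {"B": 1, "C": 1},
}
cost, hops = equal_cost_next_hops(topology, "A")["D"]
print(cost, sorted(hops))  # 2 ['B', 'C']
```

With STP one of those two links toward D would be blocking; here both first hops are usable for load balancing.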
Tuesday, May 22, 2012
Cisco ISE and Wireless BYOD
A recent opportunity came up to deploy Cisco Identity Services Engine or ISE for a client in support of BYOD. The goal for our client was to provide a way for persons belonging to a specific AD group (a BYOD group) to have access to the outside internet via their wireless mobile devices using their internal AD credentials, but without access to internal network resources from those same devices. We attempted to do this via the WLAN controllers, but what we discovered was that the WLAN controllers could not query an AD group, which was a limitation of the controllers' software. Asking Cisco why led us to Cisco's ISE product. We downloaded a 90 day licensed version of ISE and had it installed on the network.
From initially logging in we could tell that ISE was going to be a monster of a program, with more buttons and knobs than we could wrap our heads around. One of the biggest things we were looking for was the ability to limit access to a WLAN SSID based on an AD group. By using ISE's ability to match RADIUS attributes with regular expressions, we were able to isolate the SSID for authentication. We were also easily able to query AD for group membership and build a matching query string, something we had struggled with for weeks on the WLCs. Once we built a policy on those criteria we were in business. Essentially, the policy said that if you are connecting to this SSID but you are not in the approved AD group, you are not getting on the network with that device. Exactly what the client was looking for.
So we accomplished what we set out to do in short order, but ISE provided more features to meet the client's WLAN security needs. At the time, the client had MAC address filtering on the WLCs for company assets, which is a very manual process of placing each WLAN NIC's MAC address in the controllers for each individual PC. This large administrative overhead was implemented because unauthorized devices were accessing the corporate wireless and corporate resources. It is a less than desirable solution because of the potential for MAC address spoofing attacks.
The next request from the client was to figure out if ISE could identify which machines were company assets and which were not, and to allow those assets access to the corporate wireless and deny access to the rest automatically, so that they could decommission the WLAN MAC address filtering. This one took quite a bit of time to figure out. Again we matched on the RADIUS attributes, and tried to match on domain membership. We failed again and again and tried many scenarios. Finally, we decided to enable machine authentication through Cisco's AnyConnect Client, which meant the machine had to pass authentication against AD. We ran into other problems though: the machine would authenticate, and the user would authenticate, but they were not tied together. ISE provided an answer. We were able to build a policy that essentially made ISE check whether the machine the user was logging into had authenticated against AD before letting the user on. If the machine was authenticated and the user passed AD authentication, then they were let on to the corporate wireless, but if the machine could not be authenticated then the user was also denied access, even with valid user credentials.
It was a very slick solution, and this was only for wireless access. ISE has many other granular interrogation abilities when implemented on the wire with dot1X authentication, and great profiling and posturing abilities. But I can say that ISE is not an easy solution. You have to really know what your goal is and have an idea of how to accomplish it. You also have to understand the protocols being used and what information is contained within them. ISE also takes time. Policies can break other policies, and like an ACL the first match wins, which could leave holes for malicious users. We spent days building policies that seemed to work, and many other days testing to ensure those policies worked in all cases, only to find loopholes that violated corporate security policy. We are currently working on the other capabilities of ISE for our client and I will share our impressions and findings as we progress through more challenges and opportunities.
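The "first match wins" behavior is easier to reason about with a toy model. The Python below is only a sketch of ordered rule evaluation; it is not ISE's policy language or API, and the rule names, SSIDs, AD group name, and session fields are all hypothetical.

```python
def authorize(session, rules):
    """Return (rule name, result) for the first rule whose condition matches."""
    for name, condition, result in rules:
        if condition(session):
            return name, result  # first match wins, later rules are never seen
    return "default", "deny"

# Hypothetical rules modeled on the two policies described above.
rules = [
    ("byod-internet-only",
     lambda s: s["ssid"] == "BYOD" and "BYOD-Users" in s["ad_groups"],
     "permit-internet"),
    ("corp-machine-and-user",
     lambda s: (s["ssid"] == "CORP"
                and s["machine_authenticated"]
                and s["user_authenticated"]),
     "permit-corporate"),
]

personal_phone = {"ssid": "CORP", "ad_groups": ["Staff"],
                  "machine_authenticated": False, "user_authenticated": True}
print(authorize(personal_phone, rules))  # ('default', 'deny')
```

Reordering the list or adding a broad rule near the top changes the outcome for every session below it, which is exactly how one policy ends up breaking another.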
Sunday, May 20, 2012
Redistributing eBGP into anything!?
I just witnessed a very surprising conversation on NANOG. The person was asking if he could redistribute his full BGP table into OSPF. Of course, the rest of the forum was able to help this poor chap out.
This is a problem on several levels. First, OSPF is designed as an IGP meant to find the best path to a prefix within an AS, not across the entire internet, which is what BGP is designed for. Therefore, it is not recommended to inject an EGP into an IGP such as OSPF. On the other hand, BGP, or eBGP more specifically, is designed for route control, not best path discovery; in other words, it gives the AS control of what routes get advertised out of the AS, what routes get advertised into the AS, and what path to take for egress and ingress traffic.
Second, a full internet BGP table is 400,000+ prefixes. I cannot imagine a router that can run SPF, or any other IGP's algorithm, over that many prefixes and make it through convergence, not to mention whether there is one area or several areas within OSPF. I suppose it would depend on the amount of memory and processing power of the router. Maybe there is a router that can, but could you imagine maintaining a route table with 400,000 prefixes? It would be an administrative nightmare.
Unless you are an ISP, keep the EGP at the edge or within the core for specialty cases. Also, unless you are multi-homed with 2 or more ISPs there is not really a reason to have a full BGP table. I do acknowledge that there are reasons to redistribute individual routes into the IGP for path selection within the AS, but there are other ways of accomplishing this within the IGP. I can also imagine redistributing a default route learned from BGP into the IGP, but OSPF does not support this feature in Cisco IOS, though it is supported in JUNOS.
If you have had experience redistributing BGP into an IGP I would be very interested in hearing how you have used it. I have myself redistributed a default route into EIGRP within a multi-homed environment, done some interesting things with MPLS from a practical perspective, and done other things in the lab.
Saturday, May 19, 2012
Intra-area and External OSPF LSAs Explained (Briefly)
So why segment your OSPF AS into various areas all connected to a backbone area (a.k.a. area 0)? Why not just keep it one simple backbone area? The answer is to contain Type-1 (Router) and Type-2 (Network) LSA propagation and to summarize at border routers, allowing for faster convergence within each area. LSAs help us do that, but we need to know what they are and what they do for us.
First a bit about LSAs and OSPF. Link State Advertisements or LSAs are generated for every router, for every transit network (both broadcast and non-broadcast multiaccess networks: ethernet, frame-relay, etc), for every prefix from another area, and for every prefix that is redistributed into an OSPF network from an external routing source. These LSAs are flooded within an area, and some into other areas as we will see. All this information is shared between OSPF speaking routers. It is gathered and placed into the OSPF database, called the Link State Database or LSDB, on each router. Once a router has all the information about every prefix everywhere within its AS, it runs the Shortest Path First (SPF) algorithm against the LSDB to calculate the shortest or 'best' path to each network prefix. This path is the one with the lowest cost, calculated from a metric that is derived from the bandwidths of the links and the costs advertised by all other routers. Each router calculates the best path to each prefix from its own perspective, essentially building a tree in which no branch crosses any other branch, thus creating a loop free topology. The problem is that the larger the database or topology is, the more processing and time it takes to calculate the shortest path, hence slower convergence. Depending on the router's capabilities and the number of SPF calculations to process, if the database is too large I have seen routers never converge.
This is where OSPF's 2 tier hierarchy, areas, and LSAs come into play. Area Border Routers (ABRs) are routers that connect 2 or more areas. All areas must be connected to the backbone area, and are connected via an ABR (the exception being an area attached through a virtual link). The ABRs do not forward Type-1 or Type-2 LSAs into adjacent areas. And this is good. Instead the ABRs take the prefix information learned from the connected areas and flood it as Type-3 (Net Summary) LSAs into each adjacent area. Instead of calculating the SPF to the advertising router within another area, all other routers in the area use the cost the ABR provides along with the cost to get to the ABR (not always though, see stubby networks below). Now it would not make sense to have every subnet connected to every router in an area advertised as a Type-3 LSA into the adjacent areas. This would do nothing to reduce the number of LSAs, so at the ABRs we should summarize the prefixes contained within that area. This summarization can effectively reduce the Type-3 LSA advertisements into adjacent areas to one Type-3 LSA, if the IP numbering schema is well designed.
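As a quick illustration of why the numbering schema matters, the Python below uses the standard library's `ipaddress` module to collapse four contiguous subnets into a single summary. The prefixes are made up, and this is only the arithmetic behind summarization; on a real ABR the summary range is something you configure, not something computed for you.

```python
import ipaddress

# Hypothetical subnets living inside one well-numbered area.
area_prefixes = [
    ipaddress.ip_network("10.1.0.0/24"),
    ipaddress.ip_network("10.1.1.0/24"),
    ipaddress.ip_network("10.1.2.0/24"),
    ipaddress.ip_network("10.1.3.0/24"),
]

# Contiguous, aligned subnets collapse into one covering prefix.
summaries = list(ipaddress.collapse_addresses(area_prefixes))
print(summaries)  # [IPv4Network('10.1.0.0/22')]
```

Four Type-3 LSAs become one. If the same area also held, say, 10.7.9.0/24, no single tidy summary would cover it without also claiming address space the area does not own.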
The fun begins with two ASs or autonomous systems exchanging routes with OSPF (via redistribution). For this to occur we need an Autonomous System Boundary Router or ASBR. The ASBR produces a Type-5 (AS-External) LSA for each external route redistributed into OSPF and floods them throughout the AS (again, see stubby networks below). The ASBR can advertise the LSA with a Type-1 or Type-2 External metric. Essentially, a Type-1 External metric is calculated from the advertising ASBR's metric plus the metric to get to the ASBR. In other words, a router will receive an LSA from an ASBR and add its own cost to reach the ASBR to the metric received from the ASBR for the Type-1 External prefix. Type-2 External metrics are much easier: the metric that the ASBR advertises does not change throughout the AS. Type-1s are preferred over Type-2s by the routers. Again like Type-3 LSAs, Type-5 LSAs can be summarized to reduce LSA propagation throughout the AS.
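The difference between the two external metric types is just arithmetic, so here it is as a tiny Python sketch. The function name and the example numbers are mine, chosen only to show the calculation described above.

```python
def external_cost(metric_type, asbr_metric, cost_to_asbr):
    """Cost a router assigns to an external prefix for a given metric type."""
    if metric_type == 1:
        # Type-1: advertised metric plus this router's cost to reach the ASBR.
        return asbr_metric + cost_to_asbr
    if metric_type == 2:
        # Type-2: the advertised metric is used unchanged everywhere in the AS.
        return asbr_metric
    raise ValueError("metric_type must be 1 or 2")

# ASBR advertises metric 20; this router is 15 away from the ASBR.
print(external_cost(1, 20, 15))  # 35
print(external_cost(2, 20, 15))  # 20
```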
So what happens if the ASBR is in a different area than the backbone area? The problem is how a router in another area is going to calculate how to get to the ASBR if all it receives is a Type-5 LSA. This is where a Type-4 LSA helps. An ABR (remember it connects 2 or more areas) helps out by letting everyone in the area know the cost of getting to the ASBR from the ABR.
There is one other LSA that needs to be touched on, and that is the Type-7 LSA or NSSA External. Some say NSSA is Cisco proprietary, but it is an OSPF extension described in RFC 1587. An NSSA, or Not-So-Stubby-Area, is an area that is confused. But to know why it is confused we need to know what a stubby network is. In a stubby network, the ABR advertises no Type-5 LSAs into the area, and will instead advertise a default route as a Type-3 LSA. This again is to reduce the number of LSAs propagated to the stub area. An ABR may or may not send other Type-3 LSAs as well, depending on the configuration. A stubby network which does not receive Type-3 LSAs, with the exception of the default route, is called a totally stubby network. An NSSA is a stub network (which doesn't get Type-5 LSAs, and may not get Type-3 LSAs) that is connected to another AS whose prefixes are being redistributed into OSPF within the stub network; hence the could-have-been stubby network is not so stubby any more. These prefixes are advertised via Type-7 LSAs. However, when the ABR re-advertises the prefixes received in Type-7 LSAs, it advertises them into the backbone area as Type-5 LSAs.
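To keep the area types straight, here is a small Python lookup of which LSA types each kind of area carries. It is a simplification under my own assumptions: the area type names and function are hypothetical, Type-4 is left out of stub areas since it is only useful alongside Type-5, and vendor specific variants are ignored.

```python
# LSA types carried inside each kind of area (simplified).
ALLOWED_LSA_TYPES = {
    "normal": {1, 2, 3, 4, 5},
    "stub": {1, 2, 3},            # no Type-5; default route arrives as Type-3
    "totally_stubby": {1, 2, 3},  # Type-3 limited to the default route
    "nssa": {1, 2, 3, 7},         # no Type-5, but local externals as Type-7
}

def lsa_allowed(area_type, lsa_type, is_default_route=False):
    """Return True if an LSA of this type would be carried in the area."""
    if area_type == "totally_stubby" and lsa_type == 3:
        return is_default_route  # only the Type-3 default route gets in
    return lsa_type in ALLOWED_LSA_TYPES[area_type]

print(lsa_allowed("stub", 5))                  # False
print(lsa_allowed("nssa", 7))                  # True
print(lsa_allowed("totally_stubby", 3))        # False
print(lsa_allowed("totally_stubby", 3, True))  # True
```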
Of course there are exceptions to some of these rules, as we have seen with stubby areas and NSSAs. For example, in an MPLS VRF or Layer 3 VPN that is being redistributed from MP-BGP into OSPF on a Provider Edge router, the PE router does not advertise the prefix via a Type-5 LSA but in fact advertises it as a Type-3, even though technically BGP and OSPF are differing ASs. But this makes sense: from a customer point of view, the prefixes the customer advertises into the VRF are not from differing ASs but are all part of the customer's own AS. This way each of the customer's site routers actually receives Type-3 LSAs, and everything looks to be part of the same large network with no ISP interference.
I hope this provides you with some insight into the various LSA types that are used in multiarea OSPF domains. OSPF is a very complex routing protocol with many 'buttons and knobs', and as the title suggests I tried to be brief and hit the highlights of each LSA (the devil is in the details). I know I did not touch on Type-6 Group Membership LSAs, but I have never seen them used in production. Small OSPF networks may only have one backbone area and have no convergence problems; larger OSPF networks may have many areas and still work to reduce their LSDB. How to control the propagation of LSAs is a key factor in faster OSPF convergence times. In conclusion, OSPF is my favorite IGP. It has great convergence when designed and tuned correctly, and scales very well thanks to the controls that are built into the LSAs. In a later post I will describe how to summarize and filter Type-3 and Type-5 LSAs from being advertised to different areas.
First a bit about LSA's and OSPF. Link State Advertisements or LSAs are generated for every router, for every transient network (both broadcast and non-broadcast multiaccess networks...ethernet, frame-relay, etc), for every prefix from another area, and every prefix that redistributed into an OSFP network from an external routing source. These LSAs are flooded within an area, and some into other areas as we will see. All this information is shared between OSPF speaking routers. It is gathered and placed into the OSPF database called the Link State Database or LSDB on each router. Once the routers have all the information about every prefix everywhere within its AS, it then runs the Shortest Path First (SPF) algorithm against the LSDB to calculate the shortest or 'best' path to each network prefix. This path is determined by the lowest cost calculated based on a metric that drived from the bandwidths of the links and the costs advertised by all other routers. Each router calculates the best path to each prefix from its own perspective, essentially building a tree that no brach crosses any other branch, thus creating a loop free topology. The problem is that the larger the database or topology is, the more processing and time it takes to calculate the shortest path, hence slower convergence. Depending on the router's capabilities and the number of SFP calculations to process if the database is too large during convergence, I have seen routers never converge.
This is where OSPF's two-tier hierarchy of areas and LSAs comes into play. Area Border Routers (ABRs) are routers that connect two or more areas. Every area must be connected to the backbone area via an ABR; the one exception is an area reached through a virtual link. The ABRs do not forward Type-1 or Type-2 LSAs into adjacent areas. And this is good. Instead, the ABRs take the prefix information learned from the connected areas and flood it as Type-3 (Net Summary) LSAs into each adjacent area. Rather than calculating the SPF to the advertising router within another area, all other routers in the area use the cost the ABR provides plus their own cost to reach the ABR (not always though, see stub areas below). Now it would not make sense to have every subnet connected to every router in an area advertised as a Type-3 LSA into the adjacent areas. This would do nothing to reduce the number of LSAs, so at the ABRs we should summarize the prefixes contained within that area. This summarization can effectively reduce the Type-3 LSA advertisements into adjacent areas to one Type-3 LSA, if the IP numbering scheme is well designed.
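As a rough illustration of the summarization described above, the following is a minimal sketch of what this could look like on a Cisco IOS ABR. The router name ABR1, the area number, and the 10.1.0.0/16 block are assumptions made up for this example; they presume every subnet in area 1 was numbered out of that one block.

```
ABR1(config)#router ospf 1
! Assumption: all area 1 subnets fall inside 10.1.0.0/16
ABR1(config-router)#area 1 range 10.1.0.0 255.255.0.0
```

With this in place the ABR should advertise a single Type-3 LSA for 10.1.0.0/16 into the other areas instead of one per subnet, provided at least one subnet within the range exists in area 1.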
The fun begins when two autonomous systems (ASes) exchange routes with OSPF via redistribution. For this to occur we need an Autonomous System Boundary Router, or ASBR. The ASBR produces a Type-5 (AS-External) LSA for each external route redistributed into OSPF and floods it throughout the AS (again, see stub areas below). The ASBR can advertise each external route with either a Type-1 or a Type-2 External metric. A Type-1 External metric is the metric advertised by the ASBR plus the cost to reach the ASBR. In other words, a router that receives the LSA adds its own cost to reach the ASBR to the metric the ASBR advertised for the prefix. Type-2 External metrics are much simpler: the metric that the ASBR advertises does not change throughout the AS. Routers prefer Type-1 routes over Type-2 routes to the same prefix. Again, like Type-3 LSAs, Type-5 LSAs can be summarized to reduce LSA propagation throughout the AS.
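To put numbers on the difference: suppose an ASBR redistributes a prefix with a metric of 20 and a given router's cost to reach that ASBR is 30. As a Type-1 External route the router sees a metric of 50 (20 + 30); as a Type-2 External route it sees 20, the same as every other router in the AS. A minimal IOS sketch of choosing the metric type and summarizing the externals on the ASBR might look like the following. The router name ASBR1, EIGRP process 100 as the external source, and the 172.16.0.0/16 block are all assumptions for illustration.

```
ASBR1(config)#router ospf 1
! Redistribute the external routes as Type-1 (IOS defaults to Type-2)
ASBR1(config-router)#redistribute eigrp 100 metric 20 metric-type 1 subnets
! Advertise one Type-5 LSA for the whole block instead of one per route
ASBR1(config-router)#summary-address 172.16.0.0 255.255.0.0
```

Note that external summarization is done on the ASBR with summary-address, while inter-area summarization is done on the ABR with the area range command.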
So what happens if the ASBR is in a different area than the backbone area? The problem is how a router in another area can calculate the path to the ASBR when all it receives is the Type-5 LSA. This is where a Type-4 (ASBR Summary) LSA helps. An ABR (remember, it connects two or more areas) helps out by advertising into its other areas the cost of getting from the ABR to the ASBR, so every router there can add its own cost to reach the ABR and work out the path to the ASBR.
There is one other LSA that needs to be touched on, and that is the Type-7 LSA or NSSA External. Some say NSSA is Cisco proprietary, but it is an OSPF extension described in RFC 1587 (later obsoleted by RFC 3101). An NSSA, or Not-So-Stubby Area, is an area that is confused. But to know why it is confused, we need to know what a stub area is. In a stub area, the ABR advertises no Type-5 LSAs into the area and instead advertises a default route as a Type-3 LSA. This again is to reduce the number of LSAs propagated into the stub area. An ABR may or may not send other Type-3 LSAs as well, depending on the configuration. A stub area that receives no Type-3 LSAs other than the default route is called a totally stubby area. An NSSA is a stub area (which doesn't get Type-5 LSAs, and may not get Type-3 LSAs) that is also connected to another AS whose prefixes are being redistributed into OSPF within the area; hence the would-be stub area is not so stubby any more. These prefixes are advertised within the NSSA via Type-7 LSAs. When the ABR re-advertises those prefixes into the backbone area, it translates the Type-7 LSAs into Type-5 LSAs.
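The three area types described above map onto a handful of IOS commands. This is only a sketch; the router names and area numbers are made up for the example.

```
! Stub area: every router in area 2, including the ABR
R3(config)#router ospf 1
R3(config-router)#area 2 stub
! Totally stubby area: add no-summary on the ABR only
ABR2(config)#router ospf 1
ABR2(config-router)#area 2 stub no-summary
! NSSA: every router in area 3, including the ABR and the ASBR
R5(config)#router ospf 1
R5(config-router)#area 3 nssa
```

Routers in the same area have to agree on the stub or NSSA setting, otherwise they will not form an adjacency.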
Of course, there are exceptions to some of these rules, as we have seen with stub areas and NSSAs. For example, when an MPLS VRF or Layer 3 VPN is redistributed from MP-BGP into OSPF on a Provider Edge (PE) router, the PE router does not advertise the prefix via a Type-5 LSA but in fact advertises it as a Type-3, even though technically BGP and OSPF are different ASes. But this makes sense: from the customer's point of view, the prefixes the customer advertises into the VRF are not from different ASes but are all part of the customer's own AS. This way each of the customer's site routers actually receives Type-3 LSAs, and everything looks like part of the same large network with no ISP interference.
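If you want to see each of these LSA types on a live router, IOS can filter the LSDB by type. The commands below are standard IOS show commands; R1 stands in for whichever router you are logged into, and no output is shown because it depends entirely on your topology.

```
! Type-1 Router LSAs
R1#show ip ospf database router
! Type-2 Network LSAs
R1#show ip ospf database network
! Type-3 Net Summary LSAs
R1#show ip ospf database summary
! Type-4 ASBR Summary LSAs
R1#show ip ospf database asbr-summary
! Type-5 AS-External LSAs
R1#show ip ospf database external
! Type-7 NSSA External LSAs
R1#show ip ospf database nssa-external
```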
I hope this provides you with some insight into the various LSA types that are used in multiarea OSPF domains. OSPF is a very complex routing protocol with many 'buttons and knobs', and as the title suggested I tried to be brief and hit on the highlights of each LSA (the devil is in the details). I know I did not touch on Type-6 Group Membership LSAs, but I have never seen them used in production. Small OSPF networks may have only one backbone area and no convergence problems; larger OSPF networks may have many areas and still need work to reduce their LSDBs. Controlling the propagation of LSAs is a key factor in faster OSPF convergence times. In conclusion, OSPF is my favorite IGP; it has great convergence when designed and tuned correctly, and it scales very well thanks to the controls that are built into the LSAs. In a later post I will describe how to summarize and filter Type-3 and Type-5 LSAs from being advertised to different areas.
Wednesday, May 16, 2012
Advantages of Loopback Interfaces on Routers
It is very easy to create and use loopback interfaces on routers and they can provide many advantages to the network engineers who utilize them.
First, let me define what a loopback interface is. A loopback is a logical, virtual interface created on a router that emulates a real interface. Once it is assigned an IP address and that IP or subnet is advertised on the network, one has an always-up interface that is reachable as long as a route to that IP is available in the IP routing table.
So here is an example configuration of a loopback interface and address assignment on a Cisco Router running IOS called R1:
R1(config)#interface loopback 0
R1(config-if)#ip address 10.10.10.10 255.255.255.255
R1(config-if)#end
So what advantage does having this loopback address give you?
Routing protocols such as OSPF or BGP can utilize the loopback address as the Router ID or RID. A network engineer can assign RIDs that are easily identifiable. These RIDs are advertised to the router's peers or neighbors when establishing adjacency.
(Note: RIDs do not have to be a loopback address. They can be assigned automatically by the protocol from an interface IP address, or you can specify a 32-bit value under the protocol, but there are advantages to making it a loopback address.)
Continuing on from the example above, R1 is connected to R2 on interface FastEthernet0/0 via a /30 subnet. R2 will be configured with a loopback address of 20.20.20.20/32.
Although OSPF will automatically select a configured loopback address as the RID, to follow best practices we will specify the loopback as the RID explicitly and advertise it as a network reachable via that router. The following example is for OSPF:
R1(config)#router ospf 1
R1(config-router)#router-id 10.10.10.10
R1(config-router)#network 10.10.10.10 0.0.0.0 area 0
When R1 and R2 establish a neighbor relationship via OSPF, the 10.10.10.10/32 network will be reachable, as it is now advertised by R1. But let's take a look at the OSPF neighbor relationship on R2:
R2#show ip ospf neighbor
Neighbor ID     Pri   State           Dead Time   Address         Interface
10.10.10.10       1   FULL/DR         00:00:37    10.1.1.1        FastEthernet0/0
The RID is the Neighbor ID. This is very advantageous to those who maintain the network, because now we can design the loopback addressing scheme to let us quickly identify which routers are peered. And now from R2 the loopback address of 10.10.10.10 is reachable.
R2#ping 10.10.10.10
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.10.10, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/26/40 ms
This is an easy way to test whether your router is reachable even if several of its interfaces are down. As long as the router has IP connectivity and is able to advertise its routes to the rest of the network, it will be reachable via the loopback address. No guessing which interfaces are up and which ones are down.
In the following BGP example, the iBGP neighbor peering is specified via the neighbor's loopback address. (Note: To implement this for eBGP, eBGP multihop would need to be configured, because eBGP sessions use a TTL of 1 by default, whereas iBGP sessions use a TTL of 255.) BGP's neighbor table looks like this between iBGP peers:
R2#sh ip bgp summary
BGP router identifier 20.20.20.20, local AS number 65000
BGP table version is 1, main routing table version 1
Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.10.10.10     4 65000       4       4        1    0    0 00:00:34        0
Because BGP peerings are established over TCP port 179, these routers can have many router hops between them and do not have to be directly connected. By utilizing the 'neighbor update-source' command, you can specify that the TCP connection with the peer router be established using the address on the router's local loopback interface, thus relying on the router's IGP to route to the destination router you are trying to peer with.
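A minimal sketch of the iBGP peering described above, using the addresses and AS 65000 from the example. It assumes R2's loopback is Loopback0 (the interface number for R2 was not stated earlier) and that the IGP already carries both loopback /32s.

```
R2(config)#router bgp 65000
R2(config-router)#neighbor 10.10.10.10 remote-as 65000
! Source the TCP session from R2's own loopback (20.20.20.20)
R2(config-router)#neighbor 10.10.10.10 update-source Loopback0
```

R1 would need the mirror image, peering with 20.20.20.20 and sourcing the session from its own Loopback0. If only one side sources from its loopback, the other side sees the session arrive from an address it has no neighbor statement for and the peering will not come up.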
Loopback addresses have other advantages thanks to their 'always up' nature. They can be used for DNS entries by associating the loopback IP address with an assigned hostname, thereby making the router always reachable via its DNS name as long as the router has IP connectivity. TACACS+ or RADIUS can use the loopback address as the source for AAA functions, reducing the administrative overhead of adding every IP address of the router to the AAA server to ensure functionality should an interface fail. NMS products can easily add routers via their loopback addresses, which eliminates the guesswork as to which IP address the router should be added with. Point-to-point serial interfaces can utilize the 'ip unnumbered' command to share the IP of the loopback across multiple serial interfaces.
Loopback addressing becomes even more important in large fully meshed or route-reflector BGP environments, where many routers have multiple peerings with many other BGP-speaking routers and several IGPs may be running in the background. When adding and troubleshooting other services alongside MP-BGP, such as LDP for MPLS or MPLS L2 pseudowire cross-connects, loopback addressing becomes extremely important for easily identifying the peer routers that deliver those services in large ISP environments.
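A few of these uses sketched in IOS. The interface name Serial0/0 is hypothetical, and the exact AAA commands vary somewhat between IOS releases, so treat this as a starting point rather than a tested configuration.

```
! Source TACACS+ and RADIUS packets from the loopback
R1(config)#ip tacacs source-interface Loopback0
R1(config)#ip radius source-interface Loopback0
! Borrow the loopback address on a point-to-point serial link
R1(config)#interface Serial0/0
R1(config-if)#ip unnumbered Loopback0
```

The AAA server then only needs a single client entry for the router, 10.10.10.10 in the running example, regardless of which physical interface the packets actually leave from.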
I hope this gives you some ideas about the advantages of loopback interfaces and ways to implement them in your environment. I am sure there are many other useful ways to utilize loopback interfaces. Please feel free to provide any additional benefits that you have found in your networks.