Thursday, December 27, 2012

A different perspective on CIR, PIR, Tc, Bc & Be

The best topics in my CCIE studies have been the ones where I've experienced a true paradigm shift in my thinking. With this topic, I've had three, and one of those came months after the first, when I thought I was most of the way done typing the first revision of this document. I will do my best to convey all three here and now, and perhaps save someone else the same long journey.

But first, an introduction....

Both policing and shaping are tools for dealing with a service provider handing off a higher-speed physical interface with the understanding that the customer will only use a fraction of that speed. From the SP's perspective, this is an ingress tool to keep the core and egress links from becoming swamped. For example, an SP might hand out Gig-E interfaces with the understanding that each customer will only use 200 Mbps of them. If every customer actually used the full 1 Gbps, the edge routers, core, or egress routers could easily run out of bandwidth.

Policing is the tool used on the SP side to enforce the traffic policy, and shaping is the tool used at the enterprise edge, facing the policer, to conform to that policy.
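To make that concrete, here's a minimal MQC sketch of the two sides, assuming a hypothetical 200 Mbps contract on a Gig-E handoff; the policy names, interface numbers, and default burst values are placeholders, not a recommendation:

! Enterprise edge: shape outbound traffic down to the contracted rate
policy-map SHAPE-TO-CONTRACT
 class class-default
  shape average 200000000
!
interface GigabitEthernet0/0
 service-policy output SHAPE-TO-CONTRACT
!
! SP edge: police inbound traffic to the same rate
policy-map POLICE-CONTRACT
 class class-default
  police cir 200000000 conform-action transmit exceed-action drop
!
interface GigabitEthernet0/1
 service-policy input POLICE-CONTRACT

With the shaper and policer agreeing on the CIR (and on compatible burst sizes, which is where Bc and Be come in), traffic that leaves the enterprise conforming should also conform at the SP's policer.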

By the way, I've had an argument about this with a couple people in the past. The CIR CAN equal the line rate of the interface. Marketing nonsense from some ISPs may make you believe otherwise.  In a scenario where CIR equals line rate, you don't need to shape or police, and none of this matters!

Before I delve into how CIR, PIR, Bc, Be, Tc, etc. work, I will share with you the first two secrets to understanding all of this.

Saturday, December 22, 2012

Serial Link Compression

At a high-level, there are basically two compression techniques supported on serial links on Cisco routers:

Stacker - Based on the Lempel-Ziv (LZ) algorithm, this method provides the best compression, particularly in the case of varying data, but at the cost of more CPU usage.
Predictor - Attempts to predict the next bits to be transmitted based on history.  Does not compress as well as Stacker, but is lighter on CPU usage.

These two high-level techniques show up in the following payload-compression options:
Stacker - Obvious; uses LZ algorithm mentioned above, supported on both PPP and HDLC.  This is the only compression method supported on HDLC.
Predictor - Again, obvious; only supported on PPP.
MPPC - A Microsoft algorithm (Microsoft Point to Point Compression); implemented in Cisco devices for interoperability to Microsoft clients.  Cannot be used router-to-router, only router-to-Microsoft OS.  Uses LZ / Stacker compression.  I'm not going to talk about this one further because I have no convenient way to lab it.
LZS - I only know this exists because it's on the menu.  I had a terribly hard time finding details about this, either in my CCIE study guides, the Cisco documentation, or other blogs.  It's obviously LZ-based, but it doesn't seem anyone uses it!  If you have details I'm lacking, please reply to the post.  Also not going to talk about this one any further.
Frame Relay Packet-by-Packet - Cisco proprietary, uses the LZ algorithm, but with a per-packet dictionary. Pre-FRF9.
Frame Relay Data Stream - Cisco proprietary, uses the LZ algorithm, but maintains the same dictionary across all packets. Pre-FRF9.
Frame Relay FRF9 - Industry standard; enabled per DLCI or per sub-interface. Uses the LZ algorithm. Requires IETF encapsulation.

So here's how we apply all these things.
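Before the details, a hedged sketch of where these commands sit in IOS; interface numbers, addresses, and the DLCI are placeholders, and compression has to be configured on both ends of the link:

! Stacker or Predictor on a PPP serial link (HDLC supports Stacker only)
interface Serial0/0
 encapsulation ppp
 compress stac
! ...or "compress predictor" instead of "compress stac"
!
! FRF9 on a Frame Relay point-to-point subinterface
interface Serial0/1
 encapsulation frame-relay ietf
!
interface Serial0/1.102 point-to-point
 ip address 10.1.12.1 255.255.255.252
 frame-relay payload-compression frf9 stac
 frame-relay interface-dlci 102
! ...packet-by-packet and data-stream use the same payload-compression command with different keywords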

Sunday, December 16, 2012

The nitty-gritty of WRED

WRED is such a simple topic at first glance.  It's effective congestion avoidance, and it can be enabled with one command!  If you dig a little deeper, however, there can be quite a lot there.  I've been avoiding doing the WRED deep-dive for a while now, but it finally caught up with me. 

I assume most anyone reading this already understands what WRED does at a high level, so I will only touch on the general idea.  Any interface whose transmit (egress) buffer fills up goes into tail drop, a state where all newly arriving packets are dropped.  This is bad, because if TCP sessions are running through that interface, the packet loss will cause every TCP session caught in the tail drop to shrink its window and go into slow start.  This behavior is called global synchronization.  It produces a saw-tooth effect on traffic graphs, as all the TCP flows slow down at once, gradually speed back up, experience congestion and packet loss at the same time, and then repeat the slowdown, indefinitely.

RED (random early detection) solves this problem by randomly dropping packets before the transmit buffer fills up.  The idea is that only some TCP flows go into slow start instead of all of them.  In theory, tail drop is avoided, and global synchronization along with it.  It's worth noting that RED/WRED does absolutely nothing for UDP flows: UDP has no transport-layer ACK, so there's no way to know at the UDP level whether packets were lost, and therefore no way for UDP to implement slow start at the transport layer.  If an interface is carrying nothing but UDP traffic, there's no benefit to running RED at all.

Cisco only implements WRED, not plain RED.  WRED is Weighted Random Early Detection; it takes IP Precedence (the default) or DSCP values into account, dropping the "less important" flows more aggressively.

WRED can be implemented in two fashions (both sketched below):
1) Directly on the interface, with the legacy random-detect interface command
2) As part of a CBWFQ policy (MQC)
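Here's a minimal sketch of both, with placeholder interface names and all thresholds left at their defaults:

! 1) Directly on the interface (legacy form; precedence-based by default)
interface Serial0/0
 random-detect
!
! 2) Inside a CBWFQ policy, here switched to DSCP-based WRED
policy-map WRED-POLICY
 class class-default
  fair-queue
  random-detect dscp-based
!
interface Serial0/1
 service-policy output WRED-POLICY

Either way, the minimum threshold, maximum threshold, and mark probability can be tuned per precedence or per DSCP, which is where the nitty-gritty comes in.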

Friday, December 7, 2012

BGP Cost Community, EIGRP SoO, and backdoor links

BGP Cost Community (in relation to EIGRP) and EIGRP Site-of-Origin, or SoO, are two related and somewhat overlapping topics.  The intent of Cost Community is to prevent suboptimal routing and routing loops between EIGRP sites (sometimes) separated by MPLS.  Site of Origin is more focused on loop prevention.
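As a teaser before the diagram, here's a hedged sketch of how SoO is commonly tagged on a PE's CE-facing interface; the VRF name, route-map name, and SoO value are hypothetical:

! Route-map that stamps the Site-of-Origin extended community
route-map SOO-SITE1 permit 10
 set extcommunity soo 65000:1
!
! Applied on the PE interface facing the CE at site 1
interface GigabitEthernet0/0
 ip vrf forwarding CUSTOMER-A
 ip vrf sitemap SOO-SITE1

Routes learned on that interface carry the SoO value, and routes whose SoO matches an interface's sitemap are not advertised back out that interface toward the same site.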

We'll be working off this diagram:

Sunday, December 2, 2012

OSPF PE: Downward bit, Super Area 0, Domain IDs, capability vrf-lite, sham links

This post will bite off quite a lot.  I wanted to write one post that encompassed the entirety of the interaction of using OSPF as a PE to CE routing protocol.

Let me begin by saying... what a disastrously bad idea doing this is.  BGP is the obvious PE to CE routing protocol.  I've never deployed OSPF as a PE to CE protocol in production, but I know someone who has, and he hated it too.  Even the service provider (AT&T) that offered the OSPF option won't let new customers opt for it any longer.  The only argument I've heard for using anything besides BGP - that actually made sense - is if you have a great many routers running basic IOS images that don't include BGP as a routing option.

The reason it's a disastrously bad idea is because it's too ambitious.  To me, it feels like the designers sat down with the concept of converting a large layer 2 frame-relay OSPF network natively to MPLS without having to rethink the OSPF design.  With all the band-aids available, you can keep your area design intact, even if it makes no sense whatsoever in an MPLS world.

In a nutshell, these are the "add-ons" we'll be looking at (a quick sketch of where each is configured follows the list):
The OSPF Down Bit - designed to prevent loops from forming in an OSPF area that's multihomed to the MPLS backbone.
Super Area 0 - The MPLS network will be treated as an "area 0" in and of itself.  This is in case your areas become disconnected from area 0 due to the migration to MPLS.  This way, each area will always be attached to area 0.  Can be disabled with "capability vrf-lite".
Sham Links - Creates a control-plane intra-area link over the Super Area 0.  Can be useful for traffic engineering.
Domain-ID (community) - Controls whether routes should be considered inter-area or external.  If not specified, the domain-id is populated from the OSPF process number by default.  Same Domain-ID = inter-area, different = external; a mismatch is assumed to mean the routes should be treated as coming from separate OSPF processes.
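For orientation before the diagram, here's a hedged sketch of where these knobs live on a PE's per-VRF OSPF process; the process number, VRF name, AS number, and addresses are placeholders, and this is nowhere near a complete PE config:

router ospf 10 vrf CUSTOMER-A
 ! Set the domain-id explicitly so both PEs agree (by default it's derived from the process number)
 domain-id 1.1.1.1
 ! Sham link between /32 endpoints that exist in the VRF on each PE
 area 0 sham-link 10.0.0.1 10.0.0.2
 redistribute bgp 65000 subnets
!
! On a CE doing vrf-lite (not a true PE), the PE loop-prevention checks can be turned off:
! router ospf 10 vrf CUSTOMER-A
!  capability vrf-lite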

This is the diagram we'll be referencing.