Jeff Kronlage's CCIE Study Blog: OER/PfR Configuration [Part 2 of 2]

We'll be picking up right where we left off in part 1, transitioning our topology from dynamips 7200s to dynamips 3725s. The 7200s had too many issues running BGP under dynamips, and also had many bugs with OER's interoperability with BGP. We'll also be using OER v2.1 (12.4(15)T) instead of OER v2.2 (12.4(24)T) for this section.

Our new topology looks very similar to the old, but note that the interface numbers have all changed:

I'm removing all the default routing and swapping to specific prefixes with BGP. For simplicity I'm also going to redistribute BGP into OSPF. Obviously this isn't a real-world possibility, but BGP/IGP interoperability is beyond the scope of this document.

Here are the relevant parts of the config. Full reachability has been established; s0/1 is in shutdown on R2, and we're going to prefer R2 as the exit point from the OSPF domain towards R4.

R2:
router bgp 100
no synchronization
bgp log-neighbor-changes
network 2.2.2.2 mask 255.255.255.255
network 10.0.24.0 mask 255.255.255.0
network 10.0.242.0 mask 255.255.255.0
network 192.168.0.0
redistribute ospf 1
neighbor 3.3.3.3 remote-as 100
neighbor 3.3.3.3 update-source Loopback0
neighbor 10.0.24.4 remote-as 200
no auto-summary

router ospf 1
redistribute bgp 100 metric 100 subnets

R3:
router bgp 100
no synchronization
bgp log-neighbor-changes
network 3.3.3.3 mask 255.255.255.255
network 10.0.34.0 mask 255.255.255.0
network 192.168.0.0
redistribute ospf 1 metric 500
neighbor 2.2.2.2 remote-as 100
neighbor 2.2.2.2 update-source Loopback0
neighbor 10.0.34.4 remote-as 200
no auto-summary

router ospf 1
redistribute bgp 100 metric 100 subnets

R4:
router bgp 200
no synchronization
bgp log-neighbor-changes
network 4.4.4.4 mask 255.255.255.255
network 9.9.9.9 mask 255.255.255.255
network 10.0.24.0 mask 255.255.255.0
network 10.0.34.0 mask 255.255.255.0
neighbor 10.0.24.2 remote-as 100
neighbor 10.0.34.3 remote-as 100
no auto-summary

Let's look at a simple feature first.

R1:
oer master
learn
aggregation-type bgp

We learn our traffic flows again, which this time are learned as /32s since that's how they appear in the BGP table:

An interesting side effect of using aggregation-type bgp is that instead of introducing static routes, OER will adjust BGP (again, not in the running config) to prefer a certain interface.

In the scenario below, it chose to shuffle traffic towards 9.9.9.9/32 on to R3:

Note the protocol is set to "BGP" on the lower-right.

Let's take a look at the BGP table on R2:

Local pref 5000 on the path we're shuffling the traffic to. Elegant, I like it!

We can also see similar results with show oer border routes bgp:

If you want to set what the local-pref is, you can use:

mode route metric bgp local-pref [value]
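As a minimal sketch, assuming R1 is the MC as in the rest of this lab, the command sits under the master controller configuration. The value 6000 is an arbitrary illustration and is not taken from the lab above:

```
! R1 (MC) - 6000 is a hypothetical value chosen for illustration
oer master
 mode route control
 mode route metric bgp local-pref 6000
```

Whatever value you pick should be higher than any local preference you already assign by hand, otherwise OER's injected path won't win best-path selection.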

This is all just fantastic for outbound traffic - what if we want to influence inbound?

oer master
logging
!
border 2.2.2.2 key-chain MY-KEY-CHAIN
interface Serial0/1 external
   maximum utilization receive percentage 50
interface Serial0/0 external
   maximum utilization receive percentage 50
!
border 3.3.3.3 key-chain MY-KEY-CHAIN
interface Serial0/0 external
   maximum utilization receive percentage 50

learn

inside bgp

I'm going to quote Cisco's description on inside bgp:

"The inside prefix learning that happens for inbound optimization is by looking at the BGP tables on the border routers. It is not done through NetFlow top talkers. We just ask BGP on the BRs for all networks that were advertised to eBGP peers over PfR external interfaces. Once we receive all the prefixes from the BRs, we just enter them into the monitoring database and start optimizing them (note that we ask for prefixes that originated in our AS only, so transit prefixes that were advertised by our BRs to eBGP peers, if any, will not be learnt)."

http://docwiki.cisco.com/w/index.php?title=PfR:Solutions:InternetInboundLoadBalancing&oldid=43868

In a nutshell, when you use inside bgp, OER will attempt to load balance inbound traffic by prepending its local AS on links it wants to depref, hoping that the load will shift to the other link.

Here it is in action:

In this case, OER has shuffled the echo-replies off of R2 towards R3 by prepending its AS one time. OER can do this up to six times in an attempt to move the traffic to the other link. In our case, one is enough.

We can also have OER send a community, triggering whatever depref options your ISP already has set up. A very common way to accomplish this with BGP is to send a community of AS:LP. For example, 200:90 would tell AS 200 to set this prefix to local preference 90:

R2:
ip bgp-community new-format

router bgp 100
neighbor 10.0.24.4 send-community

R4:
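ip bgp-community new-format
! community-list 1 is referenced by the route-map but was not shown; this definition is an assumption
ip community-list 1 permit 200:90

route-map eval_bgp_comm permit 10
match community 1
set local-preference 90
route-map eval_bgp_comm permit 20

router bgp 200
neighbor 10.0.24.2 route-map eval_bgp_comm in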

R1:
oer master
border 2.2.2.2 key-chain MY-KEY-CHAIN
interface Serial0/0 external
maximum utilization receive percentage 50
downgrade bgp community 200:90

And it works a dream:

Local Pref 90, other link gets preferred without any prepending, and the community is shown at the bottom.

That's pretty much the end of "learned" traffic flows. Moving forward, we're going to be looking at how to statically define traffic classes.

Static traffic classes are defined with OER maps. The syntax is very much like a route-map:

oer-map MYMAP 10
match traffic-class access-list chargen
set backoff 180 180 90
set delay threshold 20
set active-probe echo 9.9.9.9
ip access-list extended chargen
permit tcp host 5.5.5.5 host 9.9.9.9 eq chargen

Potential Pitfall: oer-maps have an undocumented requirement that can be difficult to figure out.
First we need to look at some background information regarding mode monitor.

mode monitor [active | both | fast | passive]

This is a global OER command, used to select how the traffic classes are monitored.

active = generate IP SLA probes to test for delay. By default, only ICMP packets are generated. Only active interfaces send probes, unless an out-of-policy event occurs, in which case alternative paths are probed.
both = Both active & passive (the default)
fast = similar to active, however, all interfaces, including inactive links, are probed. Provides faster failover in the event of out-of-policy. Fast is regularly used with a low-delay probe setting in an oer-map, such as set probe frequency 2.
passive = no probes, use passive monitoring (measure time from TCP SYN to TCP SYN/ACK) only.

This can be set globally or in an oer-map. The oer-map's setting overrides global. Here's where the pitfall comes in - if you're using any mode other than passive, you must manually define your active probe for measuring delay. If you don't, you get an uncontrolled traffic class (OER ignores it). This can be somewhat seen in debug oer master prefix detail, but it's not an easy-to-understand debug.

Again, the solution to this problem is to use one (but not both) of the following commands in the oer-map:
set mode monitor passive
set active-probe echo IP.ADD.RE.SS ! the ip address should be reachable via the external interface
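To make the two alternatives concrete, here is a sketch of each, reusing the MYMAP and chargen names from above. They are shown as two separate versions of the same section, not as one config to paste in whole. The second version also adds the fast mode and probe frequency mentioned earlier, purely as an illustration:

```
! Version A: passive only, no probe needed
oer-map MYMAP 10
 match traffic-class access-list chargen
 set mode monitor passive

! Version B: probing in use, so the probe target is defined by hand
oer-map MYMAP 10
 match traffic-class access-list chargen
 set mode monitor fast
 set active-probe echo 9.9.9.9
 set probe frequency 2
```

Version A works for this traffic class because chargen here is TCP, which is what passive delay measurement relies on.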

Oer-maps can match access lists, prefix lists, or even learn lists.
Only one item can be matched per oer-map "section". Only one oer-map can be applied at a time, but many "sections" can be defined.
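A sketch of one oer-map with two sections, one matching an access list and one matching a prefix list. The prefix list name FOURS and the choice of 4.4.4.4/32 are made up for illustration; only MYMAP and chargen come from the example above:

```
! FOURS is a hypothetical prefix list, not part of the original lab
ip prefix-list FOURS seq 5 permit 4.4.4.4/32
!
oer-map MYMAP 10
 match traffic-class access-list chargen
 set active-probe echo 9.9.9.9
!
oer-map MYMAP 20
 match ip address prefix-list FOURS
 set mode monitor passive
```

Each section carries exactly one match clause, and both belong to the single map that gets applied under oer master.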

You can set a large number of things, most of which we'll cover later.

Applying the map is simple:
oer master
policy-rules MYMAP

Every "section" of an oer-map overrides default policy for its particular traffic class.
This can be viewed with show oer master policy. The "Default Policy Settings" are what's set globally, and the "oer-map MYMAP 10" shows the entire policy, with * next to what we've changed for this particular traffic class.

R1MC#show oer master policy
Default Policy Settings:
backoff 90 90 90
delay relative 50
holddown 90
periodic 0
probe frequency 56
mode route control
mode monitor both
mode select-exit good
loss relative 10
jitter threshold 20
mos threshold 3.60 percent 30
unreachable relative 50
resolve delay priority 1 variance 10
resolve utilization priority 2 variance 10
oer-map MYMAP 10
sequence no. 8444249301975040, provider id 1, provider priority 30
host priority 0, policy priority 10, Session id 0
match ip access-lists: chargen
*backoff 180 180 90
*delay threshold 40
holddown 90
periodic 0
probe frequency 56
mode route control
mode monitor both
mode select-exit good
loss relative 10
jitter threshold 20
mos threshold 3.60 percent 30
unreachable relative 50
next-hop not set
forwarding interface not set
resolve delay priority 1 variance 10
resolve utilization priority 2 variance 10
Forced Assigned Target List:
active-probe echo 4.4.4.4 target-port 0
* Overrides Default Policy Setting

Oer-maps and policy routing go hand in hand, so I'm going to dive right into PBR as well.
So far we've looked at static routing and bgp routing with OER. PBR will allow us to match a particular type of traffic, as opposed to routing purely on source/destination.

Before I saw OER's PBR in use for the first time, I found it confusing. I'd mentioned earlier that BRs needed to have a layer 2 link between each other, and PBR is the reason why. The trick is that PBR is accomplished in unison on every BR simultaneously. If the MC determines that R3 is a better exit for a particular traffic class than R2 is, it sets a route-map in R2 pushing the specific traffic class over to R3. Let's say, for example, that we have requirements for DSCP EF traffic that are better met on R3's WAN link than on R2's WAN link. However, the rest of the traffic headed towards that same destination is drop insensitive and we don't want to crowd R3's WAN link with, say, FTP traffic. Let's say that traffic was headed towards 9.9.9.9 on R4. Our flow would look something like this:

The red links are the bulk FTP traffic headed towards R4. The blue links are the EF traffic. R2 still receives all the traffic headed towards R4 via any method (default route, injected static, etc), but it pushes the EF traffic back over the Ethernet link to R3 so that R3's policy route can then send the traffic to R4.
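As a rough sketch of how the EF example could be expressed, assuming a hypothetical access list named EF-TO-NINES and a new section number 30 (neither is in the lab config), the traffic class would match on DSCP as well as destination:

```
! EF-TO-NINES and section 30 are hypothetical names for illustration
ip access-list extended EF-TO-NINES
 permit ip any host 9.9.9.9 dscp ef
!
oer-map MYMAP 30
 match traffic-class access-list EF-TO-NINES
 set delay threshold 20
 set active-probe echo 9.9.9.9
```

Because only the EF packets match this traffic class, the FTP traffic to the same destination keeps following the normal routing table out of R2.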

Now let's look at the outcome of the oer-map we wrote above. To recap, this is what it looks like:

ip access-list extended chargen
permit tcp host 5.5.5.5 host 9.9.9.9 eq chargen
oer-map MYMAP 10
match traffic-class access-list chargen
set backoff 180 180 90
set delay threshold 20
set active-probe echo 9.9.9.9

We've disabled dynamic learning, so this is the only traffic class being monitored by OER.
After a decision is made, this is the outcome:

Notice "PBR" in the protocol field on the right.
There's more we can do to see what's going on here.

R2:
show route-map dynamic

When R2 receives traffic matching the dynamically built access-list "oer#1", push it to R3.

R3:
show route-map dynamic

And R3 sends it over its WAN link to R4.

Potential Pitfall: Matching broad IP ranges (I will use 0.0.0.0/0 as an example) for PBR requires a matching static route. If you've labbed OER for a while, you'll notice that during learning, or even after an oer-map is applied, OER dynamically finds the current exit interface for the traffic class, and only afterwards determines if it's in-policy or out-of-policy.

The catch is that OER isn't going to route traffic towards a destination that it isn't sure can handle the traffic. This ties back into the concept of parent routes from part 1. Don't send traffic somewhere that the default routing table isn't willing to route it to.

The bottom line is you need a static route to 0.0.0.0/0 in order to accommodate matching 0.0.0.0/0. I've tried using default-information originate with R4 to push a default into R2 & R3, and while the BGP portion of this works fine, this does not work with OER.

Here's a sample of the issue & resolution:

ip access-list extended chargen
permit tcp any any eq chargen ! this is effectively matching a default (any any)

R2 & R3:

ip route 0.0.0.0 0.0.0.0 Serial0/0

If you left out the static route, you'd eventually end up with this (awful / non-descriptive) error message:

%OER_MC-5-NOTICE: Uncontrol Appl Prefix 0.0.0.0/0 defa 6 [1, 65535] [19, 19], Couldn't control

Wow, that's helpful isn't it?

Next we're going to take a look at link groups. From Cisco's wording, link-groups provide "the ability to define a group of exit links as a preferred set of links, or a fallback set of links for Performance Routing (PfR) to use when optimizing traffic classes specified in a PfR policy" (http://www.cisco.com/en/US/docs/ios-xml/ios/pfr/configuration/15-2mt/pfr-link-group.html)

Link groups are used together with OER maps to say which link you prefer for which type of traffic. This could be useful in a scenario where you had a high-performance/low-latency set of WAN links, and a cheaper set of WAN links. The cheaper links could be used for torrents, FTP, etc, while the high-performance links could be used for voice, video, etc. Each set of links could act as failover for the other set. The configuration can grow large but is actually rather simple.

In our scenario, we're going to pretend that R2's Serial0/0 is the performance link, and that R3's Serial0/0 is our "good value" budget link.

oer master
border 2.2.2.2
interface Serial0/0 external
link-group PERFORMANCE
border 3.3.3.3
interface Serial0/0 external
link-group BUDGET

oer-map MYMAP 10
match traffic-class access-list chargen
set backoff 180 180 90
set delay threshold 20
set active-probe echo 9.9.9.9
set link-group PERFORMANCE fallback BUDGET

oer-map MYMAP 20
match traffic-class access-list icmp
set active-probe echo 5.5.5.5
set link-group BUDGET fallback PERFORMANCE

Eventually, the ICMP flow will get shuffled off to the "budget" R3 serial link:

When using link-groups, don't forget to disable any type of balancing by range. We didn't cover range in this document, but it's the simple concept that if one link is seriously out of balance with another, we should shuffle some of that traffic over. That idea is inherently incompatible with link groups.
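A minimal sketch of turning range balancing off on the MC. This assumes the range feature is the one controlled by max-range-utilization, which this document does not otherwise cover:

```
! R1 (MC) - assumes range balancing maps to max-range-utilization
oer master
 no max-range-utilization
```

If a resolve range line is present in your policy, it would need the same treatment for the link groups to behave predictably.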

We're almost done - but first, the "miscellaneous" section!

I started this blog with a list of topics I wanted to cover, and some of them just didn't fit, or I chose not to fit, into certain areas.

match oer learn [delay | throughput]

An oer-map can match a learned traffic class, and can therefore apply policy to prefixes dynamically learned. Use this command under an oer-map to accomplish this.
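For example, a section like the following would apply its own policy to whatever prefixes were learned by throughput. The section number 30 and the threshold of 100 are arbitrary values picked for illustration:

```
! Section number and threshold are hypothetical
oer-map MYMAP 30
 match oer learn throughput
 set delay threshold 100
```

Learning itself still has to be enabled under oer master with learn and throughput for this section to have anything to match.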

mode route metric static tag [value]

This global OER command will allow you to set a tag on the static routes that OER implements. Very helpful for selective static route redistribution into the IGP.
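A sketch of how the tag might be used for selective redistribution. The tag value 777 and the route-map name OER-STATIC are invented for this example, and the redistribution is assumed to happen on the BRs into the lab's OSPF process 1:

```
! R1 (MC) - 777 is a hypothetical tag value
oer master
 mode route metric static tag 777

! R2 & R3 (BRs) - OER-STATIC is a hypothetical route-map name
route-map OER-STATIC permit 10
 match tag 777
!
router ospf 1
 redistribute static subnets route-map OER-STATIC
```

Only statics carrying the OER tag get into OSPF this way, so any hand-configured static routes on the BRs stay local.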

mode select-exit [good | best]

This is a global OER command, which can also be used in an oer-map with set mode select-exit [good | best]. It controls whether a traffic flow stays on a heavily utilized link just because it's in-policy, even when a viable, empty link is available. mode select-exit good settles for any exit that is in-policy; mode select-exit best picks the best-performing exit, and would shuffle the traffic over to the underutilized link.

periodic [value]

This global OER command forces re-evaluation of traffic classes on a timer basis. It defaults to disabled. When using mode select-exit best, after a traffic class is evaluated and assigned to an exit the first time, if it remains in-policy, it's never re-evaluated. periodic forces re-evaluation every [value] seconds.
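The two commands are typically paired. A minimal sketch, with 300 seconds chosen arbitrarily for illustration:

```
! R1 (MC) - 300 is a hypothetical interval
oer master
 mode select-exit best
 periodic 300
```

With this in place, an in-policy traffic class gets another look every five minutes instead of staying wherever it first landed.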

BGP & no-export community

OER is intended as a single-AS solution. To enforce this, BGP prefixes injected into the BGP table have the no-export community set. In order to propagate this, be sure to use send-community with your BGP neighbors.
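In this lab's topology that would mean adding send-community on the iBGP session between the two BRs. This is a sketch based on the neighbor statements shown earlier:

```
! R2
router bgp 100
 neighbor 3.3.3.3 send-community

! R3
router bgp 100
 neighbor 2.2.2.2 send-community
```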

We've spent a lot of time looking at throughput and delay, but OER can also make decisions based on loss, jitter, and the MOS score. I've not labbed any of these, but it's important to understand that while loss is discovered passively, jitter and MOS score require an oer-map and an ip sla responder on the far side. The syntax for the oer-map for either jitter or MOS is:

oer-map MYMAP 20

set active-probe jitter 9.9.9.9 target-port 1025 codec g711ulaw

and on 9.9.9.9, you would enable:

ip sla responder

in global config.

Just to reiterate that, MOS uses jitter as its probe type.
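Since I haven't labbed these, treat the following as an untested sketch only. It pairs the jitter probe with explicit jitter and MOS thresholds, using the default values visible in the show oer master policy output earlier. The access list name VOICE and its UDP port range are hypothetical:

```
! VOICE and the port range are hypothetical; thresholds mirror the defaults shown earlier
ip access-list extended VOICE
 permit udp any host 9.9.9.9 range 16384 32767
!
oer-map MYMAP 20
 match traffic-class access-list VOICE
 set active-probe jitter 9.9.9.9 target-port 1025 codec g711ulaw
 set jitter threshold 20
 set mos threshold 3.60 percent 30
```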

Prioritizing what to resolve.

Under global OER or an oer-map, you can prioritize what resolution methods are most important. This example demonstrates:

resolve delay priority 1 variance 10

resolve loss priority 2 variance 30

resolve utilization priority 3 variance 20
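Inside an oer-map the same idea is expressed with set resolve. A sketch reusing MYMAP 10 from earlier, with priorities and variances picked only for illustration:

```
! Priorities and variances are illustrative values
oer-map MYMAP 10
 set resolve delay priority 1 variance 10
 set resolve utilization priority 2 variance 20
```

The variance is the percentage within which two exits are treated as equivalent for that criterion, at which point the next priority acts as the tiebreaker.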

A particularly handy show command under any circumstance is show oer master border detail. This command will output what the MC is using as the interface bandwidth of each interface of the BRs, as well as showing current utilization.

That wraps us up! Thanks for reading, I hope you learned as much reading as I did putting this together.

Jeff Kronlage


Saturday, October 20, 2012
