Saturday, September 6, 2014

OSPF LFA & Remote LFA

Continuing on the same track as my recent posts regarding EIGRP FRR and BGP PIC/Add-path, today I'm writing about OSPF LFA. OSPF FRR/LFA accomplishes the same concept as EIGRP FRR, but in a much more elegant and thorough fashion.

As I did in my EIGRP article, I'm going to reference back to the BGP PIC article, as that has a lengthy explanation of why fast reroute is important. If you don't understand the use case, please read that article first:

http://brbccie.blogspot.com/2014/08/bgp-pic-and-add-path.html

Again building off former articles, the EIGRP method of LFA is dead simple: take the feasible successor and pre-install it in the FIB for faster convergence.

http://brbccie.blogspot.com/2014/08/eigrp-enhancements.html

I genuinely like this approach, because it's very easy to understand. If you're savvy enough to engineer for feasible successors, you can literally just turn on this feature and it works.

OSPF takes this idea to a whole new level. Obviously, OSPF does not have a concept of feasible successors, but it does have a huge advantage: because, in the same area, the OSPF database is identical among all routers, OSPF can run the SPF algorithm with a neighboring router as root. The advantage of this is being able to find a loop-free alternate path in complex topologies that would have failed the feasible successor check in EIGRP. When we look at Remote LFA, we can even tunnel to distant routers to form loop-free paths, all calculated via the router running FRR.

Note - much like EIGRP, OSPF on IOS does not support per-link LFA, so we will only be examining per-prefix LFA.  IOS-XR supports both per-prefix and per-link.



All links have an IP address of 192.168.YY.X, where YY is the lower router number followed by the higher router number, and X is the router number (e.g. on the link facing R4, R1's IP address is 192.168.14.1). Each router has a loopback0 address of X.X.X.X, where X is the router number.
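If the addressing scheme is hard to keep in your head while reading the outputs, here is a minimal sketch of it. The function names are my own, and it assumes single-digit router numbers, as in this lab.

```python
def link_ip(router: int, neighbor: int) -> str:
    """Address of `router` on its link to `neighbor` (192.168.YY.X scheme)."""
    low, high = sorted((router, neighbor))
    # YY is the lower router number followed by the higher one.
    return f"192.168.{low}{high}.{router}"


def loopback_ip(router: int) -> str:
    """Loopback0 address X.X.X.X for router X."""
    return ".".join([str(router)] * 4)


print(link_ip(1, 4))   # R1's side of the R1-R4 link
print(link_ip(4, 1))   # R4's side of the same link
print(loopback_ip(5))  # R5's loopback0
```

This is why the next hop towards R4 shows up as 192.168.14.4 in the outputs below.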

Consider this diagram, with R1 attempting to reach R5 (5.5.5.5).

R1(config)#router ospf 1
R1(config-router)#fast-reroute per-prefix enable area 0 prefix-priority low

The primary path is obvious: R1 -> R2 -> R5
The backup path requires some thought...

If this were EIGRP, neither path would be valid for LFA. They'd both fail the feasibility condition:
R1->R3->R5 has an "advertised distance" of 10, which is greater than the "feasible distance" of 2. Likewise, R1->R4->R5 has an "advertised distance" of 10.

However, OSPF, being link state, can actually calculate the SPF from each neighbor's perspective (R2, R3, and R4). Cisco calls this process "reverse SPF" -- RSPF. I'm not going to make this a large lesson on link state protocols, but let's quickly look at what R1 would discover about its neighbors:

R2:
  This is already the primary path, so eliminate R2.
R3:
  When attempting to reach R5, R3 will route back through R1. This will loop. Eliminate R3.
R4:
  R4 reaches R5 via the link between R4 and R5.  Valid Backup Route.

I deliberately built the scenario this way to show how a higher-metric route could beat a lower metric for the backup route - of course, in our case, the lower metric would've looped.
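The neighbor-by-neighbor check above can be sketched in a few lines. This is not how IOS implements it, just an illustration of running SPF with each neighbor as root and applying the basic loop-free inequality from RFC 5286, D(N,D) < D(N,S) + D(S,D). The link costs are assumptions chosen to be consistent with the metrics quoted in this post (in particular, the R1-R3 cost of 1 is a guess), since the diagram isn't reproduced here.

```python
import heapq

# Assumed symmetric link costs (not taken from the diagram).
LINKS = {
    ("R1", "R2"): 1, ("R2", "R5"): 1,
    ("R1", "R3"): 1, ("R3", "R5"): 10,
    ("R1", "R4"): 15, ("R4", "R5"): 10,
}


def build_graph(links):
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def spf(graph, root):
    """Plain Dijkstra: shortest distance from root to every node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist


def loop_free_neighbors(graph, source, dest):
    """For each neighbor N of source: is D(N,dest) < D(N,source) + D(source,dest)?"""
    d_source = spf(graph, source)
    result = {}
    for nbr in graph[source]:
        d_nbr = spf(graph, nbr)  # SPF rooted at the neighbor
        result[nbr] = d_nbr[dest] < d_nbr[source] + d_source[dest]
    return result


GRAPH = build_graph(LINKS)
print(loop_free_neighbors(GRAPH, "R1", "R5"))
```

With these assumed costs, R3 fails the check (its best path to R5 runs back through R1) and R4 passes. R2 also passes, but it is already the primary next hop, so it is not a backup candidate.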

R1#sh ip route repair 5.5.5.5
Routing entry for 5.5.5.5/32
  Known via "ospf 1", distance 110, metric 3, type intra area
  Last update from 192.168.12.2 on GigabitEthernet1.12, 02:29:11 ago
  Routing Descriptor Blocks:
  * 192.168.12.2, from 5.5.5.5, 02:29:11 ago, via GigabitEthernet1.12
      Route metric is 3, traffic share count is 1
      Repair Path: 192.168.14.4, via GigabitEthernet1.14
    [RPR]192.168.14.4, from 5.5.5.5, 02:29:11 ago, via GigabitEthernet1.14
      Route metric is 26, traffic share count is 1

R1#sh ip cef 5.5.5.5
5.5.5.5/32
  nexthop 192.168.12.2 GigabitEthernet1.12
    repair: attached-nexthop 192.168.14.4 GigabitEthernet1.14

As with EIGRP, there are "tie-breakers" if you have multiple options for the backup path. With OSPF, you can get a lot more granular than with EIGRP. I still hate the term "tie-breakers". As I explained in my EIGRP blog, I think "2nd bestpath decision maker" explains it better.

The tie-breakers are as follows, with their respective default priorities:

- SRLG 10 
- Primary Path 20
- Interface Disjoint 30
- Lowest-Metric 40
- Linecard-disjoint 50
- Node protecting 60
- Broadcast interface disjoint 70 
- Load Sharing 256 

These tie-breakers are off by default:
- Downstream 
- Secondary-Path

The syntax to change the priorities - or turn on downstream or secondary-path - is as follows:

router ospf 1
  fast-reroute per-prefix tie-break interface-disjoint required index 5

If you use the fast-reroute per-prefix tie-break command at all, it disables all the other tie-breakers. So for example, if you wanted SRLG to be the 2nd tie breaker, you would have to turn it back on after the interface-disjoint command:

router ospf 1
  fast-reroute per-prefix tie-break interface-disjoint required index 5
  fast-reroute per-prefix tie-break srlg index 10

You may have also noticed the required keyword. This means that if that tie-breaker doesn't match/pass, then disallow that path completely.

My original plan was to show a scenario for every tie-breaker, but after it took me two days to build a topology that showed each possible technique, I decided to just go with a written explanation of each tie-breaker and then give one semi-complex tie-breaker topology with a few examples.

- SRLG
SRLG - Shared Risk Link Group - is a manual setting, optionally assigned per-interface, with the intent of identifying "shared risk" elements that the router can't detect on its own. For example, if two of your Ethernet links shared a downstream switch, you might put those two in the same SRLG.

Usage:
R1(config)#int gig1
R1(config-if)#srlg gid 1
R1(config-if)#int gig2
R1(config-if)#srlg gid 1
R1(config-if)#int gig3
R1(config-if)#srlg gid 2

- Primary Path
Primary Path prefers a backup path that's part of equal-cost multipath (ECMP). This is the antithesis of Secondary Path, which we'll cover below.

- Interface Disjoint
This is fairly obvious: prefer a backup next-hop that exits through a different interface. Note that Ethernet sub-interfaces are considered different interfaces.

- Lowest-Metric
Prefer the path with the lowest metric. (Note that this command doesn't offer a "required" keyword.)

- Linecard-disjoint
Prefer a path that exits through a different linecard than the primary path (I have no way of labbing this as I'm using a CSR1K)

- Node protecting
Prefer a path that doesn't pass through the same next-hop router as the primary path. Note this means any interface on the same next-hop router. So if R2 is the next-hop of your primary path via 192.168.12.2, and your backup path goes through (either directly or indirectly, later in the path) 192.168.25.2 on R2, node protecting will depref that path - or with the required keyword, would prevent it from being used completely.

- Broadcast interface disjoint
Broadcast interface disjoint deprefs backup routes that pass through the same broadcast area as the primary path. The thought here is if the layer 2 device (presumably a switch) connecting the interfaces together fails, we might lose the backup path too.

- Load Sharing 
I haven't labbed this, but my understanding is this is basically a worst-case scenario. If you have two or more paths that can't be differentiated by all of the above tie-breakers, share the backup paths amongst any applicable prefixes.

- Downstream (off by default)
This is very similar to the EIGRP feasibility condition - ensure that the metric, from the neighbor's RSPF perspective, is smaller than the total metric of our primary path from the calculating router's perspective. Using the original example above, the backup path we picked would not meet the criteria for this tie-breaker. It's important to reinforce that this is not a default option, and OSPF does not need this EIGRP-feasibility-like requirement: as a link state protocol it has the entire topology at hand and can calculate non-looping paths without relying on a metric comparison.

- Secondary-Path (off by default)
This is the antithesis of the Primary-Path tie-breaker above. This instructs the process to prefer a backup path that is not part of multipathing (ECMP). The idea here is that all of your multipaths may be required for your traffic flows. For example, if you are equal-cost multipathing across two 1-gig links but consistently have 1.2 Gb of data crossing them, it would not be desirable to just run over the other link in the ECMP if one failed. Secondary-Path prefers a path not in the ECMP for the backup.
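The Downstream and Node protecting descriptions above boil down to two inequalities from RFC 5286, and it can help to see them as plain arithmetic. In this sketch N is the candidate backup neighbor, S is the calculating router, E is the primary next hop, and D is the destination. The function names are my own, and the node-protecting numbers in the usage lines are assumptions for illustration.

```python
def is_downstream(d_n_d: int, d_s_d: int) -> bool:
    """Downstream condition: D(N,D) < D(S,D)."""
    return d_n_d < d_s_d


def is_node_protecting(d_n_d: int, d_n_e: int, d_e_d: int) -> bool:
    """Node-protecting condition: D(N,D) < D(N,E) + D(E,D)."""
    return d_n_d < d_n_e + d_e_d


# Original example: R4's distance to R5 is 10, R1's primary metric is 2.
print(is_downstream(d_n_d=10, d_s_d=2))

# Assumed distances: R4 to R2 is 11, R2 to R5 is 1.
print(is_node_protecting(d_n_d=10, d_n_e=11, d_e_d=1))
```

The first call shows why the R4 backup from the original example would not satisfy the downstream tie-breaker: R4 is farther from R5 than R1 is.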

I'm going to run a couple of examples of tie-breaking, but in order to do that, I needed more paths in the topology. Pay close attention, I have shifted the OSPF costs from the prior topology:



* Please note costs listed below do not include the on-router cost to the loopback for clarity*
If you look at metric alone, the paths from R1->R5 look most desirable in this order:
R1 -> R3 -> R5 (Cost 2)
R1 -> R6 -> R3 -> R5 (Cost 4)
R1 -> R2 -> R5 (Cost 11)
R1 -> R4 -> R5 (Cost 25)

Clearly R3 is the winning primary path.

Let's go down the decision-making process for the backup path:

- SRLG 10 - Not applicable, we're not using SRLG (yet)
- Primary Path 20 - Not applicable, we have no ECMP.
- Interface Disjoint 30 - Applicable, but all are on separate interfaces already.
- Lowest-Metric 40 - Applicable, choose R6 as backup. Do not proceed further, as all paths have different costs.

So without any modification, our primary next-hop router will be R3, and backup next-hop router will be R6:

R1#sh ip route repair 5.5.5.5
Routing entry for 5.5.5.5/32
  Known via "ospf 1", distance 110, metric 3, type intra area
  Last update from 192.168.13.3 on GigabitEthernet1.13, 00:14:19 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 5.5.5.5, 00:14:19 ago, via GigabitEthernet1.13
      Route metric is 3, traffic share count is 1
      Repair Path: 192.168.16.6, via GigabitEthernet1.16
    [RPR]192.168.16.6, from 5.5.5.5, 00:14:19 ago, via GigabitEthernet1.16
      Route metric is 6, traffic share count is 1

There's an obvious flaw in that plan, however: both paths rely on R3 being online.

R1(config)#router ospf 1
R1(config-router)#fast-reroute per-prefix tie-break lowest-metric index 10
R1(config-router)#fast-reroute per-prefix tie-break node-protecting required index 20

R1(config-router)#do sh ip route repair 5.5.5.5
Routing entry for 5.5.5.5/32
  Known via "ospf 1", distance 110, metric 3, type intra area
  Last update from 192.168.13.3 on GigabitEthernet1.13, 00:01:09 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 5.5.5.5, 00:01:09 ago, via GigabitEthernet1.13
      Route metric is 3, traffic share count is 1
      Repair Path: 192.168.14.4, via GigabitEthernet1.14
    [RPR]192.168.14.4, from 5.5.5.5, 00:01:09 ago, via GigabitEthernet1.14
      Route metric is 26, traffic share count is 1

Now the process has chosen the backup through R4, which eliminates R3 as a single point of failure.
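To make the effect of the required keyword concrete, here is a simplified model of the selection. It is my own sketch, not IOS internals: it assumes that required attributes act as a hard filter regardless of index, and that the remaining tie-breakers then narrow the candidates in ascending index order. That assumption is at least consistent with the two outputs above.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    next_hop: str
    metric: int
    attrs: set = field(default_factory=set)  # e.g. {"node-protecting"}


@dataclass
class TieBreak:
    name: str
    index: int
    required: bool = False


def select_repair_path(candidates, tie_breaks):
    """Simplified model: filter on required attributes, then apply rules by index."""
    pool = list(candidates)
    for tb in tie_breaks:
        if tb.required:
            pool = [c for c in pool if tb.name in c.attrs]
    for tb in sorted(tie_breaks, key=lambda t: t.index):
        if len(pool) <= 1:
            break
        if tb.name == "lowest-metric":
            best = min(c.metric for c in pool)
            preferred = [c for c in pool if c.metric == best]
        else:
            preferred = [c for c in pool if tb.name in c.attrs]
        if preferred:  # a preference that nobody satisfies changes nothing
            pool = preferred
    return pool[0] if pool else None


CANDIDATES = [
    Candidate("192.168.16.6", 6),                        # via R6, still crosses R3
    Candidate("192.168.14.4", 26, {"node-protecting"}),  # via R4, avoids R3
]

metric_only = [TieBreak("lowest-metric", 40)]
with_node_protection = [
    TieBreak("lowest-metric", 10),
    TieBreak("node-protecting", 20, required=True),
]

print(select_repair_path(CANDIDATES, metric_only).next_hop)
print(select_repair_path(CANDIDATES, with_node_protection).next_hop)
```

In this model the metric-only rule picks the R6 path, while adding node-protecting as required removes R6 from consideration and leaves R4, even though lowest-metric has the better index.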

Let's pretend that gig1.13, gig1.14, and gig1.16 all cross the same L2 switch somewhere in their path. We want to protect against that too:

R1(config)#router ospf 1
R1(config-router)#fast-reroute per-prefix tie-break lowest-metric index 10
R1(config-router)#fast-reroute per-prefix tie-break node-protecting required index 20
R1(config-router)#fast-reroute per-prefix tie-break srlg required index 30

R1(config-router)#int gig1.13
R1(config-subif)#srlg gid 1
R1(config-subif)#int gig1.14
R1(config-subif)#srlg gid 1
R1(config-subif)#int gig1.16
R1(config-subif)#srlg gid 1
R1(config-subif)#int gig1.12
R1(config-subif)#srlg gid 2

R1(config-subif)#do sh ip route repair 5.5.5.5
Routing entry for 5.5.5.5/32
  Known via "ospf 1", distance 110, metric 3, type intra area
  Last update from 192.168.13.3 on GigabitEthernet1.13, 00:18:34 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 5.5.5.5, 00:18:34 ago, via GigabitEthernet1.13
      Route metric is 3, traffic share count is 1

Uh-oh, no backup route. We were hoping for R1->R2->R5...

R2#sh ip cef 5.5.5.5
5.5.5.5/32
  nexthop 192.168.12.1 GigabitEthernet1.12

That's because R2 routes back through R1 - R1 would've run the RSPF with R2 as the root and disregarded the route.

We have two options at this point:
- Remove the required keyword from the SRLG and fall back to the prior answer
- Tinker with the metrics to make R2 a viable path.

R1(config)#int gig1.12
R1(config-subif)#ip ospf cost 10

R2(config)#int gig1.12
R2(config-subif)#ip ospf cost 10

R1(config-subif)#do sh ip route repair 5.5.5.5
Routing entry for 5.5.5.5/32
  Known via "ospf 1", distance 110, metric 3, type intra area
  Last update from 192.168.13.3 on GigabitEthernet1.13, 00:00:52 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 5.5.5.5, 00:00:52 ago, via GigabitEthernet1.13
      Route metric is 3, traffic share count is 1
      Repair Path: 192.168.12.2, via GigabitEthernet1.12
    [RPR]192.168.12.2, from 5.5.5.5, 00:00:52 ago, via GigabitEthernet1.12
      Route metric is 21, traffic share count is 1

Now we have a backup via R2.
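Why does raising the R1-R2 cost make R2 eligible? A minimal sketch of the loop-free check, D(N,D) < D(N,S) + D(S,D), with R2 as the neighbor shows the flip. The link costs are assumptions inferred from the path costs listed for this topology, treated as symmetric.

```python
import heapq

# Assumed symmetric link costs, inferred from the path costs quoted in the text.
BASE_LINKS = {
    ("R1", "R3"): 1, ("R3", "R5"): 1,
    ("R1", "R6"): 1, ("R3", "R6"): 2,
    ("R1", "R4"): 15, ("R4", "R5"): 10,
    ("R1", "R2"): 1, ("R2", "R5"): 10,
}


def spf(links, root):
    """Plain Dijkstra over an undirected link-cost table."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist


def r2_is_loop_free(r1_r2_cost):
    """Is R2 a loop-free alternate for R1 -> R5 at the given R1-R2 link cost?"""
    links = dict(BASE_LINKS)
    links[("R1", "R2")] = r1_r2_cost
    d_r1 = spf(links, "R1")
    d_r2 = spf(links, "R2")
    return d_r2["R5"] < d_r2["R1"] + d_r1["R5"]


print(r2_is_loop_free(1))   # original cost: R2's best path to R5 is back via R1
print(r2_is_loop_free(10))  # after the change: R2 prefers its own link to R5
```

At cost 1, R2's best path to R5 costs 3 and runs through R1, so the inequality fails. At cost 10, R2's direct link (10) beats going back via R1 (12), and R2 qualifies.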

Before we move on to remote LFA, let's cover some smaller topics.

There were two pieces to the initial command that I did not explain:
fast-reroute per-prefix enable area 0 prefix-priority low

enable area 0 may seem obvious - we want backup paths for area 0. Note, you can only specify areas the router is directly connected to, so if, for example, you wanted backup paths in areas 0, 1, and 2, your router would have to be an ABR for areas 1 and 2. This is true of both direct LFA and remote LFA.

But there's another issue with specifying areas:

R5(config)#int lo1
R5(config-if)#ip address 55.55.55.55 255.255.255.255
R5(config-if)#exit
R5(config)#route-map lo1-extern
R5(config-route-map)#match interface lo1
R5(config-route-map)#exit
R5(config)#router ospf 1
R5(config-router)#redistribute connected route-map lo1-extern

R1(config)#do sh ip route repair 55.55.55.55
Routing entry for 55.55.55.55/32
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 2
  Last update from 192.168.13.3 on GigabitEthernet1.13, 00:01:27 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 5.5.5.5, 00:01:27 ago, via GigabitEthernet1.13
      Route metric is 20, traffic share count is 1

No repair route for 55.55.55.55 - and there won't be one, because an external route is in no area. We have to change our initial configuration to fix this:

R1(config-router)#no fast-reroute per-prefix enable area 0 prefix-priority low
R1(config-router)#fast-reroute per-prefix enable prefix-priority low

A lack of an area implies all areas this router is connected to - including external routes.

R1(config-router)#do sh ip route repair 55.55.55.55
Routing entry for 55.55.55.55/32
  Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 2
  Last update from 192.168.13.3 on GigabitEthernet1.13, 00:01:42 ago
  Routing Descriptor Blocks:
  * 192.168.13.3, from 5.5.5.5, 00:01:42 ago, via GigabitEthernet1.13
      Route metric is 20, traffic share count is 1
      Repair Path: 192.168.12.2, via GigabitEthernet1.12
    [RPR]192.168.12.2, from 5.5.5.5, 00:01:42 ago, via GigabitEthernet1.12
      Route metric is 20, traffic share count is 1

What's the story on prefix-priority low?

IOS prioritizes convergence events by default by prefix length. If SPF has to be calculated for thousands of routes, it's assumed by default that /32s (typical for iBGP next-hops) are "high priority". You can define what routes are priority to OSPF with:

R1(config-router)#prefix-priority high route-map <your route map>

There are only two tiers, high and low. High means (by default, unless the route map is used) only calculate backup routes for /32s. Low means calculate backup routes for all routes.
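The default behavior described above can be sketched as a small classifier. This only models the default case (host routes are high priority); it does not model a custom route map, and the function names are my own.

```python
import ipaddress


def default_priority(prefix: str) -> str:
    """Default tiering as described above: /32 host routes are high priority."""
    net = ipaddress.ip_network(prefix, strict=False)
    return "high" if net.prefixlen == net.max_prefixlen else "low"


def is_protected(prefix: str, configured_priority: str) -> bool:
    """'high' protects only high-priority prefixes; 'low' protects everything."""
    return configured_priority == "low" or default_priority(prefix) == "high"


for prefix in ("5.5.5.5/32", "192.168.25.0/24"):
    print(prefix, default_priority(prefix),
          is_protected(prefix, "high"), is_protected(prefix, "low"))
```

So with prefix-priority high, the 5.5.5.5/32 loopback would get a repair path but a /24 transit link would not. With prefix-priority low, both are candidates.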

So you're debugging and trying to figure out why one path was chosen over another. IOS has a fantastic output system for this:

R1(config-router)#fast-reroute keep-all-paths

This is basically a debugging command, and tells OSPF to keep the output from all the RSPFs it ran to calculate the backup path - including the ones it didn't choose as best.

show ip ospf rib is our 2nd magic command:

R1(config-router)#do sh ip ospf rib 5.5.5.5

            OSPF Router with ID (1.1.1.1) (Process ID 1)


                Base Topology (MTID 0)

OSPF local RIB
Codes: * - Best, > - Installed in global RIB
LSA: type/LSID/originator

*>  5.5.5.5/32, Intra, cost 3, area 0
     SPF Instance 62, age 00:13:50
     Flags: RIB, HiPrio
      via 192.168.13.3, GigabitEthernet1.13
       Flags: RIB
       LSA: 1/5.5.5.5/5.5.5.5
      repair path via 192.168.12.2, GigabitEthernet1.12, cost 21
       Flags: RIB, Repair, IntfDj, BcastDj, NodeProt
       LSA: 1/5.5.5.5/5.5.5.5
      repair path via 192.168.16.6, GigabitEthernet1.16, cost 6
       Flags: Ignore, Repair, IntfDj, BcastDj, SRLG
       LSA: 1/5.5.5.5/5.5.5.5
      repair path via 192.168.14.4, GigabitEthernet1.14, cost 26
       Flags: Ignore, Repair, IntfDj, BcastDj, SRLG, NodeProt
       LSA: 1/5.5.5.5/5.5.5.5

Look at all that fantastic output - it lists the parameters per route so you can determine why the repair path was chosen. Let's break one of these down:

      repair path via 192.168.12.2, GigabitEthernet1.12, cost 21
       Flags: RIB, Repair, IntfDj, BcastDj, NodeProt
       LSA: 1/5.5.5.5/5.5.5.5

This is our current best backup path - "RIB" means it's installed, "Repair" means it's a backup path - so "RIB" + "Repair" means it's the installed backup path. IntfDj means it's on a separate interface from the primary path, BcastDj means it's not sharing a broadcast interface with the primary path, and NodeProt means the path does not include shared hops with the primary path.
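If you collect this output from several routers, a small parser makes the flags easier to compare. This is a rough sketch that assumes the exact text layout shown above. The regular expression and field names are my own and may need adjusting for other releases.

```python
import re

SAMPLE = """\
      repair path via 192.168.12.2, GigabitEthernet1.12, cost 21
       Flags: RIB, Repair, IntfDj, BcastDj, NodeProt
       LSA: 1/5.5.5.5/5.5.5.5
      repair path via 192.168.16.6, GigabitEthernet1.16, cost 6
       Flags: Ignore, Repair, IntfDj, BcastDj, SRLG
       LSA: 1/5.5.5.5/5.5.5.5
      repair path via 192.168.14.4, GigabitEthernet1.14, cost 26
       Flags: Ignore, Repair, IntfDj, BcastDj, SRLG, NodeProt
       LSA: 1/5.5.5.5/5.5.5.5
"""

# One "repair path" line followed by its "Flags:" line.
PATTERN = re.compile(
    r"repair path via (?P<nh>[^,\s]+), (?P<intf>[^,\s]+), cost (?P<cost>\d+)\s*\n"
    r"\s*Flags: (?P<flags>[^\n]+)"
)


def parse_repair_paths(text):
    """Return one dict per repair path with next hop, interface, cost, and flags."""
    paths = []
    for m in PATTERN.finditer(text):
        paths.append({
            "next_hop": m.group("nh"),
            "interface": m.group("intf"),
            "cost": int(m.group("cost")),
            "flags": [f.strip() for f in m.group("flags").split(",")],
        })
    return paths


paths = parse_repair_paths(SAMPLE)
installed = [p for p in paths if "RIB" in p["flags"]]
for p in paths:
    print(p["next_hop"], p["cost"], p["flags"])
print("installed repair path:", installed[0]["next_hop"])
```

Filtering on the RIB flag gives the installed repair path. The entries flagged Ignore are the candidates that lost.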

Microloops can add complexity with fast-reroute. A microloop is what happens when one router converges significantly faster than a neighbor. Let's say two adjacent routers both receive new LSAs simultaneously. One router is high-performance, another is older. The high-performance router calculates the change and updates the FIB several seconds before the older router. Now we could end up with a scenario where the newer router starts forwarding traffic through the older router, but the older router's FIB hasn't updated yet, and it's forwarding through the faster router for that same prefix. For a couple of seconds, the two routers loop.

I'm not going to go into detail on this as it's a fringe topic, but here's the starting point for configuring microloop avoidance:
R1(config-router)#microloop avoidance ?
  disable           Microloop avoidance auto-enable prohibited
  protected         Microloop avoidance for protected prefixes only
  rib-update-delay  Delay before updating the RIB

In short, it allows you to deliberately slow down updating the FIB on the faster router for prefixes that are high-risk for this type of reconvergence.
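The microloop scenario described above is easy to reproduce on paper. The sketch below uses a made-up four-node example (FAST, SLOW, C, and a destination D are hypothetical names) and simply follows next hops through each router's FIB during and after the convergence window.

```python
def trace(fibs, start, dest, max_hops=8):
    """Follow next hops from start toward dest; give up after max_hops."""
    path = [start]
    node = start
    while node != dest and len(path) <= max_hops:
        node = fibs[node][dest]
        path.append(node)
    return path


def is_looping(path, dest):
    """The trace never reached the destination within the hop limit."""
    return path[-1] != dest


# Window after the FAST-D link fails: FAST has already updated its FIB and
# points at SLOW, but SLOW still has its old entry pointing at FAST.
WINDOW = {
    "FAST": {"D": "SLOW"},
    "SLOW": {"D": "FAST"},
    "C": {"D": "D"},
}

# Once SLOW catches up, it forwards via C instead.
CONVERGED = {
    "FAST": {"D": "SLOW"},
    "SLOW": {"D": "C"},
    "C": {"D": "D"},
}

print(trace(WINDOW, "FAST", "D"))
print(trace(CONVERGED, "FAST", "D"))
```

During the window the packet bounces between FAST and SLOW until the hop limit is hit. Delaying the RIB update on the faster router is meant to shrink that window.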

If you don't want an interface being considered for fast-reroute:

R1(config-router)#int gig1.12
R1(config-subif)#ip ospf fast-reroute per-prefix candidate disable

And if you need a quick summary of what percentage of routes are and aren't protected:

R1#sh ip ospf fast-reroute prefix-summary

            OSPF Router with ID (1.1.1.1) (Process ID 1)
                    Base Topology (MTID 0)

Area 0:

Interface        Protected    Primary paths    Protected paths Percent protected
                             All  High   Low   All  High   Low    All High  Low
Lo0                    Yes     0     0     0     0     0     0     0%   0%   0%
Gi1.16                 Yes     1     1     0     0     0     0     0%   0%   0%
Gi1.14                 Yes     0     0     0     0     0     0     0%   0%   0%
Gi1.13                 Yes     7     3     4     4     2     2    57%  66%  50%
Gi1.12                 Yes     1     1     0     0     0     0     0%   0%   0%

Area total:                    9     5     4     4     2     2    44%  40%  50%

Process total:                 9     5     4     4     2     2    44%  40%  50%
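The percentage columns are just protected paths divided by primary paths. A quick sketch, assuming the output truncates rather than rounds (which appears to match the rows above):

```python
def percent_protected(protected: int, primary: int) -> int:
    """Protected paths as a whole-number percentage of primary paths."""
    return protected * 100 // primary if primary else 0


# (protected, primary) pairs taken from the Gi1.13 row and the area total.
for label, protected, primary in [
    ("Gi1.13 all", 4, 7),
    ("Gi1.13 high", 2, 3),
    ("Gi1.13 low", 2, 4),
    ("Area all", 4, 9),
    ("Area high", 2, 5),
]:
    print(label, f"{percent_protected(protected, primary)}%")
```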

That's a wrap for direct LFA. Now we'll look at remote LFA.



This is a simplistic topology but it has a huge problem for direct LFA.
Let's protect the path from R1 to R4.

We have two paths:
R1 -> R4 (cost 1)
R1 -> R2 -> R3 -> R4 (cost 12)

Obviously R1 -> R4 is the primary path.
What does R2 see as its possible paths to R4?
R2 -> R1 -> R4 (Cost 2)
R2 -> R3 -> R4 (Cost 11)

R2 will always send traffic back to R1 when heading towards R4.

What about R3?
R3 -> R4 (Cost 6)
R3 -> R2 -> R1 -> R4 (Cost 7)

R3 would work for a backup path... if only we could get to R3 without R2 knowing what we're up to.

Enter Remote LFA.

R1(config-router)#int gig1.14
R1(config-subif)#mpls ip
R1(config-subif)#int gig1.12
R1(config-subif)#mpls ip
R1(config-subif)#mpls ldp discovery targeted-hello accept

R2(config-subif)#int gig1.12
R2(config-subif)#mpls ip
R2(config-subif)#int gig1.23
R2(config-subif)#mpls ip
R2(config-subif)#mpls ldp discovery targeted-hello accept

R3(config-subif)#int gig1.23
R3(config-subif)#mpls ip
R3(config-subif)#int gig1.34
R3(config-subif)#mpls ip
R3(config-subif)#mpls ldp discovery targeted-hello accept

R4(config-subif)#int gig1.14
R4(config-subif)#mpls ip
R4(config-subif)#int gig1.34
R4(config-subif)#mpls ip
R4(config-subif)#mpls ldp discovery targeted-hello accept

R1(config-router)#router ospf 1
R1(config-router)#fast-reroute per-prefix remote-lfa tunnel mpls-ldp

There's a complex algorithm that makes this work, but it's somewhat irrelevant from a CCIE v5 perspective. 

Here's what you really need to know:
- Direct LFA must already have failed to turn up a path (direct is always tried first)
- A tunnel is built over targeted LDP.
- The destination tunnel router is picked on the following criteria:
   -  It must be in the same area as the router running LFA
   - The tunnel endpoint is picked from among the group of routers that can be reached through a next-hop other than the one you're trying to protect.
   - Of that group of routers, it's narrowed down to the subset that can reach your repair prefix without passing through the protecting router.
   - Those that qualify are called the PQ space (refer to the RFC for a lot more detail, but it may be overkill for a CCIE candidate) 
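The selection criteria above can be sketched for this four-router ring. This is a simplified reading of the P-space and Q-space definitions in RFC 7490, not the IOS implementation: a router is in the P space if none of R1's shortest paths to it cross the protected R1-R4 link, and in the Q space if none of its shortest paths to R4 cross that link. Link costs are derived from the path costs listed earlier, assumed symmetric.

```python
import heapq

# Symmetric link costs derived from the path costs quoted for this topology.
LINKS = {("R1", "R4"): 1, ("R1", "R2"): 1, ("R2", "R3"): 5, ("R3", "R4"): 6}


def spf(links, root):
    """Plain Dijkstra over an undirected link-cost table."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            if d + cost < dist.get(nbr, float("inf")):
                dist[nbr] = d + cost
                heapq.heappush(heap, (d + cost, nbr))
    return dist


def pq_space(links, s, e):
    """P, Q and PQ sets for protecting the s-e link on router s."""
    link_cost = links.get((s, e), links.get((e, s)))
    d_s, d_e = spf(links, s), spf(links, e)
    others = [n for n in d_s if n not in (s, e)]
    # P: s reaches x strictly cheaper than any path that starts over s-e.
    p = {x for x in others if d_s[x] < link_cost + d_e[x]}
    # Q: x reaches e strictly cheaper than any path that ends over s-e.
    q = {x for x in others if d_e[x] < d_s[x] + link_cost}
    return p, q, p & q


P, Q, PQ = pq_space(LINKS, "R1", "R4")
print("P space:", sorted(P))
print("Q space:", sorted(Q))
print("PQ space:", sorted(PQ))
```

With these costs only R3 lands in both sets, which lines up with the tunnel to 3.3.3.3 in the output below.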

R1#sh ip route repair 4.4.4.4
Routing entry for 4.4.4.4/32
  Known via "ospf 1", distance 110, metric 2, type intra area
  Last update from 192.168.14.4 on GigabitEthernet1.14, 00:29:36 ago
  Routing Descriptor Blocks:
  * 192.168.14.4, from 4.4.4.4, 00:29:36 ago, via GigabitEthernet1.14
      Route metric is 2, traffic share count is 1
      Repair Path: 3.3.3.3, via MPLS-Remote-Lfa1
    [RPR]3.3.3.3, from 4.4.4.4, 00:29:36 ago, via MPLS-Remote-Lfa1
      Route metric is 12, traffic share count is 1

R1#sh ip int br | i MPLS
MPLS-Remote-Lfa1       192.168.12.1    YES unset  up                    up

This whole process is reasonably automatic. Just make sure your LDP is in good shape and targeted LDP is enabled, and you're good to go.

You can optionally specify areas and maximum costs:

R1(config-router)#fast-reroute per-prefix remote-lfa area 0 maximum-cost 10

The areas work the same way they did with direct LFA - we're just saying we only want to protect area 0, 1, 2, 3, etc. For remote LFA, the router you're running LFA on has to be in the area you're trying to protect - you can't protect area 5 if you're only an ABR for areas 0 and 1.

The maximum cost option restricts which prefixes you should be building tunnels for. In other words, it has nothing to do with the metric to reach the tunnel endpoint - it has to do with the prefix you're trying to protect.

Hope you enjoyed!

Jeff

17 comments:

  1. Nice Post. Did you do this in GNS3 ?
    Which IOS did you use for simulating this ?

    Replies
    1. Saurabh, most of my labs - including this one - are built on IOS-XE (3.11.01 / 15.4(1)) on the CSR1000v. You'll need a VMWare box to run it, but there's a lot of threads on ieoc.com that cover how to implement.

  2. This comment has been removed by the author.

  3. Amazing, thanks for this explanation,


    R3:
    When attempting to reach R5, R3 will route back through R1. This will loop. Eliminate R2.
    Should be ?
    When attempting to reach R5, R5 will route back through R2. This will loop. Eliminate R3.

    Replies
    1. Yes, this was a typo. Fixing it now.

    2. Sorry :
      R3 will route back through R1
      Should be ?
      R3 will route back through R2

    3. No, that part's correct. On the initial diagram R3's bestpath is via R1. So from R1's perspective we can't send the traffic to R3, it would push it right back at us.

  4. Hi Jeff!

    I built my topology exactly as you did and my ospf config on IOS-XE-1 router is as follows:

    router ospf 1
    fast-reroute per-prefix enable area 0 prefix-priority high
    fast-reroute keep-all-paths
    network 1.1.1.1 0.0.0.0 area 0
    network 192.168.0.0 0.0.255.255 area 0

    despite I configured the ospf cost following your topology and I checked it several times, my repair route was chosen through R4, the highest metric available toward 5.5.5.1 network, interestingly. even I entered "fast-reroute per-prefix tie-break lowest-metric index 5" command on IOS-XE-1 router, again it had no any effect.

    x1(config-if)#do sh ip ospf rib 5.5.5.1

    *> 5.5.5.1/32, Intra, cost 3, area 0
    SPF Instance 30, age 00:02:58
    Flags: RIB, HiPrio
    via 192.168.13.3, GigabitEthernet3
    Flags: RIB
    LSA: 1/5.5.5.5/5.5.5.5
    repair path via 192.168.14.4, GigabitEthernet5, cost 26
    Flags: RIB, Repair, IntfDj, BcastDj, NodeProt
    LSA: 1/5.5.5.5/5.5.5.5

    I'm using VIRL, which uses IOS-XE version 03.14.00.s on the CSR1000v. Any idea?

    Replies
    1. I need to add to my previous comment, that I disabled the link on IOS-XR-1 router that goes toward R4, and issue the commands again to see what will happen. this time, no any outputs contain the repair path at all. instead they just show a primary path that is through R3. the neighborship is in place between all routers.

  5. Great article Jeff! Keep it up the great work.

  6. Very good explanation, thank you.

  7. Very good post! Thank you very much

  8. Hi ,
    Thank you Nice post obviously! in Remote LFA section , i think it is not needed to enable Direct LDP for all routers execept "tail-end" which is R3 in our topology.

    thanks

  9. Thank you for such a good explanation on this topic, I finally understand the issue on my network based on this!

  10. Hi Jeff,

    Nice post and helped engage my thoughts around this.

    I am not sure how you got the failure path via R6 from using the default fast-reroute commands under the OSPF process on R1.

    R6 has an ECMP path to R5 via R1 and R3, therefore it would fail the Inequality check 1 (LFA) and not be candidate for a failure path. The cost from R6 (N) to the Destination would be "3" and the this is not less than the path from R6 (N) to R1 (S) + the distance from R1 (S) to the destination (it would be equal):

    D(N,D) < D(N,S) + D(S,D)

    R6 (N) would fail all inequality checks, 1 (LFA), 2 (Downstream which is off by default) and 3 (node protection, which we already know).

    Calculations for R6

    R6 (FAILED)

    Inequality 1 - LFA

    D(N,D) = 3 < D(N,S) + D(S,D) = 1 + 2 = 3


    Inequality 2 - DOWNSTREAM

    D(N,D) = 3 < D (S,D) =. 2


    Inequality 3 - NODE PROTECTION


    D(N,D) = 3 < D(N,E) + D(E,D). = 2 +1 = 3


    From the specific scenario I believe the failure path installed using default FRR would be via R4 as it passes inequality 1 (LFA):

    R4 (PASSED Inequality 1 and 3)

    Inequality 1 - LFA

    D(N,D) = 10. < D(N,S) + D(S,D). = 15 + 2 = 17


    Inequality 2 - DOWNSTREAM

    D(N,D) = 10 < D (S,D) =. 2



    Inequality 3 - NODE PROTECTION


    D(N,D) = 10 < D(N,E) + D(E,D). = 16 + 1 = 17

    This would be the only candidate from what I can see.

    If we were to increase the cost from R1 to R6, so that R6 had 1 path to the destination via R3, this would then pass the Inequality 1 and based upon metric would be the Failure path installed in the RIB.

    At this point if the costs were then increased from R1 to R6 and we used node protection it would then install R4 as the failure path, as R6 would fail Inquality check 3 (Node Protection).

    Just wanted to see your thoughts and how you saw the next hop as R6 when using the default FRR under the OSPF process, as I cannot replicate this behaviour based upon the costs used in the topology.

    Thanks
    Nick
