Sunday, November 25, 2012

MPLS Tunnel Next-Hop & LDP Filtering

I ran into a rather tricky-to-debug MPLS scenario.

We're going to set up a fairly traditional MPLS configuration: two PEs, two CEs, and one BGP-free, MPLS-only P router.  The CEs both represent the same customer and will be sharing a VRF.


As I'm sure you're all already aware, setting up MP-BGP, VRFs, MPLS, etc., takes quite a lot of config.  Nonetheless, in order to accurately convey the point I'd like to make, here's the relevant config from each router:

CE1:
interface FastEthernet0/0
 ip address 192.168.11.1 255.255.255.0

interface Loopback100
 ip address 159.1.1.1 255.255.255.0

router bgp 100
 network 159.1.1.0 mask 255.255.255.0
 neighbor 192.168.11.2 remote-as 999

CE2:
interface FastEthernet0/0
 ip address 192.168.22.1 255.255.255.0

interface Loopback100
 ip address 159.2.2.2 255.255.255.0

router bgp 200
 network 159.2.2.0 mask 255.255.255.0
 neighbor 192.168.22.2 remote-as 999
Not much to see so far - our internal (VPN-accessible) networks are simulated by Lo100 on each router.  We're BGP peered to the PEs.

PE1:
ip vrf VPN_A
 rd 1:1
 route-target export 111:111
 route-target import 111:111

interface FastEthernet0/0
 ip vrf forwarding VPN_A
 ip address 192.168.11.2 255.255.255.0

interface Loopback0
 ip address 1.1.1.1 255.255.255.255

router ospf 1
 network 0.0.0.0 255.255.255.255 area 0

interface FastEthernet0/1
 ip address 192.168.34.2 255.255.255.0
 mpls ip

router bgp 999
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 999
 neighbor 2.2.2.2 update-source Loopback0
 !
 address-family vpnv4
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf VPN_A
  neighbor 192.168.11.1 remote-as 100
  neighbor 192.168.11.1 activate
  no synchronization
 exit-address-family

PE2:
ip vrf VPN_A
 rd 1:1
 route-target export 111:111
 route-target import 111:111

interface FastEthernet0/0
 ip address 192.168.45.2 255.255.255.0
 mpls ip

interface Loopback0
 ip address 2.2.2.2 255.255.255.255

router ospf 1
 network 0.0.0.0 255.255.255.255 area 0

router bgp 999
 no bgp default ipv4-unicast
 bgp log-neighbor-changes
 neighbor 1.1.1.1 remote-as 999
 neighbor 1.1.1.1 update-source Loopback0
 !
 address-family vpnv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf VPN_A
  neighbor 192.168.22.1 remote-as 200
  neighbor 192.168.22.1 activate
  no synchronization
 exit-address-family

As you can see, PE1 and PE2 are nearly identical.  I've created a problem on the P router, and as such, I'm not going to show you its config just yet.  Suffice it to say that it's not running BGP, and it is running LDP.  This problem tripped me up for some time; let's work through the important part -- the debugging steps.

At this point, the CEs are expecting to be able to communicate.  Let's see what happens:

CE1#ping 159.2.2.2 source lo100

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 159.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 159.1.1.1
.....
Success rate is 0 percent (0/5)
Not really a big surprise since I told you I broke something, right?

CE1#sh ip route bgp
     159.2.0.0/24 is subnetted, 1 subnets
B       159.2.2.0 [20/0] via 192.168.11.2, 00:21:23
CE1 has the BGP route to reach 159.2.2.2... what about the other side?

CE2#sh ip route bgp
     159.1.0.0/24 is subnetted, 1 subnets
B       159.1.1.0 [20/0] via 192.168.22.2, 00:21:37
Yes, the other side knows how to get back.

Are we unidirectional?

CE1 & CE2:
debug ip icmp

This is a bit hard to show in a blog, but suffice to say neither side is hearing the other's pings.

OK, so, we get the routes on both CE routers, but neither of them actually has reachability.
Can the CEs ping the PEs?  One would assume, as the BGP session is up.  Let's just check to be safe:

CE1#ping 192.168.11.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.11.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/26/40 ms

CE2#ping 192.168.22.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.22.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/28/52 ms

OK, that works.

This is where I got stuck.  I'd never seen a scenario where BGP delivered the routes and I could ping the PEs, but traffic wouldn't cross the MPLS cloud.  This is also when I found out that two everyday commands can be wonders for debugging MPLS tunnels.

PE1:
access-list 101 permit ip any 159.2.2.0 0.0.0.255
debug ip packet 101

PE1#ping vrf VPN_A 159.2.2.2
*Mar  1 01:40:11.095: IP: tableid=1, s=192.168.11.2 (local), d=159.2.2.2 (FastEthernet0/1), routed via FIB
*Mar  1 01:40:11.099: IP: s=192.168.11.2 (local), d=159.2.2.2 (FastEthernet0/1), len 100, sending
*Mar  1 01:40:11.099:     ICMP type=8, code=0
*Mar  1 01:40:11.099: IP: s=192.168.11.2 (local), d=159.2.2.2 (FastEthernet0/1), len 100, MPLS encapsulation failed
*Mar  1 01:40:11.103:     ICMP type=8, code=0.

Lesson learned #1 - I hadn't realized before this that "debug ip packet" could be used for MPLS forwarding.

This is helpful - we know the packet isn't past the PE.  "MPLS encapsulation failed".  If we can't encapsulate something, where do we normally look next?  Well, the FIB!

PE1#sh ip cef vrf VPN_A 159.2.2.2
159.2.2.0/24, version 10, epoch 0, cached adjacency 192.168.34.1
0 packets, 0 bytes
  tag information set
    local tag: VPN-route-head
    fast tag rewrite with
        Recursive rewrite via 2.2.2.2/32, tags imposed {19}
  via 2.2.2.2, 0 dependencies, recursive
    next hop 192.168.34.1, FastEthernet0/1 via 2.2.2.2/32
    valid cached adjacency
    tag rewrite with
        Recursive rewrite via 2.2.2.2/32, tags imposed {19}

Lesson learned #2 - I didn't realize "show ip cef" could be used with MPLS forwarding, either.

So even after finding this I was still scratching my head for a bit.  That output is not easy to read, and this was my first time reading it.  Looks like we're missing something after the two "tag rewrite with" lines.  Let's take a look at the MPLS forwarding table and see what's up.

PE1#show mpls forwarding-table
Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
16     Pop tag     3.3.3.3/32        0          Fa0/1      192.168.34.1
17     Untagged    192.168.45.0/24   0          Fa0/1      192.168.34.1
18     Untagged    2.2.2.2/32        0          Fa0/1      192.168.34.1
19     Untagged    159.1.1.0/24[V]   5016       Fa0/0      192.168.11.1

The first thing that should jump out at you is that most of the routes are shown as "Untagged".  We wouldn't expect a tag on our local route (159.1.1.0/24), since it points at the CE.  The "Pop tag" entry is there because 3.3.3.3 belongs to our directly connected next hop (the "P" router), so PHP (penultimate hop popping) applies.  So why is everything else untagged?
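
One more check on PE1 fits here before moving on.  This is a sketch rather than captured lab output, and the exact display varies by IOS release: the LDP bindings table (LIB) shows which labels each neighbor has actually advertised for a prefix.  If the entry for 2.2.2.2/32 lists only a local binding and no remote binding from the P router, PE1 has no label to impose and the forwarding table falls back to "Untagged".

```
! Confirm the LDP session to the P router is up
PE1#show mpls ldp neighbor
! Look for a remote binding for PE2's loopback
PE1#show mpls ldp bindings 2.2.2.2 32
```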

Time to look at the P router...

P#show mpls forwarding-table
Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
16     Pop tag     1.1.1.1/32        9217       Fa0/0      192.168.34.2
17     Pop tag     2.2.2.2/32        9003       Fa0/1      192.168.45.2

Well, what the heck?  It knows how to reach 1.1.1.1 and 2.2.2.2... it's PHP for both, but still...

Oh yeah, I put some LDP filtering in here from a previous task.  Let's check that out:

P#sh run | i advertise
no mpls ldp advertise-labels
mpls ldp advertise-labels for 15

P#sh ip access-list 15
Standard IP access list 15
    10 permit 3.3.3.3 (3 matches)

The task called for the P router not to include its physical interfaces in the MPLS advertisements.  After all, those interfaces are never endpoints in our MPLS network, only transit - so they're not needed.

Clever folks out there have probably already spotted my mistake.  This is, frankly, the least important part of my post.  I wanted to show how to debug broken PE -> P -> PE forwarding, as that's where I got stuck.  This part was easy, once I got here.

The catch here is that the "mpls ldp advertise-labels for X" command does not work like a traditional routing protocol.  I was using it like a "network X.X.X.X" statement in an IGP.  What I was visualizing as the solution was that "P" would advertise 3.3.3.3, and continue to advertise labels for 1.1.1.1 and 2.2.2.2, which it learned from its neighbors.  It turns out that "mpls ldp advertise-labels for X", when compared to an IGP, is a combo "network X.X.X.X" statement and a distribute-list wrapped into one command.  When I permitted only 3.3.3.3 in access-list 15 above, the P router stopped advertising its label bindings for every other prefix, including 1.1.1.1/32 and 2.2.2.2/32.  So, PE1 never received a label for PE2's loopback, and vice-versa.  The appropriate access-list looks like this:

ip access-list standard 15
 10 permit 1.1.1.1
 20 permit 2.2.2.2
 30 permit 3.3.3.3
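
For completeness, here is a sketch of how that ACL pairs with the two global commands from the P router's running config above.  The final show command is a suggested verification, not part of the original lab; it should be available on most LDP-capable IOS releases, but treat that as an assumption and check your image.

```
P(config)#no mpls ldp advertise-labels
P(config)#mpls ldp advertise-labels for 15
P(config)#ip access-list standard 15
P(config-std-nacl)#10 permit 1.1.1.1
P(config-std-nacl)#20 permit 2.2.2.2
P(config-std-nacl)#30 permit 3.3.3.3
P(config-std-nacl)#end
! Shows which ACL governs the advertisement of each local binding
P#show mpls ldp bindings advertisement-acls
```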

Now with that updated ACL, let's go back and look at some things.  First things first, can we ping?!

CE1#ping 159.2.2.2 source lo100
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 159.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 159.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 80/99/112 ms

Yes we can!

How's about those other show commands from earlier?

PE1#show ip cef vrf VPN_A 159.2.2.2
159.2.2.0/24, version 10, epoch 0, cached adjacency 192.168.34.1
0 packets, 0 bytes
  tag information set
    local tag: VPN-route-head
    fast tag rewrite with Fa0/1, 192.168.34.1, tags imposed: {17 19}
  via 2.2.2.2, 0 dependencies, recursive
    next hop 192.168.34.1, FastEthernet0/1 via 2.2.2.2/32
    valid cached adjacency
    tag rewrite with Fa0/1, 192.168.34.1, tags imposed: {17 19}

That's a whole lot healthier looking than what we had before.  Now we have an egress interface specified instead of a blank spot.

And our forwarding table is happy once again:

PE1#show mpls forwarding-table
Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
16     Pop tag     3.3.3.3/32        0          Fa0/1      192.168.34.1
17     Untagged    192.168.45.0/24   0          Fa0/1      192.168.34.1
18     17          2.2.2.2/32        0          Fa0/1      192.168.34.1
19     Untagged    159.1.1.0/24[V]   6156       Fa0/0      192.168.11.1

We now have a tag when heading towards 2.2.2.2/32 (PE2).  Since that's where all our traffic goes in this scenario, we're happy campers.
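
To tie the two outputs together: the outer label 17 in {17 19} is the LDP label the P router advertised for 2.2.2.2/32 (it matches both the outgoing tag on PE1 and the P router's own local tag for that prefix), while the inner 19 is the VPN label that was already being imposed before the fix.  A sketch of how to confirm where the inner label comes from, assuming the usual L3VPN behaviour in which PE2 assigns it and advertises it over the VPNv4 session (output omitted, since it wasn't captured here):

```
! In/out VPN labels per prefix in the VRF
PE1#show ip bgp vpnv4 vrf VPN_A labels
```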

** UPDATE ** I found out in a later scenario that "debug ip packet" is useless if the router isn't a PE for the VRF you're trying to debug.  Makes sense; it's not an IP packet, it's an MPLS packet.  The command you want under those circumstances is "debug mpls packet".  Interestingly, there's actually an MPLS access list you can write using the 2700-2799 range, but this isn't supported on 12.4(15)T, so I assume it's not on the lab exam.  I can confirm it does work on 12.4(24)T, however.
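
A minimal sketch of that on the P router, assuming a lab box where per-packet debugging is safe to run.  The full keyword is "debug mpls packets"; the shorter form above works as an abbreviation.

```
P#debug mpls packets
! ...send test traffic from a CE, then turn debugging off again
P#undebug all
```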

Cheers,

Jeff Kronlage

