Friday, November 29, 2013

[mini] PPPoE in the DocCD

I ran across a PPPoE problem a couple days ago, and let me tell you, this is not my favorite topic.  I've only used it in production once, and I don't come across it in practice labs enough to keep it fresh in my mind. I've been skipping these questions when doing time-trial practice labs and just using traditional Ethernet whenever this was called for, and just taking a hit on the points.  Not a good plan, but I felt there were more important things to focus on.

One of the other reasons I haven't wanted to focus on it, knowing that I only see it once in a blue moon, is that the documentation is so spread out I could never figure out where all the various pieces are.  The lab questions always call for server and client installs, and they're on different pages, and spread out across those two pages.

I decided a good interim step on this problem is to nutshell exactly where the pieces are in the documentation.

First, you want the Broadband Access Aggregation and DSL Configuration Guide.  It's on the main "Configuration" page for 12.4T that you've been going to in the DocCD.  See below.

The next page has a lot of options on it. Fortunately we only need two of them, and they're right on top of each other:

- PPPoE "server" is on "Providing Protocol Support for Broadband Access Aggregation of PPPoE Sessions".
- PPPoE client is on "PPP over Ethernet Client".

We'll start on the server side first.  "R1" will be our server router.  I'm not providing a diagram; there are just two devices involved, connected via Fa0/0.

You need three sections on the "Providing Protocol Support for Broadband Access Aggregation of PPPoE Sessions" page:

- "Configuring a Virtual Template Interface"
- "Defining a PPPoE Profile"
- "Assigning a PPPoE Profile to an Ethernet Interface"

I put them in the order I felt they should be done in, so let's start with "Configuring a Virtual Template Interface".  Frankly, if you don't know how to do this, it's worth memorizing.  It comes up in more places than just PPPoE (PPP over Frame Relay, namely).

Let's apply the necessary pieces as we walk through this:

R1(config)#interface virtual-template 1
R1(config-if)#ip address  ! you don't actually have to use IP unnumbered
R1(config-if)#mtu 1492   ! not really a requirement but a really good idea
R1(config-if)#peer default ip address dhcp-pool TEST-POOL

To be fair, the "peer default" bit for assigning IP addresses to clients isn't actually in the above documentation snippet, but it is elsewhere on the page if you search for it.  It's also not a requirement; you could assign IPs statically.
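On the mtu 1492 line above: the number comes from taking the standard 1500-byte Ethernet payload and subtracting PPPoE's 8 bytes of overhead (a 6-byte PPPoE header plus the 2-byte PPP protocol field). A quick sanity check of that arithmetic, sketched in Python; the constant and function names are mine, not anything from IOS.

```python
ETHERNET_PAYLOAD = 1500   # standard Ethernet MTU, in bytes
PPPOE_HEADER = 6          # version/type, code, session ID, length
PPP_PROTOCOL_FIELD = 2    # PPP protocol ID carried inside the PPPoE payload


def pppoe_mtu(ethernet_payload=ETHERNET_PAYLOAD):
    """Largest IP packet that fits in one PPPoE frame without fragmenting."""
    return ethernet_payload - PPPOE_HEADER - PPP_PROTOCOL_FIELD


print(pppoe_mtu())  # 1492
```

That's why the same 1492 shows up on both the virtual-template and the client's dialer interface.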

Next step -

R1(config-if)#bba-group pppoe global
R1(config-bba-group)# virtual-template 1

Yep, that's all you really must have to get the bba-group working.  Now let's assign it to an interface.

R1(config)#interface fa0/0
R1(config-if)#pppoe enable
R1(config-if)#no shut

The pppoe enable command will expand to pppoe enable group global on its own, if you do a "show run".

We did reference a DHCP pool up above; we'll need to create that.

R1(config)#ip dhcp pool TEST-POOL

That's all - now for the client side.  As we saw earlier (same image repeated from above), the client side is directly underneath the "server" side.

Once you're in there, there are once again many options; however, the two you need are pretty easy to spot.  Note carefully that we are on the "12.2(13)T 12.4T and Later Releases" section.  There's one just above this for pre-12.2(13)T.

Configuring the dialer interface first makes more sense, so we'll start there:

R2(config)#int dialer 1
R2(config-if)#mtu 1492
R2(config-if)#encapsulation ppp
R2(config-if)#ip address negotiated
R2(config-if)#dialer pool 1

R2(config-if)#interface fa0/0   ! pppoe-client goes on the Ethernet interface, not the dialer
R2(config-if)#pppoe-client dial-pool-number 1
R2(config-if)#no shut

That's it - if you did it correctly, you should get output something like this on your client:

*Mar  1 00:28:51.103: %DIALER-6-BIND: Interface Vi1 bound to profile Di1
*Mar  1 00:28:51.191: %LINK-3-UPDOWN: Interface Virtual-Access1, changed state to up
*Mar  1 00:28:52.235: %LINEPROTO-5-UPDOWN: Line protocol on Interface Virtual-Access1, changed state to up

R2(config-if)#do sh ip int dialer1 | i Internet address
  Internet address is

R2(config-if)#do ping
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to, timeout is 2 seconds:
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/23/36 ms



Sunday, November 10, 2013

[mini] Embarrassing BGP as-override misunderstanding

It can be hard to post on the Internet about dramatically misunderstanding a technology. 

In my defense, I've never worked for an MPLS provider, so I've never used as-override outside of a lab - actually I'm not sure I've ever used it in a lab before tonight, either.

For those unfamiliar with the basic idea, as-override is used in MP-BGP/VRF/MPLS scenarios where the customer wants to re-use an AS number on several sites.  Since the CE routers see the traffic from the PE routers as eBGP, they see their own AS number in the path and reject the update from the PE.  as-override is the PE mechanism to overcome this problem.

Let's take a four-router scenario - two CE routers and two PE.

It might look something like this:

CE1 (AS 100) -> PE1 (AS 250) -> PE2 (AS 250) -> CE2 (AS 100)

Clearly, when PE2 advertises CE1's routes to CE2, CE2 should reject them.

Fixing this on the CE side is very easy; you can change the AS number or use allowas-in to allow the CE to ignore the fact that its own AS number is present while receiving BGP updates.
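To make the loop check concrete, here's a minimal sketch in Python of what the CE is doing when it decides whether to keep an update. The function name and the way allowas-in is modelled (as a count of tolerated occurrences of the local AS) are my own simplification, not IOS internals.

```python
def accepts_update(as_path, local_as, allowas_in=0):
    """Return True if an eBGP speaker in local_as would accept this AS path.

    allowas_in is how many occurrences of local_as are tolerated;
    0 models the default loop check.
    """
    return as_path.count(local_as) <= allowas_in


# CE2 (AS 100) receives CE1's route from PE2 (AS 250): the path is 250, then 100.
path_seen_by_ce2 = [250, 100]
print(accepts_update(path_seen_by_ce2, local_as=100))                # False
print(accepts_update(path_seen_by_ce2, local_as=100, allowas_in=1))  # True
```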

As a network consultant I regularly deal with MPLS site activations, and twice now the carrier has offered to use as-override to fix the problem above.  I declined both times: once I changed the AS number on the CE, and once I used allowas-in.  Because the carrier technician was signed into the PE connected to my CE, I'd gotten the idea that this PE was the only place the as-override would go.  Boy, was I wrong.

I spent about 90 minutes this evening trying to get as-override working in the scenario described above.  CE1 would send AS 100 to PE1.  PE1 was configured with as-override facing CE1, and what I expected to happen was for PE1 to strip out AS 100 on its way to PE2.  Incorrect!

I'd repeatedly pull up PE2's BGP table:

PE2#sh ip bgp vpnv4 vrf CCIE | s
*>i1.1.1.1/32             0    100      0 100 I

BGP output doesn't paste the best into a non-monospaced document, but in short, it shows the prefix is still learned from AS 100 (the other "100" adjacent to that is the local preference).  I sat there scratching my head, wondering how CE2 was going to be able to learn this (quick answer - it can't).

It turns out as-override is not an ingress setting at all.  It's an egress setting.  All it does is tell the PE it's configured on that, when it's passing routes to a CE, it should find the CE's AS number in the AS path and replace it with the local PE's AS number.

In other words, in our scenario:

CE1 (AS 100) -> PE1 (AS 250) -> PE2 (AS 250) -> CE2 (AS 100)

If I were to set as-override on PE1, that would enable CE1 to receive CE2's routes - not vice-versa.

CE1(config)#do sh ip bgp | i
*>                           0 250 250 I

We see that CE1 now sees CE2's loopback as going through AS 250 twice, instead of AS 250 followed by AS 100.
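The find-and-replace behavior can be sketched in a few lines of Python. This is a toy model under my own naming, assuming the PE rewrites the path first and then prepends its own AS as it would on any eBGP advertisement; it's not how IOS implements it.

```python
def advertise_to_ce(as_path, pe_as, ce_as, as_override=False):
    """AS path a PE sends to a CE over eBGP.

    With as_override, every occurrence of the CE's AS is replaced by the
    PE's AS before the PE prepends its own AS as usual.
    """
    if as_override:
        as_path = [pe_as if asn == ce_as else asn for asn in as_path]
    return [pe_as] + as_path


# CE2's loopback arrives at PE1 (via PE2 and iBGP) with AS path [100].
print(advertise_to_ce([100], pe_as=250, ce_as=100))                    # [250, 100]
print(advertise_to_ce([100], pe_as=250, ce_as=100, as_override=True))  # [250, 250]
```

The second result is the "250 250" path in CE1's table; the first is the one CE1 would have thrown away.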

Thought this might help others out there stuck on a similar misunderstanding.



Thursday, November 7, 2013

[mini] Why does LDP "require" a /32 Loopback?

A few days ago I asked a coworker why LDP sessions had issues if they weren't peered on /32s.  He answered that it doesn't have to be a /32, but the IGP and LDP have to agree on the mask length.  So I asked the more specific question - why do they have to agree on the mask length? He didn't know.  And neither did I.

Everyone seems to know that /32s are best practice for the LDP router ID.  But it's hard to find a good, clear explanation of why this is.

Let's start with some obvious facts.

- "The router considers all the IP addresses of all operational interfaces.... If these addresses include loopback interface addresses, the router selects the largest loopback address."

As always, my posts are geared for the CCIE lab, and it's a fair bet most of your gear on the lab is going to have a loopback.  So, expect the router ID to be a loopback, unless it's specified otherwise.

- You can specify the interface with mpls ldp router-id <interface>.  If you don't want it to be a loopback, or you want a certain loopback to be chosen over another, then use this command. If you want to change the router-id while LDP is already up you have to use the force keyword, i.e. mpls ldp router-id lo7 force.  If you don't use force, and LDP was already online, the change doesn't take effect until the router has to pick a router ID again (for example, when the interface supplying the current one goes down); a reboot will do it, but it's the blunt way.

- You can set the range of labels that LDP is allowed to use with mpls label range <lower> <upper>.  I find this useful in debugging, because you can make your labels match your router number and it's easier to read the output.  LDP show commands are not always easy to interpret if you're not used to reading them.

-  "The LDP default behavior is to allocate local labels for all non-BGP prefixes."

So what's that mean to us?  It might be better phrased as "The LDP default behavior is to allocate local labels choosing the best administrative distance as long as it's not from BGP".

- This problem is most commonly seen with OSPF (although you could see it from a summary route as well).  The sure-fire way to demonstrate it is to create a /24 loopback and not change the default network type.  OSPF automatically uses network type LOOPBACK, which is always advertised as a /32.

- With MPLS VPNs, BGP actually distributes the labels for the VRFs, not LDP.  You learn the stacked VRF tag, relevant only to the egress PE, from BGP.  You also learn the global routing table's next hop.  The next-hop is used to find out the LDP label.
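That last point, two labels from two different protocols, is easier to see as a pair of lookups. Here's a rough sketch in Python; the addresses are hypothetical documentation prefixes I made up for illustration, the label values are the ones this lab ends up using, and the fixed "/32" lookup is a simplification.

```python
# Hypothetical addresses; the label values match the ones used in this lab.
VPN_ROUTES = {  # learned via MP-BGP: VRF prefix -> (BGP next hop, VPN label)
    "192.0.2.1/32": ("198.51.100.1", 103),
}
LDP_LABELS = {  # learned via LDP: global prefix -> outgoing transport label
    "198.51.100.1/32": 200,
}


def label_stack(vrf_prefix):
    """Return [transport label, VPN label] for a VRF prefix, top of stack first."""
    next_hop, vpn_label = VPN_ROUTES[vrf_prefix]
    # Simplification: assume the next hop is known in the global table as a /32.
    transport_label = LDP_LABELS[next_hop + "/32"]
    return [transport_label, vpn_label]


print(label_stack("192.0.2.1/32"))  # [200, 103]
```

If the LDP side of that lookup breaks anywhere along the path, the VPN label never gets a chance to matter.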

Let's take a look at how this plays out.

R3 is trying to reach R1 in VRF CCIE.  R3's IP address is and R1's IP address is  R2 is sitting in the middle of the two.

R3#ping vrf CCIE
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to, timeout is 2 seconds:
Success rate is 0 percent (0/5)

As we can see, ping is failing.

R3#sh ip route vrf CCIE
Routing entry for
  Known via "bgp 100", distance 200, metric 0, type internal
  Last update from 00:26:04 ago
  Routing Descriptor Blocks:
  * (Default-IP-Routing-Table), from, 00:26:04 ago
      Route metric is 0, traffic share count is 1
      AS Hops 0

We have a route to reach it.

R3#show ip cef vrf CCIE, version 3, epoch 0, cached adjacency
0 packets, 0 bytes
  tag information set
    local tag: VPN-route-head
    fast tag rewrite with Fa0/0,, tags imposed: {200 103}
  via, 0 dependencies, recursive
    next hop, FastEthernet0/0 via
    valid cached adjacency
    tag rewrite with Fa0/0,, tags imposed: {200 103}

I used the mpls label range command (mentioned above) so that each router's tags start with its own router number.  In this case, we should be using an MPLS "transit" tag of 200, and an MPLS "VRF" tag of 103.

R3#show mpls ldp bindings | b
  tib entry:, rev 6
        local binding:  tag: 300
        remote binding: tsr:, tag: 200
<output omitted>

We know that tag 200 references R1's primary routing table loopback IP (

R3#show mpls forwarding-table
Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
300    200      0          Fa0/0

We know that means sending traffic out Fa0/0 towards R2 ( with tag 200.

Ok, so this router should be able to send traffic, right?

R2#debug mpls packet
MPLS packet debugging is on
R3#ping vrf CCIE rep 2 timeout 1
Type escape sequence to abort.
Sending 2, 100-byte ICMP Echos to, timeout is 1 seconds:
Success rate is 0 percent (0/2)

*Mar  1 00:36:08.651: MPLS: Fa0/1: recvd: CoS=6, TTL=255, Label(s)=0
*Mar  1 00:36:09.067: MPLS: Fa0/1: recvd: CoS=6, TTL=255, Label(s)=0

R2 gets the MPLS packet just fine!  And that's all it does.  Notice my debug doesn't say anything about forwarding it on.

R2#show mpls ldp binding | b
  tib entry:, rev 10
        local binding:  tag: 200
        remote binding: tsr:, tag: 300
<output omitted>

We see R2 has locally bound tag 200 for, and has received a tag from R3 for, but ... no tag from R1?

Let's look at the routing tables.

R2#sh ip route
Routing entry for
  Known via "ospf 1", distance 110, metric 2, type intra area
  Last update from on FastEthernet0/0, 00:00:02 ago
  Routing Descriptor Blocks:
  *, from, 00:00:02 ago, via FastEthernet0/0
      Route metric is 2, traffic share count is 1

R2 sees this as a /32.

R3#sh ip route
Routing entry for
  Known via "ospf 1", distance 110, metric 3, type intra area
  Last update from on FastEthernet0/0, 00:39:16 ago
  Routing Descriptor Blocks:
  *, from, 00:39:16 ago, via FastEthernet0/0
      Route metric is 3, traffic share count is 1

R3 sees this as a /32.  Consequently, R3 has no problem sending the MPLS packet to R2.

R1#sh ip route
Routing entry for
  Known via "connected", distance 0, metric 0 (connected, via interface)
  Routing Descriptor Blocks:
  * directly connected, via Loopback0
      Route metric is 0, traffic share count is 1

And R1 sees it as a ... /24 connected route.  As mentioned above, OSPF is the common culprit here. It's advertising a /32 to everyone else, except the local router, which still sees it as a /24.  In fact...

R2#sh mpls ldp binding | b
  tib entry:, rev 11
        remote binding: tsr:, tag: exp-null
<output omitted>

R1 is advertising a /24 to R2.  MPLS bindings work a bit differently than the routing table.  R2's LDP process isn't simply going to choose the best route to R1; it's matching labels to prefixes, and two prefixes are considered different unless both the address and the mask length are identical.  So R2 just drops the packet, as it has no outgoing binding for the /32.
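The exact-match behavior is the whole story here, so here's a small sketch of it in Python. The addresses are hypothetical (documentation prefixes of my choosing), and a dict lookup stands in for the label binding table; it's an illustration of the matching rule, not of how LDP is implemented.

```python
import ipaddress

# Hypothetical addressing: R1's loopback is configured as a /24,
# but OSPF (network type LOOPBACK) advertises it as a /32.
route_on_r2 = ipaddress.ip_network("192.0.2.1/32")  # what R2's routing table holds
bindings_from_r1 = {ipaddress.ip_network("192.0.2.0/24"): "exp-null"}  # what R1 advertised


def outgoing_label(route, remote_bindings):
    """Exact-match lookup: prefix AND length must be identical, no longest-match."""
    return remote_bindings.get(route)


print(outgoing_label(route_on_r2, bindings_from_r1))  # None

# When the route and the binding agree on the mask length, the lookup succeeds.
matching_route = ipaddress.ip_network("192.0.2.0/24")
print(outgoing_label(matching_route, bindings_from_r1))  # exp-null
```

A routing lookup would happily use the /24 to reach an address inside it; the label lookup won't, and that's the difference that bites.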

The fix is to just make the two prefix lengths the same. They don't need to be /32s!  The easiest way to make this happen in this scenario is to change the OSPF network type away from LOOPBACK and stop forcing the /32 advertisement:

R1(config)#int lo0
R1(config-if)#ip ospf network point-to-point

R2#sh mpls ldp binding | b
  tib entry:, rev 16
        local binding:  tag: 203
        remote binding: tsr:, tag: exp-null
        remote binding: tsr:, tag: 305
<output omitted>

We can see R2 now has bindings from both R1 and R3 for the same prefix length.

R3#ping vrf CCIE
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to, timeout is 2 seconds:
Success rate is 100 percent (5/5), round-trip min/avg/max = 60/66/76 ms

And forwarding works end-to-end.

In a nutshell: LDP associates labels with both the IP address and subnet mask.  The prefix length does have to match to become part of the same MPLS forwarding path.  However, the prefix length does not have to be /32 - it's just a good, safe practice.