Saturday, January 5, 2013

Shaping & Policing on Frame Relay

This post will be mostly about FRTS, but will touch on FRTP (P=Policing), as well.  With FRTP being an unlikely topic on the CCIE R&S lab, I'm only going to cover basic scenarios.  That said, we'll start with FRTS.

If there's one primary memory I have about FRTS, it'd be from the last time I deployed it.  We had a pseudo-frame-relay circuit from Verizon.  I say "pseudo", because the backhaul transport was actually MPLS, so the DLCIs really only existed between the PE and CE gear.  But it acted like Frame Relay nonetheless, and had the same shaping issues inherent to hub & spoke frame relay.  The hub had a far larger pipe than the spokes.  We ran VoIP from hub to spoke, and when the data spiked from hub to spoke, it crushed the VoIP, because the LLQ never saw congestion at the hub side.

But that's not the memorable part.

The memorable part was when I was explaining this concept to my coworkers, and I drew "FRTS" on the whiteboard, and one of my coworkers got up and drew a small "a" between "F" and "R" - "FaRTS", and that's still how I pronounce it in my head to this day.

FRTS has three general uses:
1) The line rate is faster than the CIR.  We want our traffic queued up instead of dropped, or we want some traffic guaranteed delivery (VoIP) and other traffic dropped.
2) One side of the DLCI is faster than the other.  When the fast side sends to the slow side, frames will be lost at the provider's frame-relay switch.  This is doubly bad: any traffic you want prioritized may or may not get delivered (my scenario from above), and the service provider will drop your traffic instead of you having the option of queueing it up.
3) FRTS can be used for adaptive shaping.  Adaptive shaping is used in a scenario where the provider allows the customer to burst over the CIR, but doesn't guarantee delivery for traffic over the CIR.  When congestion occurs inside the provider's network, a FECN is sent to the destination, which can then reflect the FECN back to the source as a BECN.  When the source receives the BECN, it can slow down to a preconfigured value, hence the "adaptive" shaping.

Throughout this post I'm going to refer to CIR, PIR, Tc, Bc & Be.  These can be complex terms and mean different things in different scenarios.  I've written a different article regarding these terms; it can be found at:
http://brbccie.blogspot.com/2012/12/a-different-perspective-on-cir-pir-tc_1785.html

There are four different ways to implement FRTS:
1) Generic Traffic Shaping - using the traffic-shape command
2) Legacy FRTS - using the map-class structure
3) Mixed Legacy & MQC - using map-class structures that call policy-maps
4) MQC/GTS - using policy-maps directly
We'll tackle these in the order above.  This also happens to be the chronological order they were developed in.  Each of these methods was developed to address a different need.

Generic Traffic Shaping
This, the earliest method of shaping, can shape on most any interface, not just Frame Relay. 
The usage is relatively basic, and is applied directly to the interface or sub-interface:
R1(config-if)#traffic-shape rate <CIR> <Bc> <Be> <buffer limit>

Buffer limit is the depth of the shaping queue.

It can also take a numbered access list, in the format traffic-shape group <acl-number> <CIR>

GTS has support for slowing down with BECN.  For those unfamiliar with FECN/BECN, when traffic congestion is detected inside the service provider's frame relay network, a FECN is sent away from the source towards the destination.  The destination then should reflect the FECN back to the original source as a BECN.  When the original sender receives the BECN, it should slow down to a predefined rate.

GTS has options both for the reflection of FECNs, and for reacting to BECNs.
To reflect FECNs: traffic-shape fecn-adapt
To slow down in response to BECNs: traffic-shape adaptive <mincir>

The config can be verified with show traffic-shape

The big, obvious detriment to using this old method is that traffic shaping is not per-VC. 

Legacy FRTS
The term "legacy" being attached to a technology is usually a turn-off.  In this case, in order to get the complete feature set, you'll be using Legacy FRTS or the to-be-discussed-later variation of Legacy + MQC.  So "legacy" or not, this is still the method you'll generally want.

This was Cisco's first, and most solid, attempt at shaping for frame-relay.  It was developed specifically for frame relay, unlike GTS above. 

The basic implementation is reasonably simple:

map-class frame-relay frts_102
 frame-relay cir 128000
 frame-relay bc 16000
 frame-relay be 16000

interface Serial0/0
 ip address 172.16.0.1 255.255.255.0
 encapsulation frame-relay
 no fair-queue
 frame-relay traffic-shaping
 frame-relay interface-dlci 102
  class frts_102
 frame-relay interface-dlci 103

Here we've created a map-class (not to be confused with MQC's class-map) "frts_102", and specified CIR of 128K, Bc of 16k, and Be of 16k.  The map-class can do a whole lot more, which we'll see shortly.  We then apply it to DLCI 102 under the serial interface.  But before that takes effect, we need to specify frame-relay traffic-shaping on the interface.  Beware; big pitfall here.  That turns shaping on for every DLCI on the interface.  In this configuration, DLCI 103 just got shaped as well.  The problem with that is DLCI 103 doesn't have a map-class assigned to it, so it gets the default setting of 56k shaping.  Bad news if it was running at 1.5Mbit previously!  So, make sure you have map-classes assigned for every DLCI in advance if you're using this method.

You also may notice the presence of a command if you're poking around the menu:

R1(config-map-class)#frame-relay ?
 <output omitted>
  tc                 Policing Measurement Interval (Tc)

Pay close attention to the word "Policing": you don't set Tc during shaping.  Tc is derived from the Bc setting (Tc = Bc / CIR), as is common to most shaping commands.  We'll talk more about this during policing, below.
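
To put a number on that, here's a quick sketch of the Tc math using the values from the map-class above.  The function name is just something I made up for illustration; the only relationship assumed is Tc = Bc / CIR.

```python
def shaping_interval_ms(cir_bps: int, bc_bits: int) -> float:
    """Tc in milliseconds: the time it takes to send Bc bits at the CIR."""
    return bc_bits / cir_bps * 1000


# Values from map-class frts_102: CIR 128000, Bc 16000
print(shaping_interval_ms(128000, 16000))  # 125.0
```

So with a Bc of 16000 on a 128K CIR, the shaper works in eight 125ms intervals per second.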

Let's take the config from above and add support for slowing down in response to a BECN.

map-class frame-relay frts_102
 frame-relay cir 128000
 frame-relay bc 16000
 frame-relay be 16000
 frame-relay mincir 32000
 frame-relay adaptive-shaping becn

frame-relay adaptive-shaping becn adds support for reducing speed if a BECN is received. 
frame-relay mincir 32000 sets the floor: the shaper will slow down as far as 32K when BECNs are received.  Often, this will be the "real" CIR, with the CIR in frame-relay cir acting more like a PIR. 
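
As a rough model of the slow-down, here's a sketch that assumes the commonly described FRTS behavior: the rate drops by 25% for each interval in which a BECN is seen, and never goes below mincir.  The 25% step and the function name are assumptions on my part, not something shown in the config above, so treat this as an illustration rather than a statement of what IOS does internally.

```python
def rate_after_becns(cir_bps, mincir_bps, becn_intervals, step=0.25):
    """Rate after a number of consecutive intervals that each saw a BECN.

    Assumes a fixed fractional drop per interval (25% by default),
    floored at mincir.
    """
    rate = cir_bps
    for _ in range(becn_intervals):
        rate = max(mincir_bps, rate * (1 - step))
    return rate


# CIR 128000, mincir 32000, as in the map-class above
for n in range(6):
    print(n, rate_after_becns(128000, 32000, n))
```

Under that assumption, it takes five congested intervals to get from 128K down to the 32K floor.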

Additionally, to reflect FECN as BECN, you would:

map-class frame-relay frts_102
  frame-relay cir 128000
  frame-relay bc 16000
  frame-relay be 16000
  frame-relay mincir 32000
  frame-relay adaptive-shaping becn
  frame-relay fecn-adapt

And to go one step further, if you wanted to slow down to mincir just based on how deep your own interface queue is, you could use:

map-class frame-relay frts_102
  frame-relay cir 128000
  frame-relay bc 16000
  frame-relay be 16000
  frame-relay mincir 32000
  frame-relay adaptive-shaping becn
  frame-relay fecn-adapt
  frame-relay adaptive-shaping interface-congestion [queue-depth]

If you wanted per-VC WFQ, instead of per-VC FIFO, you would:

map-class frame-relay frts_102
 frame-relay fair-queue

Looking for the results of this can be kind of tricky.  I've gotten used to "show queueing"...

R1#show queueing
Current fair queue configuration:
  Interface           Discard    Dynamic  Reserved  Link    Priority
                      threshold  queues   queues    queues  queues
  Serial0/1           64         256      0         8       1

Well, OK.  That'd be great, except we're using Serial0/0, not Serial0/1 (which is in shutdown).

R1#show frame-relay pvc 102

PVC Statistics for interface Serial0/0 (Frame Relay DTE)

DLCI = 102, DLCI USAGE = LOCAL, PVC STATUS = STATIC, INTERFACE = Serial0/0
<output omitted>
  Queueing strategy: weighted fair
  Current fair queue configuration:
   Discard     Dynamic      Reserved
   threshold   queue count  queue count
    64          16           0
  Output queue size 0/max total 600/drops 0

There it is...  also of note, you have to specify the PVC.  If you just do "show frame-relay pvc" you get none of those details.

The frame-relay fair-queue command can also be fine-tuned with the CDT (congestive discard threshold), number of dynamic queues, and number of RSVP-reservable queues.

CQ and PQ can also be used as queueing methods.  Their commands are:
frame-relay priority-group <pq_number>
frame-relay custom-queue-list <cq_number>

For a basic PQ prioritizing RTP traffic, use frame-relay ip rtp priority. I will cover this more in detail further below as part of a discussion on link fragmentation & interleaving.

frame-relay holdq <depth> can adjust the depth of FIFO and CBWFQ queues.  FIFO is really the only one applicable for legacy FRTS, as the only way to get a CBWFQ is to use legacy FRTS + MQC, which we'll cover later.

frame-relay interface-queue priority enables FR PIPQ for the DLCI, which is beyond the scope of this document.  Details can be found at:
http://www.cisco.com/en/US/docs/ios/12_1t/12_1t1/feature/guide/dtfrpipq.html

frame-relay traffic-rate is something of a macro; it allows CIR and PIR to be set in one single command.  Cisco defaults are calculated for Tc/Bc/Be.

Mixed Legacy & MQC
In my opinion, this is the best form of FRTS.  It has the most features, and can leverage the feature coverage of legacy FRTS with the fancy queueing & familiarity of the MQC.

There are a lot of possibilities with this mix, so I will cover basic usage, plus highlight some "cookbook" solutions.

The basic usage is to have a map-class call a policy-map.  For example:

map-class frame-relay frts_102
 service-policy output MQC_POLICY

policy-map MQC_POLICY
 class class-default
  shape average 64000 8000 8000
  shape adaptive 32000

interface Serial0/0
 no frame-relay traffic-shaping
 frame-relay interface-dlci 102
  class frts_102

This basic usage is relatively easy to understand.  Note the removal of frame-relay traffic-shaping from the interface; not only is it not required, it's actually incompatible with this method. 

show traffic-shape is no longer used here.  Instead, use the MQC method of show policy-map interface:

R1#show policy-map int
 Serial0/0: DLCI 102 -

  Service-policy output: MQC_POLICY
    Class-map: class-default (match-any)
      0 packets, 0 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
      Match: any
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes)
            64000/64000     2000   8000      8000      125       1000
        Adapt  Queue     Packets   Bytes     Packets   Bytes     Shaping
        Active Depth                         Delayed   Delayed   Active
        BECN   0         0         0         0         0         no
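
The numbers in that table all fall out of the three values given to shape average.  Here's a small sketch of the relationships; they're inferred from the output shown in this post (this one and the 128K example further down), and the function and key names are made up for illustration.

```python
def shaper_table(cir_bps: int, bc_bits: int, be_bits: int) -> dict:
    """Reproduce the derived columns of the Traffic Shaping table."""
    return {
        "byte_limit": (bc_bits + be_bits) // 8,      # Bc + Be, in bytes
        "sustain_bits_per_int": bc_bits,             # Bc
        "excess_bits_per_int": be_bits,              # Be
        "interval_ms": bc_bits * 1000 // cir_bps,    # Tc = Bc / CIR
        "increment_bytes": bc_bits // 8,             # Bc, in bytes
    }


# shape average 64000 8000 8000
print(shaper_table(64000, 8000, 8000))
```

For shape average 64000 8000 8000 this gives a byte limit of 2000, an interval of 125ms and an increment of 1000 bytes, matching the output above.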

This example just barely scratches the surface, however, as many combinations of legacy + MQC are possible here.   For example, the pure MQC method doesn't allow for per-VC fragmentation.  And of course, legacy FRTS doesn't allow for CBWFQ.  You can get both now:

class-map match-all HTTP
 match protocol http

class-map match-all RTP
 match protocol rtp

policy-map MQC_CHILD
 class HTTP
  bandwidth percent 20
 class RTP
  priority percent 50
 class class-default
  fair-queue
  random-detect

policy-map MQC_POLICY
 class class-default
  shape average 64000 8000 8000
  shape adaptive 32000
  service-policy MQC_CHILD

map-class frame-relay frts_102
 frame-relay fragment 600
 service-policy output MQC_POLICY

interface Serial0/0
 frame-relay interface-dlci 102
  class frts_102

Now that's a lot to look at.  Here's the logic:
1) Legacy FRTS map-class frts_102 matches DLCI 102 by attaching to interface-dlci 102.  Fragmentation is enabled here.
2) Policy-map MQC_POLICY is called by map-class frts_102.  Hierarchical shaping is enabled here.
3) Policy-map MQC_CHILD is called by policy-map MQC_POLICY, and specifies a CBWFQ for the DLCI, while being shaped to the parameters in MQC_POLICY.

This method is the only way to accomplish per-DLCI fragmentation, shaping, CBWFQ and LLQ on a single DLCI.  It's also worth noting that per-DLCI fragmentation is a bad idea, but the lab exam isn't about good ideas, it's about knowing all the options.  We'll talk more about fragmentation in a bit.

There are two features specific to Legacy FRTS + MQC that cannot be accomplished with any other method of FRTS.  Those are VATS (Voice Adaptive Traffic Shaping) and VAF (Voice Adaptive Fragmentation). 

The VATS feature decreases the sending rate to mincir whenever there are voice packets in the LLQ.  In this fashion, voice packets can never be marked DE by the service provider, as the sending rate never exceeds the provider's CIR.  However, when no packets are in the LLQ, bursting to PIR is permitted, allowing the extra bandwidth to be utilized.

A sample config for VATS is as follows:

class-map match-all VOICE
 match protocol rtp

policy-map MQCFRTS
 class class-default
  shape peak 256000 12800 12800
  shape adaptive 128000
  shape fr-voice-adapt deactivation 30 
  service-policy LLQ

policy-map LLQ
 class VOICE
  priority 64

map-class frame-relay VATS
 service-policy output MQCFRTS

interface Serial0/0
 frame-relay interface-dlci 102
  class VATS

Three commands are specific to this feature: shape adaptive is necessary to provide mincir, shape fr-voice-adapt turns on VATS, and priority enables LLQ.  The deactivation option after shape fr-voice-adapt specifies how long (in seconds) to wait after the last packet leaves the LLQ before returning the speed to PIR.
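
The logic boils down to a simple decision, sketched below.  This is only a toy model of the behavior described above, not how IOS implements it; the function name, the timestamps, and the choice to return to PIR exactly when the timer expires are all assumptions for illustration.

```python
def vats_rate(now_s, last_voice_s, pir_bps, mincir_bps, deactivation_s=30):
    """Pick the shaping rate based on how recently voice was in the LLQ.

    last_voice_s is the time the last voice packet left the LLQ,
    or None if no voice has been seen at all.
    """
    if last_voice_s is not None and now_s - last_voice_s < deactivation_s:
        return mincir_bps   # voice active (or timer still running)
    return pir_bps          # no recent voice: bursting is allowed


# Voice last seen at t=0, deactivation timer of 30 seconds
print(vats_rate(10, 0, 256000, 128000))  # 128000
print(vats_rate(31, 0, 256000, 128000))  # 256000
```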

VAF uses similar logic; fragmentation is almost useless without VoIP involved, so we may as well only fragment when packets are in the LLQ.

A sample VAF configuration:

class-map match-all VOICE
  match protocol rtp

policy-map MQCFRTS
  class class-default
    shape average 64000
    service-policy LLQ

policy-map LLQ
  class VOICE
    priority 32

map-class frame-relay VAF
  service-policy output MQCFRTS

interface Serial0/0
  frame-relay fragmentation voice-adaptive deactivation 50
  frame-relay fragment 480 end-to-end
  frame-relay interface-dlci 102
    class VAF

One last item of note: the Legacy FRTS + MQC method can be applied per VC without impacting the entire interface.  Certainly a bonus!

MQC/GTS
The final, newest, and surprisingly feature-free method involves creating a policy-map and applying it directly to the interface.  This differs from "regular" Generic Traffic Shaping in that you match DLCIs rather than shaping the entire interface.  This is obviously Cisco's attempt to make a more modern version of FRTS using the MQC methods we're used to, but it comes across as a half-baked attempt.

class-map match-all DLCI_102
 match fr-dlci 102

class-map match-all DLCI_103
 match fr-dlci 103

policy-map MQCGTS
 class DLCI_102
  shape average 64000 8000 8000
 class DLCI_103
  shape average 128000 16000 16000

interface Serial0/0
 service-policy output MQCGTS

There are some advantages, but nothing unique:
- Fancy queueing: CBWFQ, LLQ, etc., can all be called as child policies, the same as we did with Legacy/MQC above
- The format is more familiar if you're not used to creating map-classes
- Individual DLCIs can be matched; the entire interface need not be impacted

There are many disadvantages:
- The shape adaptive command looks like it's available from the context help, but it does nothing if you apply it.  There's no way to respond to a BECN
- VAF and VATS can't be used
- Fragmentation must be enabled at the interface level

Now that we've seen the four methods, I want to revisit link fragmentation & interleaving.  I originally planned on showing this implementation with each FRTS method, but as I compiled my notes, I realized it was a big enough topic to stand on its own.

The point of fragmentation is to guarantee that no one packet will take longer than 10ms to serialize.  By keeping the time down, a "big fat packet", from a protocol such as FTP, can't get in the way of a little priority packet such as VoIP.  Queuing solves part of the problem by escorting priority packets to the front of the line, but that doesn't help if a large packet is already being serialized (put on the wire) just before the little priority packet arrives.  On very low speed links, the time it takes to put 1500 bytes on a wire can be long enough to create a problem with VoIP. 

T1s and faster generally do not need to be fragmented, as they can serialize 1500 byte packets in less than 10ms already.  Remember when calculating if you need fragmentation to base it on line rate, not CIR!  We don't care how fast the service provider wants us to go over a second; how fast the wire can actually perform is what to measure off of. 

Imagine a 128k frame relay link with mixed G.729 RTP/VoIP and FTP packets.  A 128k link can serialize 128,000 bits per second.  If you're using an MTU of 1500 bytes, every packet could be as large as 12,000 bits.  12000/128000 = ~.094, or 94ms.  If we want RTP packets to be stalled no longer than 10ms, that packet needs to be roughly 1/10th the size it is now.  In this scenario, we'd fragment to 150 bytes (1200 bits), which changes our formula to 1200/128000 = ~.009, or 9ms.

After the packet gets chopped up into 10 little fragments, the first one or two may have gone through when our little priority packet arrives, and this priority packet can be sent no longer than 9ms after arrival.  So to reiterate, let's say fragments 1 and 2 of our big fat packet have gone through, and 3 is just starting to serialize.  Little priority packet arrives, is prioritized, waits the 9ms, and transmits.  After that, the remaining fragments, 4-10, are theoretically able to be transmitted (provided no more priority packets arrive in the interim).  This method of inserting priority packets in between fragments of larger non-priority packets is called interleaving.  The term for this process is LFI, or Link Fragmentation & Interleaving.
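
Here's the same arithmetic as a small sketch, in case you want to plug in your own line rates.  The function names are made up for illustration, the 10ms target is the one used in this post, and the 1544000 figure is just the nominal T1 line rate.

```python
def serialization_delay_ms(size_bytes: int, line_rate_bps: int) -> float:
    """Time to put one packet (or fragment) on the wire, in milliseconds."""
    return size_bytes * 8 / line_rate_bps * 1000


def max_fragment_bytes(line_rate_bps: int, target_ms: float = 10) -> int:
    """Largest fragment that still serializes within the target delay."""
    return int(line_rate_bps * target_ms / 1000 / 8)


print(serialization_delay_ms(1500, 128000))   # roughly 94 ms, far too slow
print(serialization_delay_ms(150, 128000))    # roughly 9.4 ms
print(max_fragment_bytes(128000))             # 160
print(serialization_delay_ms(1500, 1544000))  # roughly 7.8 ms on a T1
```

The strict ceiling for 10ms at 128k comes out to 160 bytes; the 150 used here just rounds that down to a convenient tenth of the MTU.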

We've looked at some options above that allowed per-VC LFI.  There's really not much point in this in a production environment.  The physical interface queue is what would become congested, and if you're only fragmenting one VC, and you have four others that are not fragmented, those other four can just as easily send a big fat FTP packet through and choke your fragmented VC.

So here's how you accomplish this feature on each of the above methods:
Generic Traffic Shaping -

interface Serial0/0
 frame-relay fragment 150 end-to-end
 traffic-shape rate 128000

When using this method, you have to enable fragmentation on the entire interface, as shown above.  That's not necessarily a bad thing; this should be done anyway.

Verification:

R1#show frame-relay frag
interface                dlci frag-type  size in-frag    out-frag   dropped-frag
Se0/0                    102  end-to-end 150  0          0          0
Se0/0                    103  end-to-end 150  0          0          0

Note both DLCIs are being fragmented.

Legacy FRTS -

map-class frame-relay fragclass
 frame-relay fragment 150
 frame-relay ip rtp priority 16384 16383 96
 frame-relay fair-queue

There's quite a bit going on here behind the scenes.  frame-relay fragment 150 only enables fragmentation, not LFI.  In order to achieve LFI, we need some sort of PQ or LLQ to tell the process what to interleave.   The frame-relay fragment command also creates a limited PQ - high & low queues.  The command frame-relay ip rtp priority 16384 16383 96 populates the high queue.  Otherwise, everything would end up in low WFQ, and while fragmentation would be taking place, interleaving would not.  Note that frame-relay fair-queue above is a mandatory command inserted by IOS when you enable fragmentation.

Verification can happen with:
R1#show frame-relay fragment
interface                dlci frag-type  size in-frag    out-frag   dropped-frag
Se0/0                    102  end-to-end 150  0          0          0
R1#show frame-relay pvc 102 | i fragment | priority
  fragment type end-to-end fragment size 150
  ip rtp priority parameters 16384 32767 96000

Mixed Legacy & MQC -

class-map match-all voice
 match protocol rtp

map-class frame-relay fragclass
 frame-relay fragment 150
 service-policy output shaper

policy-map shaper
 class class-default
  shape average 128000 16000 16000
  service-policy LLQ

policy-map LLQ
 class voice
  priority 96

This is similar to Legacy FRTS above, however, now the LLQ is serving the role that frame-relay ip rtp priority was previously.

show frame-relay fragment works as it did before, however, the real action is in show policy-map interface now:

R1# show policy-map interface
 Serial0/0: DLCI 102 -

  Service-policy output: shaper

    Class-map: class-default (match-any)
      0 packets, 0 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
      Match: any
      Traffic Shaping
           Target/Average   Byte   Sustain   Excess    Interval  Increment
             Rate           Limit  bits/int  bits/int  (ms)      (bytes)
           128000/128000    4000   16000     16000     125       2000
        Adapt  Queue     Packets   Bytes     Packets   Bytes     Shaping
        Active Depth                         Delayed   Delayed   Active
        -      0         0         0         0         0         no

      Service-policy : LLQ
        Class-map: voice (match-all)
          0 packets, 0 bytes
          5 minute offered rate 0 bps, drop rate 0 bps
          Match: protocol rtp
   <output truncated>

MQC/GTS -

interface Serial0/0
  frame-relay fragment 150 end-to-end
  service-policy output MQCGTS

I'm not going to show the policy-map for MQCGTS because it's basically irrelevant.  Similar to GTS, fragmentation needs to be enabled for the entire interface.

Now let's touch on FRTP - Frame Relay Traffic Policing.

When I first learned how to use a Cisco router as a frame-relay switch, I used this method, which you've probably seen:

frame-relay switching

Interface Serial0/1
 encapsulation frame-relay
 clockrate 128000
 frame-relay intf-type dce
 frame-relay route 102 interface Serial0/2 201

Interface Serial0/2
 <omitted for brevity>
 frame-relay route 201 interface Serial0/1 102

This is now considered the legacy method of configuring frame-relay switching, and it doesn't support FRTP. 

The modern method is as follows:

frame-relay switching

Interface Serial0/1
 encapsulation frame-relay
 clockrate 128000
 frame-relay intf-type dce
 frame-relay interface-dlci 102 switched

Interface Serial0/2
 <omitted for brevity>
 frame-relay interface-dlci 201 switched

connect one_to_two Serial0/1 102 Serial0/2 201

To add policing to this, you just make a map-class:

map-class frame-relay policer
 frame-relay cir in 64000
 frame-relay cir out 56000
 frame-relay bc in 8000
 frame-relay bc out 7000
 frame-relay be in 8000
 frame-relay be out 56000

The "in" commands are the ones I actually typed; the "out" commands get added by IOS.  The "in" commands are for policing, and the "out" commands are for shaping.

Then you apply the map-class to the DLCI:

interface Serial0/0
 frame-relay interface-dlci 102 switched
  class policer

Then you enable policing at the interface level:

interface Serial0/0
 frame-relay policing

I'm sure you noticed above that IOS added a great deal of shaping commands even though we don't have shaping enabled.  Strangely enough, you can police inbound and shape outbound on the same interface at the same time.  I'm having a hard time coming up with a reason for this, as policing is inherently a service provider mechanism, and shaping is almost always used by the customer.  Perhaps a scenario where the SP needed to police inbound for obvious reasons, but the customer had a slow router that couldn't receive traffic at line rate, so they asked the SP to slow it down?  Strange.

I've gone ahead and enabled both:

interface Serial0/0
 frame-relay policing
 frame-relay traffic-shaping

The command to verify is:

R3#show frame-relay pvc 100 | b shaping
  shaping Q full 203660    pkt above DE 0           policing drop 340533
  pvc create time 00:38:50, last time pvc status changed 00:27:12
  policing enabled, 7443 pkts marked DE
  policing Bc  8000        policing Be  8000        policing Tc  125 (msec)
  in Bc pkts   19233       in Be pkts   7443        in xs pkts   340533
  in Bc bytes  2140632     in Be bytes  1177072     in xs bytes  199110132
  cir 56000     bc 7000      be 0         byte limit 875    interval 125
  mincir 28000     byte increment 875   Adaptive Shaping none
  pkts 11236     bytes 1168544   pkts delayed 9553      bytes delayed 993512
  shaping inactive
  traffic shaping drops 0
  Queueing strategy: fifo
  Output queue 0/40, 203660 drop, 9553 dequeued

Last topic: I mentioned above that frame-relay tc <Tc> is used for policing.  It's used when you want to give the customer zero CIR, where all traffic is considered exceeding.  Normally, Tc is derived from the CIR and Bc values.  If you're not providing CIR or Bc, Tc has to be set by hand, and the PIR then works out to Be multiplied by the number of Tcs per second.  Configuration might look something like this:

map-class frame-relay policer
 frame-relay cir in 0
 frame-relay bc in 0
 frame-relay be in 8000
 frame-relay tc 125

That would produce a CIR of 0, with a PIR of 64k -- 8000 x 8 intervals (1000 ms / 125 ms) = 64,000.
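
The same calculation as a quick sketch, assuming nothing more than PIR = Be x (1000 / Tc).  The function name is made up for illustration.

```python
def zero_cir_pir_bps(be_bits: int, tc_ms: int) -> int:
    """PIR for a zero-CIR policer: Be bits allowed in each Tc interval."""
    return be_bits * 1000 // tc_ms


# frame-relay be in 8000, frame-relay tc 125
print(zero_cir_pir_bps(8000, 125))  # 64000
```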

Hope you enjoyed....

Jeff
