Sunday, June 30, 2013

Netflow

This post will be geared towards CCIE lab topics.  I will use Solarwinds' freebie Netflow analyzer in some examples, but the topics, in general, will be geared towards exporting data, not towards collecting it.

Let's kick off with a discussion of versions.  Anyone who's used Netflow before knows version 5 is the one typically used, with some newer implementations using version 9.  So what's the story on all the "lost versions"?

v1 - First implementation, still supported on 12.4T, restricted to IPv4 without subnet masks.
v2-v4 - Internal Cisco versions, never released
v5 - Most commonly used version, IPv4-only, supports subnet masks
v6 - I couldn't find any information at all
v7 - Extension of v5, reportedly used on some Catalyst switches
v8 - First version to support aggregation.  v8's improvements made it into v9
v9 - Latest Cisco standard, supports IPv6, aggregation, and Flexible Netflow (FNF).
"v10" - aka IPFIX, this is the open standard for Netflow and will presumably replace it eventually.  It's called "v10" because the version field in the IPFIX packet header is set to 10; it's essentially an open-standard implementation of v9.
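Since every version shares the same leading header field, a collector can tell them apart from the first two bytes of the export packet. A quick Python sketch (my own illustration, using the published v5 and IPFIX header layouts):

```python
import struct

# Every Netflow/IPFIX export packet starts with a two-byte,
# big-endian version field, which is how a collector tells
# v5, v9, and IPFIX ("v10") apart.
def export_version(packet: bytes) -> int:
    (version,) = struct.unpack_from("!H", packet, 0)
    return version

# A v5 export header is 24 bytes: version, count, sysUptime,
# unixSecs, unixNsecs, flowSequence, engine type/id, sampling.
v5_header = struct.pack("!HHIIIIBBH", 5, 0, 0, 0, 0, 0, 0, 0, 0)
print(export_version(v5_header))       # 5

# IPFIX reuses the same leading field, set to 10 -- hence "v10".
ipfix_header = struct.pack("!HHIII", 10, 16, 0, 0, 0)
print(export_version(ipfix_header))    # 10
```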

We will be focusing primarily on v5 and v9, and touching a little bit on v8.  There's no good argument for using v1, and IOS 12.4(15)T only supports v1, v5, v8 (limited) and v9.  IPFIX/v10 isn't available in 12.4(15)T.  Fortunately - or perhaps unfortunately for those who are looking at this document for reasons other than academic ones - the Catalyst 3560 that is on the lab exam doesn't support Netflow at all, so we won't be covering Catalyst Netflow.  Of note, more modern 3560s, such as the 3560-X, do support Netflow.

If you want to know more about the various Netflow versions, here is a fantastic explanation:
http://www.youtube.com/watch?v=rcDQi7M1uo4

At a high-level, here is how Netflow works:
- "Flows" are identified by the router doing the exporting.  Prior to v9, packets are considered part of the same flow if they share the same source IP, source port, destination IP, destination port, protocol (TCP, UDP, ICMP, etc), ToS byte, and input interface.  If those all match, they're the same flow.
- Flows are collected to the Netflow cache on the router
- A flow is expired from the cache when its duration exceeds the active timeout, when it explicitly terminates (TCP FIN/RST flag), or when no packets have been seen for it within the inactive timeout. Its data is then packaged up, along with other expired flows, and sent to the Netflow collector. The default timeouts are 30 minutes for active flows, and 15 seconds for inactive flows.
- The Netflow collector collects, and then presents, the data in whatever format you choose.
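As a toy model (my own sketch, not Cisco's implementation), the cache-and-expire behavior above looks something like this:

```python
# Toy model of the router-side flow cache: packets are keyed on the
# classic tuple, and flows are expired on the active or inactive timeout.
ACTIVE_TIMEOUT = 30 * 60   # default: 30 minutes
INACTIVE_TIMEOUT = 15      # default: 15 seconds

cache = {}

def account(src, sport, dst, dport, proto, tos, in_if, nbytes, now):
    key = (src, sport, dst, dport, proto, tos, in_if)
    flow = cache.setdefault(key, {"first": now, "last": now, "bytes": 0, "pkts": 0})
    flow["last"] = now
    flow["bytes"] += nbytes
    flow["pkts"] += 1

def expire(now):
    """Pop flows whose active or inactive timer has run out; these
    are what gets packaged up and sent to the collector."""
    done = [k for k, f in cache.items()
            if now - f["first"] >= ACTIVE_TIMEOUT
            or now - f["last"] >= INACTIVE_TIMEOUT]
    return [(k, cache.pop(k)) for k in done]

# Two packets of the same conversation land in one cache entry...
account("7.7.7.7", 19, "8.8.8.8", 51405, 6, 0, "Fa0/1", 100, now=0)
account("7.7.7.7", 19, "8.8.8.8", 51405, 6, 0, "Fa0/1", 200, now=5)
print(len(cache))            # 1
# ...and the flow is exported once it's been idle 15 seconds.
print(len(expire(now=25)))   # 1
```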

On a side note, I mentioned above that the protocol is determined on a high-level by protocol number: TCP, UDP, ICMP, etc.  In newer versions of IOS (15.0+), NBAR can be integrated into Netflow for more granular protocol results.  As that is presently outside the scope of the CCIE lab, I will not be discussing it here.

Let's look at some basic Netflow v5 usage.  Here is our lab topology:


R7 (Lo0 7.7.7.7) and R8 (Lo0 8.8.8.8) will be communicating with each other, with R1 running Netflow, and exporting to the Windows XP VM running Solarwinds Free Real-Time Netflow Analyzer:
http://www.solarwinds.com/products/freetools/netflow-analyzer.aspx

We'll enable TCP small servers so that we can utilize chargen to create TCP flows.
R7(config)#service tcp-small-servers
R8(config)#service tcp-small-servers

R7#telnet 8.8.8.8 19 /source-interface lo0
Trying 8.8.8.8, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
<ctrl-c>

R8#telnet 7.7.7.7 19 /source-int lo0
Trying 7.7.7.7, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
<ctrl-c>

Even after terminating the output with ctrl-c, the session is still running in the background:

R7#show tcp brief
TCB       Local Address               Foreign Address             (state)
6603F1F0  7.7.7.7.12570               8.8.8.8.19                  ESTAB
6603E708  7.7.7.7.19                  8.8.8.8.51405               ESTAB

Now let's setup Netflow on R1:
R1(config)#int fa0/1
R1(config-if)#ip flow ingress
R1(config-if)#int fa1/0
R1(config-if)#ip flow ingress
R1(config-if)#int fa2/0
R1(config-if)#ip flow ingress
R1(config-if)#exit
R1(config)#ip flow-export version 5
R1(config)#ip flow-export destination 172.16.0.100 2055

It's unlikely you'd be able to start collection in Solarwinds at this point.  The freebie Solarwinds will only let you start collection if it's receiving Netflow packets, and it's unlikely any flows have been exported yet, as the active-flow timeout is still at its 30-minute default.  Let's turn it down:

R1(config)#ip flow-cache timeout active 1

We'll go ahead and turn down the inactive flow timer as well (note the active timeout is configured in minutes, the inactive timeout in seconds):

R1(config)#ip flow-cache timeout inactive 10

I've fired up the collector on the 172.16.0.100:



As you can see, we've started capturing some traffic.

You probably noticed my use of ip flow ingress on all interfaces that are passing traffic.  This is a newer command, introduced alongside Netflow v9.  The old command was ip route-cache flow.  It's still supported, and it's almost functionally identical to ip flow ingress.  The one small difference is with sub-interfaces.  If you apply ip flow ingress to a main interface, you're only going to get the native VLAN traffic reported.  If you want the entire interface, you apply ip route-cache flow to the main interface, and it basically acts as a macro, applying ip flow ingress to every sub-interface (even ones created in the future) for you.

One of the most baffling things for me was the ip flow egress command.  There are some very important things to know about its usage.  First of all, do not use it unless you are using Netflow v9.  Netflow v5 doesn't have a concept of ingress and egress; there's no field in the v5 packet for direction.

So how do you collect egress traffic information on v1 or v5?  Simple: ip flow ingress is applied to every interface, and the collector reverses the information behind the scenes.  Logically, if the collector can see all the ingress flows, it knows about all the egress flows, too (what comes in must go out!). We'll talk more about ip flow egress when we get to Netflow v9.

Random Sampled Netflow
As you might imagine, a busy Netflow exporter could not only create a lot of extra CPU and memory load on the router, but could also generate too much traffic on the wire or even swamp the collector.  Sampled Netflow was created to fix this problem: it examines only 1 out of every X packets.  The problem with this mechanism is that it can systematically miss flows that fall between samples.  Say you're sampling 1 in every 100 packets, and a burst arrives on every 50th packet, out of phase with the sampler; sampled Netflow may never see the burst.  Enter random sampled Netflow, which still averages 1 in X packets but adds a random element so that it isn't precisely every 100th packet.  Sampled Netflow isn't supported on any equipment on the CCIE lab, but random sampled Netflow is.
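The failure mode is easy to simulate (my own illustration; the burst is deliberately out of phase with the deterministic sampler):

```python
import random

N = 100_000
bursts = set(range(50, N, 100))          # burst packets at 50, 150, 250, ...

# Deterministic 1-in-100 sampler: always looks at packets 100, 200, 300, ...
deterministic_hits = sum(1 for p in range(100, N, 100) if p in bursts)

# Random sampler: each packet chosen with probability 1/100, same average rate.
random.seed(1)
random_hits = sum(1 for p in range(N) if p in bursts and random.randrange(100) == 0)

print(deterministic_hits)   # 0 -- the bursts are never sampled
print(random_hits)          # roughly 10 of the 1000 burst packets get caught
```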

Implementation is reasonably simple:

R1(config)#flow-sampler-map NETFLOW-TEST
R1(config-sampler)#mode random one-out-of 10
R1(config-sampler)#exit
R1(config)#int fa0/1
R1(config-if)#no ip flow ingress
R1(config-if)#flow-sampler NETFLOW-TEST
R1(config-if)#int fa1/0
R1(config-if)#no ip flow ingress
R1(config-if)#flow-sampler NETFLOW-TEST
R1(config-if)#int fa2/0
R1(config-if)#no ip flow ingress
R1(config-if)#flow-sampler NETFLOW-TEST

Note I've turned off ip flow ingress on all interfaces first.  ip flow ingress trumps random sampled Netflow. 



We see we're still getting output to the collector.

We can add input filters to random sampled Netflow.  This just tells the router to only sample flows that match an access list.

R1(config)#flow-sampler-map FILTERED_NETFLOW
R1(config-sampler)# mode random one-out-of 1
R1(config-sampler)#
R1(config-sampler)#ip access-list extended traffic_acl
R1(config-ext-nacl)# permit ip host 7.7.7.7 host 8.8.8.8
R1(config-ext-nacl)#
R1(config-ext-nacl)#class-map match-all traffic_cm
R1(config-cmap)# match access-group name traffic_acl
R1(config-cmap)#
R1(config-cmap)#policy-map netflow
R1(config-pmap)# class traffic_cm
R1(config-pmap-c)#   netflow-sampler FILTERED_NETFLOW
R1(config-pmap-c)#
R1(config-pmap-c)#int fa0/1
R1(config-if)# no flow-sampler NETFLOW-TEST
R1(config-if)# service-policy input netflow
R1(config-if)#
R1(config-if)#int fa1/0
R1(config-if)# no flow-sampler NETFLOW-TEST
R1(config-if)# service-policy input netflow
R1(config-if)#
R1(config-if)#int fa2/0
R1(config-if)# no flow-sampler NETFLOW-TEST
R1(config-if)# service-policy input netflow

Wordy configuration, isn't it?  You'll notice I changed the "random sampled" Netflow back to "one out of one" packets.  This isn't necessary, but it does demonstrate how you can have non-sampled Netflow but still have input filters.  The configuration isn't that complex: match the traffic you want to evaluate with an ACL in a class-map, match the class-map in a policy-map, and apply the netflow-sampler in the policy-map.  Then apply to interfaces!



And still collecting!

Netflow v9 is a big topic.  The first thing to understand is that a Netflow v9 packet has no fixed set of fields; the fields can be defined.  This is known as Flexible Netflow (FNF).  Because of this, a template needs to be sent out periodically to define what the flows will contain, in order to instruct the collector what to do with the information.
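As a toy illustration of why the template matters (not a full v9 parser; the field IDs are real v9 field types):

```python
import struct

# A v9 data record is just an opaque byte string until a template
# says which fields appear, in what order, at what length.
# Field IDs from the v9 spec: 8 = IPV4_SRC_ADDR, 12 = IPV4_DST_ADDR,
# 2 = IN_PKTS.
template = [(8, 4), (12, 4), (2, 4)]     # (field_id, byte_length)

record = struct.pack("!4s4sI", bytes([7, 7, 7, 7]), bytes([8, 8, 8, 8]), 42)

def decode(record, template):
    """Without 'template', this slicing would be impossible."""
    fields, offset = {}, 0
    for field_id, length in template:
        fields[field_id] = record[offset:offset + length]
        offset += length
    return fields

decoded = decode(record, template)
print(".".join(str(b) for b in decoded[8]))    # 7.7.7.7 (source address)
```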

The two other notable changes, due to v9's flexible nature, are that IPv6 and direction are now supported.  We'll discuss both of them.

Let's start with IPv4 egress collection.

R1(config-if)#int fa0/1
R1(config-if)#no service-policy input netflow
R1(config-if)#int fa1/0
R1(config-if)#no service-policy input netflow
R1(config-if)#int fa2/0
R1(config-if)#no service-policy input netflow
R1(config-if)#exit
R1(config)#ip flow-export version 9
R1(config)#int fa2/0
R1(config-if)#ip flow egress



There we have it - only egress data, as expected.  So why is this egress data any better than just using the inverse ingress on the collector side?  There are three main reasons:

- If you only want to collect flows on one interface and still want the egress traffic.  Obviously in order for egress to work otherwise, you have to collect ingress from every interface.  With egress, you could put ip flow ingress and ip flow egress on the same interface and get both.
- If you want Netflow to sample multicast traffic.  Multicast traffic can't be effectively matched on ingress, because before the router processes the traffic, it's not known what interface or interfaces it will be exiting.
- If WAN links are using compression.  Using the "all interfaces ingress" method of calculating egress creates a problem with compression.  The "outbound" flow is calculated before the compression is applied with that method, potentially showing the link using more bandwidth than it has available.  Using ip flow egress calculates after compression.

Let's take a look at what the actual packets look like, courtesy of Wireshark.



Sorry for having to click the image, the Wireshark output is just too big to insert natively into the blog.

Note the final line: "no template found"

This is normal for Netflow v9.  Since Netflow exporting is inherently one-way, there's no way for the collector to ask for the template when it fires up.  The template is like a Rosetta stone: without it, the collector doesn't know what to do with the data it's given.



Luckily the templates come pretty regularly.  Wait a minute, templates?  We didn't configure a template.  Technically speaking this isn't FNF.  Netflow v9 has a default template that's used unless you configure FNF, which we'll do further on in the blog.

The next packet contained a template.  Also included in the next packet was another data sample.  Now we can understand what's in the flow data:



And now we can also see the important "Direction" field.

You can also adjust how frequently the template is sent:
ip flow-export template refresh-rate 2

This would send the template with every other export packet.

Netflow Top Talkers is a feature supported on all versions of Netflow (though not for IPv6 in 12.4T) that will let you see the top talkers right on the router for performance-debugging purposes.  It can be useful if you don't have a collector.

R1(config)#ip flow-top-talkers
R1(config-flow-top-talkers)#top 10
R1(config-flow-top-talkers)#sort-by bytes

R1#show ip flow top-talkers
SrcIf         SrcIPaddress    DstIf         DstIPaddress    Pr SrcP DstP Bytes
Fa0/1         7.7.7.7         Fa2/0*        8.8.8.8         06 0013 C8CD    44K
Fa1/0         7.7.7.7         Fa2/0*        8.8.8.8         06 311A 0013  4400
2 of 10 top talkers shown. 2 flows processed.

You'll notice the * next to DstIf.  This indicates an egress flow.

Let's take a look at IPv6 Netflow.

R1(config-if)#int fa2/0
R1(config-if)#no ip flow egress  ! disabling IPv4 Netflow
R1(config-if)#ipv6 flow ingress
R1(config-if)#ipv6 flow egress
R1(config-if)#exit
R1(config)#ipv6 flow-export version 9  !  somewhat redundant
R1(config)#ipv6 flow-export destination 172.16.0.100 2055

R8#telnet CC1E::7 19 /source-int lo0
Trying CC1E::7, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh

R7#telnet CC1E::8 19 /source-int Lo0
Trying CC1E::8, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh

I won't bother showing you any Solarwinds results at this point, because the freebie edition doesn't support IPv6.  We'll have to rely on Wireshark output.



Aside from optionally playing with the fields in FNF, that's about all there is to IPv6 Netflow.  Note there is no IPv6 edition of top-talkers in 12.4(15)T.

Something I found interesting in Netflow v9 is that basic modifications can be made to the data output without actually using FNF.  For example:

Rack1R4(config)#ip flow-export ver 9 ?
  bgp-nexthop  record BGP NextHop
  origin-as    record origin AS
  peer-as      record peer AS

You can optionally include some BGP information right off the "flow-export" command.  Before I'd used FNF I was confused as to how these fields would interact with FNF -- what if you included a parameter with ip flow-export but then didn't include it in FNF?  Then I discovered FNF doesn't even use the ip flow-export command, so it became a non-issue.

Before we move on to FNF, let's look at one last topic.  This is really a v8 topic, but since it also interfaces with v9, I stuck it in here:

Netflow Aggregation
It may be more efficient to group IPs rather than seeing individual flows for every source/destination.  What if we grouped all sources, or all destinations, based on the routing table?  This may be a better real-world fit, as it's a fair bet that the routing table's prefix sizes are a pretty good indicator of how systems would be grouped.  This feature was the reason behind Netflow v8, and unless you enable Netflow v9 manually, an aggregation cache will still export v8 packets.
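Before configuring it, here's a rough sketch of the rollup idea (my own illustration; a fixed /16 minimum mask stands in for the routing-table lookup):

```python
import ipaddress

# Per-host flow byte counters get rolled up by prefix.
flows = {"8.8.8.8": 44000, "8.8.4.4": 4400, "7.7.7.7": 1200}

def aggregate(flows, minimum_mask=16):
    agg = {}
    for ip, nbytes in flows.items():
        prefix = str(ipaddress.ip_network(f"{ip}/{minimum_mask}", strict=False))
        agg[prefix] = agg.get(prefix, 0) + nbytes
    return agg

print(aggregate(flows))   # {'8.8.0.0/16': 48400, '7.7.0.0/16': 1200}
```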

R1(config-if)#int fa2/0
R1(config-if)#no ipv6 flow ingress
R1(config-if)#no ipv6 flow egress
R1(config-if)#ip flow-aggregation cache destination-prefix
R1(config-flow-cache)# cache entries 1024
R1(config-flow-cache)# export destination 172.16.0.100 2055
R1(config-flow-cache)# mask destination minimum 16 ! never aggregate to a prefix shorter than /16
R1(config-flow-cache)# enabled
R1(config-flow-cache)#
R1(config-flow-cache)#int fa1/0
R1(config-if)#ip flow ingress
R1(config-if)#int fa0/1
R1(config-if)#ip flow ingress

This would aggregate based on destination prefix; if you wanted to aggregate based on source prefix, you would substitute:

R1(config-if)#ip flow-aggregation cache source-prefix
R1(config-flow-cache)# mask source minimum 16

I've only got one flow, and it's attached to a /32, so this isn't going to be too impressive for output, but I do want to show the v8 packet:



Now you can say you've seen a Netflow v8 packet!  Not exactly anything to write home about ...

R1(config)#ip flow-aggregation cache destination-prefix
R1(config-flow-cache)#export version 9

Now we're back to v9 packets.

Here's a rather curious command -- ip flow-egress input-interface

You may have wondered why I made such an elaborate lab for Netflow. Demonstrating this command is the reason why.  I've got two equal cost EIGRP routes from R7 to R8, via R2 and R3.  I'm using CEF per-packet load sharing on R7 (ip load-sharing per-packet) to be sure roughly half the packets from the chargen (TCP 19) go down each path. In this fashion, R1 will receive 50% of the packets destined for R8 on Fa0/1 and the other 50% on Fa1/0.

The reason is to demonstrate the ability to swap the egress and ingress fields as key fields. What is a key field?  As you're aware, in v9, the exported fields can be changed. The key fields are a "must match": they need to be present in the flow, or that flow will not be cached and exported. The key fields must also all match across packets for those packets to be considered part of the same flow. The non-key fields don't need to match, and will be exported only if they're present.  Not all fields are interchangeable; many that can be used as key fields cannot be used as non-key fields.

As an example, obvious key fields could be source & destination IP, with a non-key field of destination AS number. 
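A few lines of Python (my own sketch) make the key/non-key distinction concrete:

```python
# Only the key fields define flow identity, so two packets that
# differ only in a non-key field still merge into one cache entry.
packets = [
    {"src": "7.7.7.7", "dst": "8.8.8.8", "in_if": "Fa0/1", "bytes": 100},
    {"src": "7.7.7.7", "dst": "8.8.8.8", "in_if": "Fa1/0", "bytes": 200},
]

def flows(packets, key_fields):
    cache = {}
    for p in packets:
        key = tuple(p[f] for f in key_fields)
        cache[key] = cache.get(key, 0) + p["bytes"]
    return cache

# With src/dst as the only key fields, the two packets are one flow...
print(len(flows(packets, ("src", "dst"))))            # 1
# ...promoting the input interface to a key field splits them in two,
# which is what ip flow-egress input-interface does for egress flows.
print(len(flows(packets, ("src", "dst", "in_if"))))   # 2
```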

ip flow-egress input-interface shifts the default egress key field from the output interface to the input-interface.  What's that mean to us? 

I've stopped all the chargen sessions except one from R8 to R7 (in other words, R7 telnetted to R8's Lo0 on TCP port 19).  I've removed all interface-level Netflow commands and added ip flow egress to Fa2/0.  R1 will see one flow by default, because the egress-interface is the default key field for egress flows.

Let's double-check that theory.



Hard to prove over screenshots, but this is the recurring pattern - one template + one flow.  The template is coming consistently because I configured it to arrive every other packet (better for fast labbing). Then we see the one flow, which is the only one we'll see, because source/dest (and all other key fields) are the same, as well as egress interface.

What if we wanted to see the flows separately?

R1(config)#ip flow-egress input-interface

While both fields are still in the packet, the one that matters for matching the flow is now the input interface instead of egress interface.  Let's look at the change:



Now we see one template plus two flows, one for each ingress interface.

Flexible Netflow (FNF)

Let's build out a sample of FNF.  Solarwinds freebie edition doesn't support FNF, so once again we'll be looking at the outcome from Wireshark.

First, we need to remove all traditional Netflow v9 commands.  The command set we've been using thus far only works with the default v9 template; once we define our own, the traditional commands no longer apply:

R1(config)#no ipv6 flow-export destination 172.16.0.100 2055
R1(config)#no ip flow-top-talkers
R1(config)#no ip flow-aggregation cache destination-prefix
R1(config)#no ip flow-export destination 172.16.0.100 2055
R1(config)#no ip flow-export template refresh-rate 2
R1(config)#no ip flow-export version 9
R1(config)#no ip flow-cache timeout active 1
R1(config)#no ip flow-cache timeout inactive 10
R1(config)#no ip flow-egress input-interface
R1(config)#int fa2/0
R1(config-if)#no ip flow egress

FNF reminds me a bit of building a MQC QoS policy.

You create:
Flow Records, which set your key and non-key fields
Flow Exporter, which details where and how to send the exports
Flow Monitors, which match the flow records and exporters, and are then applied to an interface.

On 12.4(15)T, IPv6 isn't supported on FNF; I had to use the default template to get IPv6 flows exported. 

There are over a hundred fields that can be exported, so I'm just going to show one sample here, as a small book could be written about FNF in and of itself.

R1(config)#flow record FLOW-RECORD-TEST
R1(config-flow-record)# match ipv4 source address
R1(config-flow-record)# match ipv4 destination address
R1(config-flow-record)# collect flow direction   ! IMPORTANT
R1(config-flow-record)# collect interface input
R1(config-flow-record)# collect routing next-hop address ipv4

match denotes a key field, collect denotes a non-key field.

Note I flagged the collect flow direction line.  By default, FNF exports nothing you don't ask for, so as a best practice you should include collect flow direction.  Otherwise, the collector will not know whether the flow is ingress or egress, although I've heard that most collectors assume ingress if this field is absent.

R1(config-flow-record)#flow exporter FLOW-EXPORTER-TEST
R1(config-flow-exporter)# destination 172.16.0.100
R1(config-flow-exporter)# source FastEthernet1/0
R1(config-flow-exporter)# transport udp 2055
R1(config-flow-exporter)# template data timeout 60

This is pretty obvious; setting the destination, port, template timeout, etc.

R1(config-flow-exporter)#flow monitor FLOW-MONITOR-TEST
R1(config-flow-monitor)# record FLOW-RECORD-TEST
R1(config-flow-monitor)# exporter FLOW-EXPORTER-TEST
R1(config-flow-monitor)# cache timeout active 60

R1(config-flow-monitor)#interface fa2/0
R1(config-if)# ip flow monitor FLOW-MONITOR-TEST input
R1(config-if)# ip flow monitor FLOW-MONITOR-TEST output

R1#show flow monitor
Flow Monitor FLOW-MONITOR-TEST:
  Description:       User defined
  Flow Record:       FLOW-RECORD-TEST
  Flow Exporter:     FLOW-EXPORTER-TEST
  Cache:
    Type:              normal
    Status:            allocated
    Size:              4096 entries / 196620 bytes
    Inactive Timeout:  15 secs
    Active Timeout:    60 secs
    Update Timeout:    1800 secs

And the packets?



There it is - just the fields we asked for.

Hope you enjoyed,

Jeff

Tuesday, June 11, 2013

Everything NAT

IOS has a plethora of NAT features that span from simple 1:1 NATs to policy NATs to basic round-robin load balancing. I've done a lot of NAT in my career, but most of it has been on an ASA.  Some of these features are not so obvious on IOS, and I've sometimes had a hard time producing specific functionality when I had to do anything beyond a basic NAT or PAT.  Here, I have taken a deep dive into every NAT feature I can find, including use cases.  We will start by introducing the easy features, cover some directional issues, and then move on to advanced features.

Our lab is as follows:



The subnet between R1, R2, R3 and R4 will be 192.168.0.0/24, with the fourth octet being the router number.  The link between R4 and R5 will be 30.0.0.0/24 with the fourth octet being the router number. Each router will have a loopback of X.X.X.X where X is its router number (i.e. R3 = 3.3.3.3).  R4 will perform all the NATing.

Static 1:1 NAT
R1, R2, and R3 all have a default route pointing towards R4.  These will be our "inside".  R5 doesn't have a route for anything other than its own loopback and the connected 30.0.0.0/24 segment.  This will be our "outside".

Let's get R1 and R5 talking to each other.

R4(config)#int fa0/1
R4(config-if)#ip nat inside
R4(config-if)#int fa0/0
R4(config-if)#ip nat outside
R4(config)#ip nat inside source static 192.168.0.1 30.0.0.20

R1#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 48/52/60 ms

R5#debug ip icmp
ICMP packet debugging is on
R5#
*Mar  1 00:26:35.911: ICMP: echo reply sent, src 30.0.0.5, dst 30.0.0.20
*Mar  1 00:26:37.899: ICMP: echo reply sent, src 30.0.0.5, dst 30.0.0.20
*Mar  1 00:26:37.979: ICMP: echo reply sent, src 30.0.0.5, dst 30.0.0.20
*Mar  1 00:26:38.027: ICMP: echo reply sent, src 30.0.0.5, dst 30.0.0.20

Really straightforward.  This flips the source address from 192.168.0.1 to 30.0.0.20 when moving from inside to outside.  From outside to inside the destination address will be flipped from 30.0.0.20 back to 192.168.0.1.  R4 will ARP for 30.0.0.20 on the outside, which we can see via the alias table:

R4(config)#do show ip alias
Address Type             IP Address      Port
Interface                4.4.4.4
Interface                30.0.0.4
Dynamic                  30.0.0.20
Interface                192.168.0.4

If for some reason we don't want R4 to ARP for 30.0.0.20, we could use no-alias:

R4(config)#ip nat inside source static 192.168.0.1 30.0.0.20 no-alias
R4(config)#do sh ip alias
Address Type             IP Address      Port
Interface                4.4.4.4
Interface                30.0.0.4
Interface                192.168.0.4

R5#clear arp
R5#ping 30.0.0.20
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.20, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

Let's make R4 ARP again.

R4(config)#no ip nat inside source static 192.168.0.1 30.0.0.20 no-alias
R4(config)#ip nat inside source static 192.168.0.1 30.0.0.20

R5#ping 30.0.0.20
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.20, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 28/54/80 ms

R1#debug ip icmp
ICMP packet debugging is on
R1#
*Mar  1 00:38:39.007: ICMP: echo reply sent, src 192.168.0.1, dst 30.0.0.5
*Mar  1 00:38:39.067: ICMP: echo reply sent, src 192.168.0.1, dst 30.0.0.5
*Mar  1 00:38:39.103: ICMP: echo reply sent, src 192.168.0.1, dst 30.0.0.5
*Mar  1 00:38:39.175: ICMP: echo reply sent, src 192.168.0.1, dst 30.0.0.5

You see the mapping is bidirectional, R5 can reach R1.
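What the static entry is doing can be sketched in a few lines (an illustration, not IOS code):

```python
# A static 1:1 NAT entry is one fixed mapping, applied in both directions.
INSIDE_LOCAL, INSIDE_GLOBAL = "192.168.0.1", "30.0.0.20"

def inside_to_outside(src, dst):
    # Leaving the inside: rewrite the source address.
    return (INSIDE_GLOBAL if src == INSIDE_LOCAL else src, dst)

def outside_to_inside(src, dst):
    # Coming back (or initiated from outside): rewrite the destination.
    return (src, INSIDE_LOCAL if dst == INSIDE_GLOBAL else dst)

print(inside_to_outside("192.168.0.1", "30.0.0.5"))   # ('30.0.0.20', '30.0.0.5')
print(outside_to_inside("30.0.0.5", "30.0.0.20"))     # ('30.0.0.5', '192.168.0.1')
```

Because the mapping is fixed rather than built on demand, it works no matter which side initiates.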

Let's create some more traffic and check out the NAT table.

R1#telnet 30.0.0.5
Trying 30.0.0.5 ... Open

Password required, but none set
[Connection to 30.0.0.5 closed by foreign host]

R4#sh ip nat translations
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.20:62371    192.168.0.1:62371  30.0.0.5:23        30.0.0.5:23
--- 30.0.0.20          192.168.0.1        ---                ---
There's some interesting stuff here.  We see the entry created by our nat statement:
--- 30.0.0.20          192.168.0.1        ---                ---

We'll go over this more further down the document.  Let's focus on:
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.20:62371    192.168.0.1:62371  30.0.0.5:23        30.0.0.5:23

Why is this here?  I thought this was NAT, not PAT, so we shouldn't need all these port numbers.  For that matter we don't even care about the outside local/global addresses, really.

This is because of a feature activated by ip nat create flow-entries.  This is a default-on feature to accelerate the NAT process.  If you want to disable it, you'd use:

R4(config)#no ip nat create flow-entries

R1#telnet 30.0.0.5
Trying 30.0.0.5 ... Open

Password required, but none set
[Connection to 30.0.0.5 closed by foreign host]

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
--- 30.0.0.20          192.168.0.1        ---                ---

That's more what you'd expect to see, even if it is slower.  I've now re-enabled ip nat create flow-entries.

Static PAT

First we'll remove our NAT.

R4(config)#no ip nat inside source static 192.168.0.1 30.0.0.20
R4(config)#ip nat inside source static tcp 192.168.0.1 19 30.0.0.20 5000

This should map port 19 (chargen) on the inside to port 5000 on the outside.

R1(config)#service tcp-small-servers   ! enable chargen on R1

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.20:5000     192.168.0.1:19     ---                ---

There's the translation from our static PAT.

We see that we can no longer telnet there:

R5#telnet 30.0.0.20
Trying 30.0.0.20 ...
% Connection refused by remote host

What about telnetting to port 5000?

R5#telnet 30.0.0.20 5000
Trying 30.0.0.20, 5000 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghi

Clearly, we can reach chargen on R1.

We've also added the expected entry in the NAT table:

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.20:5000     192.168.0.1:19     30.0.0.5:13368     30.0.0.5:13368
tcp 30.0.0.20:5000     192.168.0.1:19     ---                ---

What if we were translating some protocol that needed an ALG (Application Layer Gateway)?  It turns out IOS's NAT process has some fixups built in for applications that carry IP and port information inside the packet payload.  This happens by default.  If you want to disable it, you'd use:

ip nat inside source static tcp 192.168.0.1 19 30.0.0.20 5000 no-payload

Dynamic NAT

I've removed the static NAT/PAT config.

R4(config)#access-list 90 permit 192.168.0.0 0.0.0.255
R4(config)#ip nat pool nat-pool 30.0.0.50 30.0.0.70 netmask 255.255.255.0
R4(config)#ip nat inside source list 90 pool nat-pool

This will perform a 1:1 NAT translation, dynamically, for the first 21 hosts on 192.168.0.0/24, onto 30.0.0.50 through 30.0.0.70.
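The allocation behavior can be sketched like so (my own illustration, not IOS code):

```python
import ipaddress

# The first packet from a new inside host claims the next free
# address in the pool, and the mapping sticks for the life of the entry.
POOL = [str(ipaddress.IPv4Address("30.0.0.50") + i) for i in range(21)]  # .50-.70
table = {}

def translate(inside_ip):
    if inside_ip not in table:
        if len(table) >= len(POOL):
            raise RuntimeError("pool exhausted")
        table[inside_ip] = POOL[len(table)]   # next free global address
    return table[inside_ip]

print(translate("192.168.0.1"))   # 30.0.0.50
print(translate("192.168.0.2"))   # 30.0.0.51
print(translate("192.168.0.1"))   # 30.0.0.50 again -- the mapping is sticky
```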

R1#telnet 30.0.0.5
Trying 30.0.0.5 ... Open

Password required, but none set
[Connection to 30.0.0.5 closed by foreign host]

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.50:52584    192.168.0.1:52584  30.0.0.5:23        30.0.0.5:23
--- 30.0.0.50          192.168.0.1        ---                ---

We see that 192.168.0.1 has translated to 30.0.0.50 as expected.  Now that this is set up, we'll see that dynamic NAT is actually reversible:

R5#telnet 30.0.0.50
Trying 30.0.0.50 ... Open

Password required, but none set
[Connection to 30.0.0.50 closed by foreign host]

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.50:23       192.168.0.1:23     30.0.0.5:63636     30.0.0.5:63636
tcp 30.0.0.50:52584    192.168.0.1:52584  30.0.0.5:23        30.0.0.5:23
--- 30.0.0.50          192.168.0.1        ---                ---

What about the other routers?

R2#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/55/96 ms

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
--- 30.0.0.50          192.168.0.1        ---                ---
icmp 30.0.0.51:1       192.168.0.2:1      30.0.0.5:1         30.0.0.5:1
--- 30.0.0.51          192.168.0.2        ---                ---

We now see the new dynamic mapping, 192.168.0.2 = 30.0.0.51. 

What if we wanted to do a bulk 1:1 NAT?

R4(config)#do clear ip nat trans *
R4(config)#no ip nat inside source list 90 pool nat-pool
R4(config)#ip nat inside source static network 192.168.0.0 30.0.0.0 /24

This will do a pretty clever thing, and match the fourth octet on a 1:1 basis when generating traffic from inside -> outside.  Outside -> inside is reversible after the inside->outside translation has taken place and is in the table.

R3#telnet 30.0.0.5
Trying 30.0.0.5 ... Open

Password required, but none set
[Connection to 30.0.0.5 closed by foreign host]

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
--- 30.0.0.3           192.168.0.3        ---                ---
--- 30.0.0.0           192.168.0.0        ---                ---

We see .3 = .3, as anticipated.

Let's convert the earlier example to dynamic PAT instead.

Dynamic PAT

R4(config)#ip nat inside source list 90 pool nat-pool overload

Now our "nat-pool" nat pool still references 20 IPs, which is unnecessary, this would work fine with one IP address.  But let's test anyway:

R1#telnet 30.0.0.5
Trying 30.0.0.5 ... Open

Password required, but none set
[Connection to 30.0.0.5 closed by foreign host]

R2#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 36/45/56 ms

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.52:26904    192.168.0.1:26904  30.0.0.5:23        30.0.0.5:23
icmp 30.0.0.52:2       192.168.0.2:2      30.0.0.5:2         30.0.0.5:2

We see that both our sessions are now sourced dynamically off 30.0.0.52, instead of one IP per device.

You can also PAT to an interface:

R4(config)#no ip nat inside source list 90 pool nat-pool overload
R4(config)#ip nat inside source list 90 interface fa0/0

I ran the same telnet/ping from R1 and R2 here, not shown.

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.4:30046     192.168.0.1:30046  30.0.0.5:23        30.0.0.5:23
icmp 30.0.0.4:3        192.168.0.2:3      30.0.0.5:3         30.0.0.5:3

We see all the sessions coming off the interface IP, 30.0.0.4.  Note I did not use the overload keyword above.  I could've, but it's implied when you PAT off an interface in this fashion.

Let's say you want a catch-all host behind your PAT: one that receives any inbound traffic not matched by another translation.  This is similar to the "DMZ host" feature found on a lot of economy routers.  Let's make R2 our catch-all:

ip nat inside source static 192.168.0.2 interface Fa0/0

Just to reconfirm that R1 can still reach from inside->outside:

R1#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
.!!!!
Success rate is 80 percent (4/5), round-trip min/avg/max = 36/46/68 ms

I've enabled local login on R2 (not shown here).

R5#telnet 30.0.0.4
Trying 30.0.0.4 ... Open

User Access Verification
Password:
R2>

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.4:23        192.168.0.2:23     30.0.0.5:52330     30.0.0.5:52330
--- 30.0.0.4           192.168.0.2        ---                ---

That covers the basics; let's move on...

NAT Table & Order of Operations

So far we've been looking at "domain-based" NAT, "domains" meaning inside and outside.  As we've seen, the NAT table for domain-based NAT is viewed with show ip nat translations, but we haven't examined what these entries mean in much detail.

We're actually going to cover four topics here:

- Examining the NAT table
- Eliminating routing by using NAT
- ip nat outside
- Nat Virtual Interface (NVI)

I've eliminated all the existing NAT configuration and we're starting from scratch.

I've also removed the default route on R1 that's pointing at R4:
R1(config)#no ip route 0.0.0.0 0.0.0.0 192.168.0.4

R4(config)#ip nat inside source static 192.168.0.1 30.0.0.50
R4(config)#ip nat outside source static 30.0.0.5 192.168.0.50

The idea is to have R1 perceive R5 as 192.168.0.50, and R5 to perceive R1 as 30.0.0.50.

Let's have a look at the NAT tables.



I personally don't care for the layout of the domain-based NAT table.  Here's the way I decipher it.

These are both source NATs.  The top one is created by ip nat outside, the bottom one is created by ip nat inside.  I'm going to generate a traffic flow so that we can see the outcome of this better.

(Note I've fixed something behind-the-scenes here so that I can demonstrate this point first.  We'll discuss later.)



I've created a ping on R1:
SOURCE: 192.168.0.1
DESTINATION: 192.168.0.50

R4 performed two NATs:
1) A source NAT from 192.168.0.1 to 30.0.0.50
2) A reverse source NAT of 192.168.0.50 to 30.0.0.5.  The source NAT is for the outside->inside direction (30.0.0.5 -> 192.168.0.50), and this is the "reversible" method we've been discussing.

That's all fine and dandy, but here's my quick method for seeing what this all means:



If inside->outside, our pre-translation packet is the inside pair (inside local, outside local) or "1 -> 2" (192.168.0.1 -> 192.168.0.50) and our post-translation packet is the outside pair (inside global -> outside global) or "3 -> 4" (30.0.0.50 -> 30.0.0.5).

Outside -> Inside is exactly flipped:



Original packet is (Outside Global, Inside Global) or "1 -> 2" (30.0.0.5 -> 30.0.0.50); and our post-translation packet is the inside pair, reversed (Outside Local -> Inside Local) or "3 -> 4" (192.168.0.50 -> 192.168.0.1).

As such, we can now get by on translations and ARP alone; no routing is required... sort of.

I mentioned I'd "fixed"  something undisclosed above, let's look at what would have gone wrong here.  I removed the fix.

Disabling CEF so that we can debug the transit packets...

R4(config)#int fa0/0
R4(config-if)#no ip route-cache
R4(config-if)#int fa0/1
R4(config-if)#no ip route-cache
R4(config-if)#do debug ip packet
IP packet debugging is on
R4(config-if)#do debug ip nat
IP NAT debugging is on

R1#telnet 192.168.0.50
Trying 192.168.0.50 ...
% Connection refused by remote host

Clearly broken.  What did R4's debug have to say?

R4(config-if)#
*Mar  1 07:41:33.458: IP: tableid=0, s=192.168.0.1 (FastEthernet0/1), d=192.168.0.50 (FastEthernet0/1), routed via RIB
*Mar  1 07:41:33.458: IP: s=192.168.0.1 (FastEthernet0/1), d=192.168.0.50 (FastEthernet0/1), len 44, rcvd 3
*Mar  1 07:41:33.466: IP: tableid=0, s=192.168.0.50 (local), d=192.168.0.1 (FastEthernet0/1), routed via FIB
*Mar  1 07:41:33.466: IP: s=192.168.0.50 (local), d=192.168.0.1 (FastEthernet0/1), len 40, sending

The issue is on line 1: we're routing from Fa0/1 back out Fa0/1.  Even though R4 ARPed for 192.168.0.50, it picks the egress interface as the one the packet came in on.

This is where order of operations comes in.  Inside->Outside and Outside->Inside NAT are handled differently.

Inside->Outside "routes first" and NATs second.  I put "routes first" in quotes, because it's more like "picks an interface first" (which I suppose is routing). Outside->Inside nat NATs first and "routes second". 

Problem is, the packet is basically deemed invalid before the NAT even happens.  We need a more specific route to fix this issue.  This is what I "fixed" earlier.

R4(config)#ip route 192.168.0.50 255.255.255.255 fa0/0

This /32 route will push traffic for 192.168.0.50 on to the outside interface.

R1#telnet 192.168.0.50
Trying 192.168.0.50 ... Open

Password required, but none set
[Connection to 192.168.0.50 closed by foreign host]

Now we can connect!

R4(config)#
*Mar  1 07:53:13.510: IP: tableid=0, s=192.168.0.1 (FastEthernet0/1), d=192.168.0.50 (FastEthernet0/0), routed via RIB
*Mar  1 07:53:13.510: NAT: s=192.168.0.1->30.0.0.50, d=192.168.0.50 [49922]
*Mar  1 07:53:13.514: NAT: s=30.0.0.50, d=192.168.0.50->30.0.0.5 [49922]
*Mar  1 07:53:13.514: IP: s=30.0.0.50 (FastEthernet0/1), d=30.0.0.5 (FastEthernet0/0), g=30.0.0.5, len 44, forward

And now we're seeing traffic going from Fa0/1 to Fa0/0, followed by the two NATs discussed earlier.

A slightly cleaner way to make this happen:

R4(config)#no ip route 192.168.0.50 255.255.255.255 fa0/0
R4(config)#ip nat outside source static 30.0.0.5 192.168.0.50 add-route

The "add-route" command creates the static route towards 192.168.0.50 on Fa0/0 automatically:

R4(config)#do sh ip route static
     192.168.0.0/24 is variably subnetted, 2 subnets, 2 masks
S       192.168.0.50/32 [1/0] via 30.0.0.5

There's another method as well: NVI-based NAT.
I've read a lot of blogs saying NVI-based NAT is the "new NAT method".  I don't think that's the case, or at least not yet.  Refer to this article from Cisco:
http://www.cisco.com/en/US/tech/tk648/tk361/technologies_q_and_a_item09186a00800e523b.shtml

"NVI stands for NAT Virtual Interface. It allows NAT to translate between two different VRFs."

There are a lot of features that aren't available with NVI-based NAT yet (such as SNAT, and some route-map configurations), and based on the above statement, I wonder if they're planned for the future.

Anyway, how does this help our NAT/route order-of-operation problem?

NVI-based NAT "double routes".  It picks an egress interface, NATs, and then re-picks an egress interface.  This behavior is symmetric for both "inside" and "outside".  In fact, as we will see, NVI NAT eliminates the concept of inside and outside completely.

I've removed all the existing NAT configuration from R4.

R4(config)#int fa0/0
R4(config-if)#ip nat enable
R4(config)#int fa0/1
R4(config-if)#ip nat enable
R4(config)#ip nat source static 192.168.0.1 30.0.0.50
R4(config)#ip nat source static 30.0.0.5 192.168.0.50

That's it - no inside, no outside, just simple translations.

R1#ping 192.168.0.50
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.0.50, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/56/76 ms

R4#show ip nat nvi translations
Pro Source global      Source local       Destin  local      Destin  global
--- 192.168.0.50       30.0.0.5           ---                ---
icmp 30.0.0.50:16      192.168.0.1:16     192.168.0.50:16    30.0.0.5:16
--- 30.0.0.50          192.168.0.1        ---                ---

Note the new "show" command for NVI.
You'll also note there's no reference to inside or outside in the show output; everything is considered source or destination.  This makes the debugging, NAT statements, etc. much easier to figure out.

A couple of catches with NVI NAT: as I mentioned above, SNAT (discussed later) is unsupported, as are route-maps for 1:1 static NATs (also discussed later).

Before we push on to policy NATs, let's take a quick look at a way to use IOS NAT as a poor man's load balancer.

I've enabled telnet on R1, R2 and R3; let's distribute inbound telnet connections from R5 amongst the three in a round-robin fashion.  I've also given all three routers a default route aimed at R4.  I've removed all the existing NAT config, again.

R4(config)#int fa0/0
R4(config-if)#ip nat outside
R4(config-if)#int fa0/1
R4(config-if)#ip nat inside
R4(config-if)#exit
R4(config)#ip access-list sta vip
R4(config-std-nacl)#permit 30.0.0.25
R4(config-std-nacl)#exit
R4(config)#ip nat pool server-pool 192.168.0.1 192.168.0.3 netmask 255.255.255.0 type rotary
R4(config)#ip nat inside destination list vip pool server-pool

In this case, 30.0.0.25 is our virtual IP (VIP) on the outside. 

and...

R5#telnet 30.0.0.25
Trying 30.0.0.25 ...
% Connection timed out; remote host not responding

That was anticlimactic.

I'm not sure what causes the problem, but sometimes when I set this up, the router refuses to automatically ARP for the VIP:

R4(config)#do sh ip alias
Address Type             IP Address      Port
Interface                4.4.4.4
Interface                30.0.0.4
Interface                192.168.0.4

But we can force the behavior:

R4(config)#ip alias 30.0.0.25 23
R4(config)#do sh ip alias
Address Type             IP Address      Port
Interface                4.4.4.4
Interface                30.0.0.4
Alias                    30.0.0.25      23
Interface                192.168.0.4

and now it should work:

R5#telnet 30.0.0.25
Trying 30.0.0.25 ... Open

User Access Verification
Password:
R1>exit

[Connection to 30.0.0.25 closed by foreign host]
R5#telnet 30.0.0.25
Trying 30.0.0.25 ... Open

User Access Verification
Password:
R2>exit
[Connection to 30.0.0.25 closed by foreign host]
R5#telnet 30.0.0.25
Trying 30.0.0.25 ... Open

User Access Verification
Password:
R3>

Now that we have that working, what if R1-R3 want to access the outside?

R1#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

No luck; we haven't set up an inside->outside translation.
Let's build a dynamic PAT.

R4(config)#ip access-list sta inside
R4(config-std-nacl)#permit 192.168.0.0 0.0.0.255
R4(config)#ip nat inside source list inside interface fa0/0

R1#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/52/112 ms

Now we have outside->inside load balancing, and inside->outside dynamic PAT.

If you've ever used a "lesser" router and tried to forward a range of ports (say TCP 10 through 30) from the outside to an inside address, you probably did it with relative ease.  You may have also struggled trying to get this to work in the Cisco world, which does "port forwards" one at a time via static PAT.  There's a workaround to be had using this same NAT Rotary feature:

First we remove the old config:

R4(config)#no ip nat pool server-pool 192.168.0.1 192.168.0.3 netmask 255.255.255.0 type rotary
R4(config)#no ip nat inside destination list vip pool server-pool

Then we implement the workaround.  We want to forward TCP 10 - 30 to R1.

Create a rotary pool of just R1:

R4(config)#ip nat pool server-pool 192.168.0.1 192.168.0.1 netmask 255.255.255.0 type rotary

Create an access-list specifying the traffic to "rotary load-balance" to our single server:

R4(config)#ip access-list extended port-forwarding
R4(config-ext-nacl)#permit tcp any any range 10 30
R4(config-ext-nacl)#exit

And apply the policy:

R4(config)#ip nat inside destination list port-forwarding pool server-pool

And test:

R5#telnet 30.0.0.4
Trying 30.0.0.4 ... Open

User Access Verification
Password:
R1>exit

[Connection to 30.0.0.4 closed by foreign host]

Telnet (TCP 23) works.

R5#telnet 30.0.0.4 19
Trying 30.0.0.4, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
chargen (TCP 19) works as well.

Let's move on to policy NAT now.

Policy NAT with Extended Access Lists

The simplest way to create a policy NAT is to use an extended access list.  Up until now we've been using standard access lists, which yield a simple logic: if the source is on this list, translate it.  Now we can say things like "if the source is on this list and it's headed towards a specific IP range, then translate it".  In my experience, this is most useful for VPNs, where you want to PAT towards the Internet but dynamic NAT to another range over the VPN tunnel.  I'm not going to build that elaborate a lab, but now you have a reference point for production use.
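To make that VPN scenario concrete, here's a hypothetical sketch (the 10.99.0.0/24 remote range, the 172.16.5.x pool, and the ACL names are all my own inventions, not from this lab).  Tunnel-bound traffic is pulled into a dynamic NAT pool, and everything else is PATed off the outside interface:

ip access-list extended to-vpn
 permit ip 192.168.0.0 0.0.0.255 10.99.0.0 0.0.0.255
ip access-list extended to-internet
 deny ip 192.168.0.0 0.0.0.255 10.99.0.0 0.0.0.255
 permit ip 192.168.0.0 0.0.0.255 any
ip nat pool vpn-pool 172.16.5.1 172.16.5.254 netmask 255.255.255.0
ip nat inside source list to-vpn pool vpn-pool
ip nat inside source list to-internet interface fa0/0 overload

The deny line in the Internet-facing ACL is the key part: it keeps tunnel-bound traffic out of the PAT rule, so only the pool translation applies to it.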

Some additions to our diagram:


I've given R4 two options for routing "outside": R5 and R6.
R5 has the same IPs as before - the interface IPs between R4 and R5 are 30.0.0.0/24
R6 is using 30.1.0.0/24 between R4 and R6. 

Pretend R7 doesn't exist for now; we'll get there.

I've removed all the existing NAT config on R4.

R4(config)#int fa0/1
R4(config-if)#ip nat inside
R4(config-if)#int fa0/0
R4(config-if)#ip nat outside
R4(config-if)#int fa1/0
R4(config-if)#ip nat outside

Now that we can reach either R5 or R6 through R4, we need to NAT differently depending on which way we're going.

One access-list for traffic headed towards R5:
R4(config)#ip access-list ext towards-r5
R4(config-ext-nacl)#permit ip 192.168.0.0 0.0.0.255 30.0.0.0 0.0.0.255

Another access-list for traffic headed towards R6:
R4(config)#ip access-list ext towards-r6
R4(config-ext-nacl)#permit ip 192.168.0.0 0.0.0.255 30.1.0.0 0.0.0.255

Match them in the NAT statements with the appropriate interface:
R4(config)#ip nat inside source list towards-r5 interface fa0/0 overload
R4(config)#ip nat inside source list towards-r6 interface fa1/0 overload

R1#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 16/47/92 ms

R1#ping 30.1.0.6
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.1.0.6, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/52/116 ms

Pings to R5 and R6 succeed -

R4#sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
icmp 30.1.0.4:22       192.168.0.1:22     30.1.0.6:22        30.1.0.6:22
icmp 30.0.0.4:23       192.168.0.1:23     30.0.0.5:23        30.0.0.5:23

and are appropriately NATed.

Now if you're paying attention, you may have already noticed the limitation of this method when used with my diagram.  "Pretend R7 doesn't exist", I said.  What if we're trying to reach 7.7.7.7 using the extended-access list policy NAT method?  Both our access-lists would read the same:

permit ip 192.168.0.0 0.0.0.255 7.7.7.7 0.0.0.0

That's not going to work. In fact, what if the destination was the Internet?  Your access-lists might look like:

permit ip 192.168.0.0 0.0.0.255 any

That's not going to work either. 

Here's where route-maps come into play for policy NAT.

Policy NAT with Route-Maps

Cleanup from earlier...
R4(config)#no ip nat inside source list towards-r5 interface fa0/0 overload
R4(config)#no ip nat inside source list towards-r6 interface fa1/0 overload

Build
R4(config)#ip access-list extended towards-outside
R4(config-ext-nacl)#permit ip 192.168.0.0 0.0.0.255 any

Route-maps can do two functions:
1) match access-lists
2) match egress interfaces or next-hops

Some examples will also show them setting interfaces (or next-hops), but I've not seen a functional difference between using "set interface" and "match interface" with policy NAT.  If anyone knows a difference, please comment!

R4(config)#route-map towards-r6 permit 10
R4(config-route-map)#match ip address towards-outside
R4(config-route-map)#match interface FastEthernet1/0
R4(config)#route-map towards-r5 permit 10
R4(config-route-map)#match ip address towards-outside
R4(config-route-map)#match interface FastEthernet0/0

R4(config)#ip nat inside source route-map towards-r5 interface FastEthernet0/0 overload
R4(config)#ip nat inside source route-map towards-r6 interface FastEthernet1/0 overload
R4(config)#ip route 0.0.0.0 0.0.0.0 30.0.0.5
R4(config)#ip route 0.0.0.0 0.0.0.0 30.1.0.6 10

This would simulate a poor man's redundant Internet solution - different static IPs on different ISPs, routing out one at a time.  If Fa0/0 goes down, Fa1/0 will take over.  Let's give it a try:

R1#ping 7.7.7.7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 7.7.7.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 60/76/112 ms

R4#sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
icmp 30.0.0.4:40       192.168.0.1:40     7.7.7.7:40         7.7.7.7:40

And now for the failover test:

R4#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
R4(config)#int fa0/0
R4(config-if)#shut

R1#ping 7.7.7.7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 7.7.7.7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 48/74/100 ms

R4#sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
icmp 30.1.0.4:41       192.168.0.1:41     7.7.7.7:41         7.7.7.7:41

Note that the "match ip address" clause in the route-map is really not necessary in this case, but I included it to show the functionality.  "match interface" is sufficient to make the NAT decision.

We saw earlier that dynamic NAT is typically reversible.  Not so much with route-maps for dynamic NAT.

Reversible Dynamic NAT with Route-Maps

R4(config)#route-map towards-r5 permit 10
R4(config-route-map)#no match interface FastEthernet0/0  ! interface not supported w/ reversible

R4(config)#no ip nat inside source route-map towards-r5 interface FastEthernet0/0 overload
R4(config)#int fa0/0
R4(config-if)#no shut

R4(config)#ip nat pool dynamic-pool 30.0.0.10 30.0.0.100 netmask 255.255.255.0
R4(config)#ip nat inside source route-map towards-r5 pool dynamic-pool reversible

The reversible keyword is required in order to make this scenario happen with route-maps.

Static NAT with Route-Maps

I've cleared all the NAT off R4 again.

R4(config)#int fa0/1
R4(config-if)#ip nat inside
R4(config-if)#int fa0/0
R4(config-if)#ip nat outside
R4(config-if)#int fa1/0
R4(config-if)#ip nat outside

I showed how to do policy PAT already, but 1:1 is a whole different story.  Let's say this is a server farm: we have two different ISPs, we're not running BGP, and we have separate IP ranges statically assigned from both ISPs.  How do we do hot/cold failover while maintaining the static NAT?

Let's make 192.168.0.1 our "server" and try to forward traffic from two outside IPs towards it.

R4(config)#ip nat inside source static 192.168.0.1 30.0.0.1
R4(config)#ip nat inside source static 192.168.0.1 30.1.0.1
% 192.168.0.1 already mapped (192.168.0.1 -> 30.0.0.1)

You probably already guessed that wasn't going to work.

Here's the route-map method to accomplish this:

R4(config)#no ip nat inside source static 192.168.0.1 30.0.0.1
R4(config)#ip access-list extended towards-outside
R4(config-ext-nacl)#permit ip 192.168.0.0 0.0.0.255 any

R4(config)#ip nat inside source static 192.168.0.1 30.0.0.1 route-map R1TOANY_VIAISP1
R4(config)#ip nat inside source static 192.168.0.1 30.1.0.1 route-map R1TOANY_VIAISP2
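The bodies of the two route-maps aren't shown in my capture above.  Based on the interface-matching approach from the previous section, they'd presumably look something like this (my reconstruction, not from the original output):

route-map R1TOANY_VIAISP1 permit 10
 match ip address towards-outside
 match interface FastEthernet0/0
route-map R1TOANY_VIAISP2 permit 10
 match ip address towards-outside
 match interface FastEthernet1/0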

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
--- 30.0.0.1           192.168.0.1        ---                ---
--- 30.1.0.1           192.168.0.1        ---                ---

We still have these in place:
ip route 0.0.0.0 0.0.0.0 30.0.0.5
ip route 0.0.0.0 0.0.0.0 30.1.0.6 10

Verification -

R5#telnet 30.0.0.1
Trying 30.0.0.1 ... Open

User Access Verification
Password:
R1>

R4(config)#int fa0/0
R4(config-if)#shut

R6#telnet 30.1.0.1
Trying 30.1.0.1 ... Open

User Access Verification
Password:
R1>

R4(config)#int fa0/0
R4(config-if)#no shut

There's another way to accomplish something similar.  The extendable keyword creates sort of a reverse-PAT function for 1:1 NATs.

R4(config)#no ip nat inside source static 192.168.0.1 30.0.0.1 route-map R1TOANY_VIAISP1
R4(config)#no ip nat inside source static 192.168.0.1 30.1.0.1 route-map R1TOANY_VIAISP2

R4(config)#ip nat inside source static 192.168.0.1 30.0.0.1 extendable
R4(config)#ip nat inside source static 192.168.0.1 30.1.0.1 extendable

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
--- 30.0.0.1           192.168.0.1        ---                ---
--- 30.1.0.1           192.168.0.1        ---                ---

Hmm, the NAT table looks about the same.

R5#telnet 30.0.0.1
Trying 30.0.0.1 ... Open

User Access Verification
Password:
R1>

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.1:23        192.168.0.1:23     30.0.0.5:16608     30.0.0.5:16608
--- 30.0.0.1           192.168.0.1        ---                ---
--- 30.1.0.1           192.168.0.1        ---                ---

We'd expect an entry like this based on the default ip nat create flow-entries.  This time, however, it's taken more literally: the router is doing what I can only describe as a bi-directional PAT.

R6#telnet 30.1.0.1
Trying 30.1.0.1 ... Open

User Access Verification
Password:
R1>

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.1:23        192.168.0.1:23     30.0.0.5:16608     30.0.0.5:16608
tcp 30.1.0.1:23        192.168.0.1:23     30.1.0.6:16284     30.1.0.6:16284
--- 30.0.0.1           192.168.0.1        ---                ---
--- 30.1.0.1           192.168.0.1        ---                ---

Those entries are there for more than just acceleration; they're actually required now.  In fact, I got curious and disabled ip nat create flow-entries:

R4(config)#no ip nat create flow-entries
R4(config)#do clear ip nat trans *

<sample telnets were done here, not shown>

R4(config)#do sh ip nat trans
Pro Inside global      Inside local       Outside local      Outside global
tcp 30.0.0.1:23        192.168.0.1:23     30.0.0.5:59295     30.0.0.5:59295
tcp 30.1.0.1:23        192.168.0.1:23     30.1.0.6:16284     30.1.0.6:16284
tcp 30.1.0.1:23        192.168.0.1:23     30.1.0.6:63205     30.1.0.6:63205
--- 30.0.0.1           192.168.0.1        ---                ---
--- 30.1.0.1           192.168.0.1        ---                ---

Tough luck, you're getting the flow entries anyway, because this process doesn't work without them.

Arbitrary IPs; Redistributing NAT

Here's something I was always curious about: NATing to totally arbitrary IPs in IOS.
You've certainly gathered by now that you can NAT to anything, even IPs that aren't on any of your interfaces.  That's generally pretty useless, because other devices don't know how to reach those IPs, whether your router ARPs for them or not.

I've cleared all the prior NAT configuration.  I'd like to NAT 192.168.0.0/24, our inside range, to 207.50.50.0/24.  I don't want to put 207.50.50.0/24 on an interface. I'm going to use NVI NAT for the example, but something similar could be accomplished with domain NAT. 

R4(config)#interface fa0/0
R4(config-if)#ip nat enable
R4(config)#interface fa0/1
R4(config-if)#ip nat enable

R4(config)#ip nat pool arbitrary 207.50.50.1 207.50.50.200 prefix-length 24 add-route

You may remember "add-route" from domain NAT; here it is for NVI NAT.  Note it's applied on the pool with NVI NAT instead of in the NAT statement itself.

R4(config)#ip access-list extended inside-range
R4(config-ext-nacl)#permit ip 192.168.0.0 0.0.0.255 any

R4(config)#ip nat source list inside-range pool arbitrary

NAT is all set up now.

R4(config)#do sh ip route static
S    207.50.50.0/24 [0/0] via 0.0.0.0, NVI0

This static route can now be introduced into our outside routing protocol through redistribution.  Or, you could just use a BGP network statement: network 207.50.50.0 mask 255.255.255.0.  In our case, the outside is presently running OSPF, so:

R4(config)#router ospf 1
R4(config-router)#redistribute static subnets

Verify -

R5#sh ip route ospf
O E2 207.50.50.0/24 [110/20] via 30.0.0.4, 00:19:24, FastEthernet0/0
R5#debug ip icmp

R1#ping 30.0.0.5
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 30.0.0.5, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 36/48/88 ms

R5#
*Mar  1 00:36:11.983: ICMP: echo reply sent, src 30.0.0.5, dst 207.50.50.1
*Mar  1 00:36:12.019: ICMP: echo reply sent, src 30.0.0.5, dst 207.50.50.1
*Mar  1 00:36:12.055: ICMP: echo reply sent, src 30.0.0.5, dst 207.50.50.1
*Mar  1 00:36:12.115: ICMP: echo reply sent, src 30.0.0.5, dst 207.50.50.1
*Mar  1 00:36:12.159: ICMP: echo reply sent, src 30.0.0.5, dst 207.50.50.1

Something similar could also be accomplished by creating a static route to null0 (ip route 207.50.50.0 255.255.255.0 null0) and redistributing it. 
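As a quick sketch of that alternative (reusing the 207.50.50.0/24 range from above):

ip route 207.50.50.0 255.255.255.0 null0
router ospf 1
 redistribute static subnets

The null0 route exists purely to be advertised.  Since the translation is applied before the routing decision on return traffic, packets matching an active translation get forwarded to the inside rather than being discarded by the null route.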

Stateful NAT (SNAT)

We're still missing one big topic in this article: SNAT, or Stateful NAT.  It's a way of sharing NAT tables across multiple routers, typically via HSRP, for the purpose of hot/hot or hot/cold shared NAT.  This method could take a blog post all to itself... in fact, it did!  I had to put it into production a few months ago, so I took the time to blog about it then:

http://brbccie.blogspot.com/2013/03/stateful-nat-with-asymmetric-routing.html

Hope you enjoyed,

Jeff Kronlage