Sunday, June 30, 2013

Netflow

This post will be geared towards CCIE lab topics.  I will use Solarwinds' freebie Netflow analyzer in some examples, but the focus, in general, will be on exporting data, not on collecting it.

Let's kick off with a discussion of versions.  Anyone who's used Netflow before knows version 5 is the one typically used, with some newer implementations using version 9.  So what's the story on all the "lost versions"?

v1 - First implementation, still supported on 12.4T, restricted to IPv4 without subnet masks.
v2-v4 - Internal Cisco versions, never released
v5 - Most commonly used version, IPv4-only, supports subnet masks
v6 - I couldn't find any information at all
v7 - Extension of v5, reportedly used on some Catalyst switches
v8 - First version to support aggregation.  v8's improvements made it into v9
v9 - Latest Cisco standard, supports IPv6, aggregation, and Flexible Netflow (FNF).
"v10" - aka IPFIX, this is the open standard for Netflow and will presumably replace it eventually.  It's called "v10" because the version header in the packet of IPFIX is "10", and is basically an open standard implementation of v9.

We will be focusing primarily on v5 and v9, and touching a little on v8.  There's no good argument for using v1, and IOS 12.4(15)T only supports v1, v5, v8 (limited), and v9; IPFIX/v10 isn't available in 12.4(15)T.  Fortunately - or perhaps unfortunately for those reading this for other than academic reasons - the Catalyst 3560 that is on the lab exam doesn't support Netflow at all, so we're not going to touch on Catalyst Netflow.  Of note, more modern 3560s, such as the 3560-X, do support Netflow.

If you want to know more about the various Netflow versions, here is a fantastic explanation:
http://www.youtube.com/watch?v=rcDQi7M1uo4

At a high-level, here is how Netflow works:
- "Flows" are identified by the collector.  Prior to v9, flows are identified as having the same source IP, source port, destination IP, destination port, protocol (TCP, UDP, ICMP, etc), and input interface.  If they all match, they're considered the same flow.
- Flows are collected to the Netflow cache on the router
- After a timeout, either due to length of the flow exceeding a maximum, the flow explicitly terminating (FIN/RST flag), or no packets being received for the flow for a length of time, the data is collected, along with other appropriate flows, and sent to the Netflow collector. The default timeouts are 30 minutes for active flows, and 15 seconds for inactive flows.
- The Netflow collector collects, and then presents, the data in whatever format you chose.
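
Both timers can be tuned directly on the router, and you can eyeball the cache locally with show ip cache flow. A quick reference sketch (the same knobs get used later in this post; the values shown are, as best I recall, the 12.4T defaults):

R1#show ip cache flow
R1(config)#ip flow-cache timeout active 30     ! minutes
R1(config)#ip flow-cache timeout inactive 15   ! seconds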

On a side note, I mentioned above that the protocol is determined at a high level by protocol number: TCP, UDP, ICMP, etc.  In newer versions of IOS (15.0+), NBAR can be integrated into Netflow for more granular protocol results.  As that is presently outside the scope of the CCIE lab, I will not be discussing it here.

Let's look at some basic Netflow v5 usage.  Here is our lab topology:


R7 (Lo0 7.7.7.7) and R8 (Lo0 8.8.8.8) will be communicating with each other, with R1 running Netflow, and exporting to the Windows XP VM running Solarwinds Free Real-Time Netflow Analyzer:
http://www.solarwinds.com/products/freetools/netflow-analyzer.aspx

We'll enable TCP small servers so that we can utilize chargen to create TCP flows.
R7(config)#service tcp-small-servers
R8(config)#service tcp-small-servers

R7#telnet 8.8.8.8 19 /source-interface lo0
Trying 8.8.8.8, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
<ctrl-c>

R8#telnet 7.7.7.7 19 /source-int lo0
Trying 7.7.7.7, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh
<ctrl-c>

Even after terminating the output with ctrl-c, the session is still running in the background:

R7#show tcp brief
TCB       Local Address               Foreign Address             (state)
6603F1F0  7.7.7.7.12570               8.8.8.8.19                  ESTAB
6603E708  7.7.7.7.19                  8.8.8.8.51405               ESTAB

Now let's setup Netflow on R1:
R1(config)#int fa0/1
R1(config-if)#ip flow ingress
R1(config-if)#int fa1/0
R1(config-if)#ip flow ingress
R1(config-if)#int fa2/0
R1(config-if)#ip flow ingress
R1(config-if)#exit
R1(config)#ip flow-export version 5
R1(config)#ip flow-export destination 172.16.0.100 2055

It's unlikely you'd be able to start collection in Solarwinds at this point.  The freebie Solarwinds will only let you start collection if it's receiving Netflow packets, and it's unlikely any flows have been exported yet, as the active flow timeout is still at its default of 30 minutes.  Let's turn it down:

R1(config)#ip flow-cache timeout active 1

We'll go ahead and turn down the inactive flow timer as well:

R1(config)#ip flow-cache timeout inactive 10

I've fired up the collector on 172.16.0.100:



As you can see, we've started capturing some traffic.

You probably noticed my use of ip flow ingress on all interfaces that are passing traffic.  This is the newer form of the command; the old command was ip route-cache flow.  The old form is still supported, and it's almost functionally identical to ip flow ingress.  The one small difference is with subinterfaces.  If you apply ip flow ingress to a main interface, you only get the main interface's own traffic (e.g., the native VLAN on a dot1Q trunk) reported.  If you want the entire interface, you apply ip route-cache flow to the main interface, and it basically acts as a macro, applying ip flow ingress to every subinterface (even ones created in the future) for you.
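
To make that macro behavior concrete, here's a hypothetical sketch - Fa0/1 doesn't actually carry subinterfaces in this lab, so the subinterface and VLAN are made up:

R1(config)#interface fa0/1
R1(config-if)# ip route-cache flow       ! macro: enables ip flow ingress here and on every Fa0/1.x subinterface
R1(config-if)#interface fa0/1.10
R1(config-subif)# encapsulation dot1Q 10
R1(config-subif)# ip flow ingress        ! the explicit per-subinterface equivalent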

One of the most baffling things for me was the ip flow egress command.  There are some very important things to know about its usage.  First of all, do not use it unless you are exporting with Netflow v9.  Netflow v5 has no concept of ingress and egress; there's no field in the v5 packet for direction.

So how do you collect egress traffic information with v1 or v5?  Simple: ip flow ingress is applied to every interface, and the collector reverses the information behind the scenes.  Logically, if the collector can see all the ingress flows, it knows about all the egress flows, too (what comes in must go out!).  We'll talk more about ip flow egress when we get to Netflow v9.

Random Sampled Netflow
As you might imagine, a busy Netflow exporter can not only burn a lot of extra CPU and memory on the router, it can also create too much traffic on the wire or even swamp the collector.  Sampled Netflow was created to fix this problem: take 1 out of every X packets and sample it.  The trouble with this mechanism is that it can continuously miss traffic that happens between samples.  Say you're sampling 1 in every 100 packets, and there's consistently a burst on every 50th packet - sampled Netflow will never see that burst.  Enter random sampled Netflow, which still averages 1 out of every X packets, but adds a random element so that it's not precisely every 100th packet.  Sampled Netflow isn't supported on any equipment on the CCIE lab, but random sampled Netflow is.

Implementation is reasonably simple:

R1(config)#flow-sampler-map NETFLOW-TEST
R1(config-sampler)#mode random one-out-of 10
R1(config-sampler)#exit
R1(config)#int fa0/1
R1(config-if)#no ip flow ingress
R1(config-if)#flow-sampler NETFLOW-TEST
R1(config-if)#int fa1/0
R1(config-if)#no ip flow ingress
R1(config-if)#flow-sampler NETFLOW-TEST
R1(config-if)#int fa2/0
R1(config-if)#no ip flow ingress
R1(config-if)#flow-sampler NETFLOW-TEST

Note I've turned off ip flow ingress on all interfaces first; ip flow ingress trumps random sampled Netflow.
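
If you want to confirm the sampler is actually grabbing packets, the router keeps per-sampler statistics (mode, sampling interval, and packets matched) - a quick sanity check:

R1#show flow-sampler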



We see we're still getting output to the collector.

We can add input filters to random sampled Netflow.  This just tells the router to only evaluate (and export) flows that match the access list.

R1(config)#flow-sampler-map FILTERED_NETFLOW
R1(config-sampler)# mode random one-out-of 1
R1(config-sampler)#
R1(config-sampler)#ip access-list extended traffic_acl
R1(config-ext-nacl)# permit ip host 7.7.7.7 host 8.8.8.8
R1(config-ext-nacl)#
R1(config-ext-nacl)#class-map match-all traffic_cm
R1(config-cmap)# match access-group name traffic_acl
R1(config-cmap)#
R1(config-cmap)#policy-map netflow
R1(config-pmap)# class traffic_cm
R1(config-pmap-c)#   netflow-sampler FILTERED_NETFLOW
R1(config-pmap-c)#
R1(config-pmap-c)#int fa0/1
R1(config-if)# no flow-sampler NETFLOW-TEST
R1(config-if)# service-policy input netflow
R1(config-if)#
R1(config-if)#int fa1/0
R1(config-if)# no flow-sampler NETFLOW-TEST
R1(config-if)# service-policy input netflow
R1(config-if)#
R1(config-if)#int fa2/0
R1(config-if)# no flow-sampler NETFLOW-TEST
R1(config-if)# service-policy input netflow

Wordy configuration, isn't it?  You'll notice I changed the "random sampled" Netflow back to "one out of one" packets.  This isn't necessary, but it does demonstrate that you can have non-sampled Netflow and still have input filters.  The configuration isn't that complex: match an ACL containing the traffic you want to evaluate in a class-map, match the class-map in a policy-map, and apply the netflow-sampler in the policy-map.  Then apply the policy to the interfaces!



And still collecting!

Netflow v9 is a big topic.  The first thing to understand is that there is no fixed set of fields in a Netflow v9 packet - the fields can be defined.  This is what enables Flexible Netflow (FNF).  Because of this, a template needs to be sent out periodically to define what the flows will contain, so the collector knows what to do with the information.

The two other notable changes, made possible by its flexible nature, are that IPv6 and direction are now supported.  We'll discuss both of them.

Let's start with IPv4 egress collection.

R1(config-if)#int fa0/1
R1(config-if)#no service-policy input netflow
R1(config-if)#int fa1/0
R1(config-if)#no service-policy input netflow
R1(config-if)#int fa2/0
R1(config-if)#no service-policy input netflow
R1(config-if)#exit
R1(config)#ip flow-export version 9
R1(config)#int fa2/0
R1(config-if)#ip flow egress



There we have it - only egress data, as expected.  So why is this egress data any better than just having the collector invert the ingress data?  There are three main reasons:

- If you only want to collect flows on one interface and still want the egress traffic.  For the inverse-ingress method to work, you have to collect ingress from every interface.  With egress support, you can put ip flow ingress and ip flow egress on the same interface and get both (see the sketch after this list).
- If you want Netflow to sample multicast traffic.  Multicast traffic can't be effectively matched on ingress, because before the router processes the traffic, it's not known what interface or interfaces it will be exiting.
- If WAN links are using compression.  Using the "all interfaces ingress" method of calculating egress creates a problem with compression: the "outbound" flow is accounted for before compression is applied, potentially showing the link using more bandwidth than it has available.  ip flow egress accounts for the traffic after compression.
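
Here's a minimal sketch of the single-interface approach from the first bullet - both directions collected on Fa2/0 alone, with no ingress collection needed anywhere else (v9 export assumed, since v5 can't carry the direction field):

R1(config)#ip flow-export version 9
R1(config)#int fa2/0
R1(config-if)#ip flow ingress
R1(config-if)#ip flow egress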

Let's take a look at what the actual packets look like, courtesy of Wireshark.



Sorry for having to click the image; the Wireshark output is just too big to insert natively into the blog.

Note the final line: "no template found"

This is normal for Netflow v9.  Since Netflow exporting is inherently one-way, there's no way for the collector to ask for the template when it fires up.  The template is like a Rosetta stone: without it, the collector doesn't know what to do with the data it's given.



Luckily the templates come pretty regularly.  Wait a minute - templates?  We didn't configure a template.  Technically speaking this isn't FNF; Netflow v9 has a default template that's used unless you configure FNF, which we'll do further on in this post.

The next packet contained a template, along with another data sample.  Now we can understand what's in the flow data:



And now we can also see the important "Direction" field.

You can also adjust how frequently the template is sent:
ip flow-export template refresh-rate 2

This would send the template with every other export packet.
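
If memory serves, there's also a time-based companion to the packet-count knob, which resends the template at least every N minutes regardless of how many export packets have gone out:

R1(config)#ip flow-export template timeout-rate 1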

Netflow Top Talkers is a feature supported with all versions of Netflow (though not for IPv6 in 12.4T) that lets you see the top talkers for performance-debugging purposes.  It can be useful if you don't have a collector.

R1(config)#ip flow-top-talkers
R1(config-flow-top-talkers)#top 10
R1(config-flow-top-talkers)#sort-by bytes

R1#show ip flow top-talkers
SrcIf         SrcIPaddress    DstIf         DstIPaddress    Pr SrcP DstP Bytes
Fa0/1         7.7.7.7         Fa2/0*        8.8.8.8         06 0013 C8CD    44K
Fa1/0         7.7.7.7         Fa2/0*        8.8.8.8         06 311A 0013  4400
2 of 10 top talkers shown. 2 flows processed.

You'll notice the * next to DstIf.  This indicates an egress flow.
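
The top-talkers feature also accepts match criteria and a cache timeout if you want to restrict or stabilize the list - exact syntax may vary by release, so treat this as a sketch:

R1(config)#ip flow-top-talkers
R1(config-flow-top-talkers)#sort-by packets
R1(config-flow-top-talkers)#cache-timeout 5000   ! hold the computed list for 5000 ms
R1(config-flow-top-talkers)#match source address 7.7.7.7 255.255.255.255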

Let's take a look at IPv6 Netflow.

R1(config-if)#int fa2/0
R1(config-if)#no ip flow egress  ! disabling IPv4 Netflow
R1(config-if)#ipv6 flow ingress
R1(config-if)#ipv6 flow egress
R1(config-if)#exit
R1(config)#ipv6 flow-export version 9  !  somewhat redundant
R1(config)#ipv6 flow-export destination 172.16.0.100 2055

R8#telnet CC1E::7 19 /source-int lo0
Trying CC1E::7, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh

R7#telnet CC1E::8 19 /source-int Lo0
Trying CC1E::8, 19 ... Open
 !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefgh

I won't bother showing you any Solarwinds results at this point, because the freebie edition doesn't support IPv6.  We'll have to rely on Wireshark output.



Aside from optionally playing with the fields in FNF, that's about all there is to IPv6 Netflow.  Note there is no IPv6 edition of top-talkers in 12.4(15)T.

Something I found interesting in Netflow v9 is that basic modifications can be made to the data output without actually using FNF.  For example:

Rack1R4(config)#ip flow-export ver 9 ?
  bgp-nexthop  record BGP NextHop
  origin-as    record origin AS
  peer-as      record peer AS

You can optionally include some BGP information right off the flow-export command.  Before I'd used FNF, I was confused about how these fields would interact with FNF - what if you included a parameter with ip flow-export but then didn't include it in FNF?  Then I discovered FNF doesn't even use the ip flow-export command, so it became a non-issue.
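
For example, tacking origin-as onto the export statement adds the origin AS fields to the v9 records (assuming, of course, the router is actually running BGP so there's something to record):

R1(config)#ip flow-export version 9 origin-as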

Before we move on to FNF, let's look at one last topic.  This is really a v8 topic, but since it also interfaces with v9, I stuck it in here:

Netflow Aggregation
It may be more efficient to group prefixes rather than seeing individual flows for every source/destination pair.  What if we grouped all sources, or all destinations, based on the routing table?  This may be a better fit for real-world use, as it's a fair bet that the routing table's prefix sizes are a pretty good indicator of how systems are grouped.  This feature was the reason behind Netflow v8, and unless you manually set the export version to 9 under the aggregation cache, you'll still get v8 packets.

R1(config-if)#int fa2/0
R1(config-if)#no ipv6 flow ingress
R1(config-if)#no ipv6 flow egress
R1(config-if)#ip flow-aggregation cache destination-prefix
R1(config-flow-cache)# cache entries 1024
R1(config-flow-cache)# export destination 172.16.0.100 2055
R1(config-flow-cache)# mask destination minimum 16 ! never aggregate to anything larger (less specific) than a /16
R1(config-flow-cache)# enabled
R1(config-flow-cache)#
R1(config-flow-cache)#int fa1/0
R1(config-if)#ip flow ingress
R1(config-if)#int fa0/1
R1(config-if)#ip flow ingress

This would aggregate based on destination prefix; if you wanted to aggregate based on source prefix, you would substitute:

R1(config-if)#ip flow-aggregation cache source-prefix
R1(config-flow-cache)# mask source minimum 16

I've only got one flow, and it's attached to a /32, so this isn't going to be too impressive for output, but I do want to show the v8 packet:



Now you can say you've seen a Netflow v8 packet!  Not exactly anything to write home about ...

R1(config)#ip flow-aggregation cache destination-prefix
R1(config-flow-cache)#export version 9

Now we're back to v9 packets.

Here's a rather curious command -- ip flow-egress input-interface

You may have wondered why I made such an elaborate lab for Netflow. Demonstrating this command is the reason why.  I've got two equal cost EIGRP routes from R7 to R8, via R2 and R3.  I'm using CEF per-packet load sharing on R7 (ip load-sharing per-packet) to be sure roughly half the packets from the chargen (TCP 19) go down each path. In this fashion, R1 will receive 50% of the packets destined for R8 on Fa0/1 and the other 50% on Fa1/0.

The reason is to demonstrate the ability to swap the egress interface for the ingress interface as a key field. What is a key field?  As you're aware, in v9 the exported fields can be changed. The key fields are a "must match": they need to be present in the flow, or that flow will not be cached and exported, and they must all match across packets for those packets to be considered part of the same flow. The non-key fields don't need to match, and are exported only if they're present.  Not all fields are interchangeable; many that can be used as key fields cannot be used as non-key fields.

As an example, obvious key fields could be source & destination IP, with a non-key field of destination AS number. 
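
Expressed in FNF record syntax (which we'll build out properly below), that example would look something like this - the record name is hypothetical, and I'm assuming collect routing destination as is available on your release:

R1(config)#flow record KEY-FIELD-EXAMPLE
R1(config-flow-record)# match ipv4 source address
R1(config-flow-record)# match ipv4 destination address
R1(config-flow-record)# collect routing destination as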

ip flow-egress input-interface shifts the default egress key field from the output interface to the input interface.  What does that mean for us?

I've stopped all the chargen sessions except one, the chargen stream from R8 to R7 (in other words, R7 telnetted to R8's Lo0 on TCP port 19).  I've removed all the interface-level Netflow commands and added ip flow egress to Fa2/0.  R1 will see one flow by default, because the egress interface is the default key field for egress flows.

Let's double-check that theory.



Hard to prove over screenshots, but this is the recurring pattern - one template + one flow.  The template is coming consistently because I configured it to arrive every other packet (better for fast labbing).  Then we see the one flow, which is the only one we'll see, because source/destination (and all the other key fields) are the same, as is the egress interface.

What if we wanted to see the flows separately?

R1(config)#ip flow-egress input-interface

While both fields are still in the packet, the one that matters for matching the flow is now the input interface instead of the egress interface.  Let's look at the change:



Now we see one template plus two flows, one for each ingress interface.

Flexible Netflow (FNF)

Let's build out a sample of FNF.  Solarwinds freebie edition doesn't support FNF, so once again we'll be looking at the outcome from Wireshark.

First, we need to remove all the traditional Netflow v9 commands.  The command set we've been using thus far only works with the default v9 template; once we define our own, the rest of the traditional commands are unnecessary:

R1(config)#no ipv6 flow-export destination 172.16.0.100 2055
R1(config)#no ip flow-top-talkers
R1(config)#no ip flow-aggregation cache destination-prefix
R1(config)#no ip flow-export destination 172.16.0.100 2055
R1(config)#no ip flow-export template refresh-rate 2
R1(config)#no ip flow-export version 9
R1(config)#no ip flow-cache timeout active 1
R1(config)#no ip flow-cache timeout inactive 10
R1(config)#no ip flow-egress input-interface
R1(config)#int fa2/0
R1(config-if)#no ip flow egress

FNF reminds me a bit of building an MQC QoS policy.

You create:
Flow Records, which set your key and non-key fields
Flow Exporters, which detail where and how to send the exports
Flow Monitors, which tie a flow record to one or more exporters, and are then applied to an interface.

On 12.4(15)T, IPv6 isn't supported on FNF; I had to use the default template to get IPv6 flows exported. 

There are over a hundred fields that can be exported, so I'm just going to show one sample here, as a small book could be written about FNF in and of itself.

R1(config)#flow record FLOW-RECORD-TEST
R1(config-flow-record)# match ipv4 source address
R1(config-flow-record)# match ipv4 destination address
R1(config-flow-record)# collect flow direction   ! IMPORTANT
R1(config-flow-record)# collect interface input
R1(config-flow-record)# collect routing next-hop address ipv4

match denotes a key field, collect denotes a non-key field.

Note I flagged the collect flow direction line.  FNF exports only the fields you ask for, so as a best practice you should include collect flow direction.  Otherwise, the collector will not know whether the flow is ingress or egress, although I've heard that most collectors assume ingress if this field is absent.

R1(config-flow-record)#flow exporter FLOW-EXPORTER-TEST
R1(config-flow-exporter)# destination 172.16.0.100
R1(config-flow-exporter)# source FastEthernet1/0
R1(config-flow-exporter)# transport udp 2055
R1(config-flow-exporter)# template data timeout 60

This is pretty self-explanatory: setting the destination, source interface, port, and template timeout.

R1(config-flow-exporter)#flow monitor FLOW-MONITOR-TEST
R1(config-flow-monitor)# record FLOW-RECORD-TEST
R1(config-flow-monitor)# exporter FLOW-EXPORTER-TEST
R1(config-flow-monitor)# cache timeout active 60

R1(config-flow-monitor)#interface fa2/0
R1(config-if)# ip flow monitor FLOW-MONITOR-TEST input
R1(config-if)# ip flow monitor FLOW-MONITOR-TEST output

R1#show flow monitor
Flow Monitor FLOW-MONITOR-TEST:
  Description:       User defined
  Flow Record:       FLOW-RECORD-TEST
  Flow Exporter:     FLOW-EXPORTER-TEST
  Cache:
    Type:              normal
    Status:            allocated
    Size:              4096 entries / 196620 bytes
    Inactive Timeout:  15 secs
    Active Timeout:    60 secs
    Update Timeout:    1800 secs

And the packets?



There it is - just the fields we asked for.
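
If Wireshark isn't handy, the same data can be read straight off the router's FNF cache; the format record option prints one field per line, mirroring the record definition:

R1#show flow monitor FLOW-MONITOR-TEST cache format record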

Hope you enjoyed,

Jeff

7 comments:

  1. Jeff, you are the best!
    I have a CCIE lab booked on Wednesday and this article is just in time!

    1. Best of luck! My last attempt was in October, working on my 2nd try.

    2. Thanks Jeff. Great blog. Good luck both, working on my second try also :)

      I have been playing with ManageEngine's Netflow collector while I work on Netflow. It's a 30 day trial and supports v9/FNF and IPv6. It also has a neat security events feature that uses flows to detect scans and bad traffic.

  2. Jeff, Your blog is awesome. Thank you.

  3. In the testing I have completed, enabling ip flow egress with v5 works, but it just doesn't show the direction. Is there anything wrong with enabling egress collection on all active interfaces instead?

    Without enabling egress collection, how else can you report on the ToS remarked by the local router as it leaves an interface when using NetFlow v5?

    1. That's an interesting question, but I think the biggest problem would come not from Netflow itself, but from how the collector interpreted it. Wouldn't you need a v5 collector that understood the traffic was going in an inverse direction, but didn't speak v9 (otherwise using v9 makes more sense), in order for that to be relevant?

    2. Hi Jeff, thanks for your reply. I tried configuring ingress flow collection on all interfaces and also ingress / egress on WAN with v5. When I exported both test configs to Manage Engine's flow collector the output for NetFlow v5 looked identical. It just shows the same input / output for both the input and output interfaces as it can't differentiate the flow direction. The only difference was that the ToS marking set by the egress service policy was reported when using the ingress / egress flow collector on the WAN.

      I've since found the below link for CA who advise that it is possible to configure ip flow egress across all interfaces, at least for their flow collector software anyway. I suspect there could be some difference between flow collector vendors though.

      https://www.ca.com/us/services-support/ca-support/ca-support-online/knowledge-base-articles.tec562174.html

      NetFlow v9 is the way to go, but we have some customers that have v5 servers and I am trying to standardise some config templates. A common complaint from customers is that they do not see the marking set by the egress service policy.

      The other problems with not using ingress / egress on WAN is with misreporting in WAN compression and multicast environments.
