Saturday, December 21, 2019

Beards goes API...on LiveNX...!!!

In the last blog we scratched the surface of what can be done through APIs with Cisco's DNA Center...

Now let's turn our attention to LiveAction's LiveNX Netflow collector!

But why do you need API access to LiveNX?

Well, I'd been searching for a good Netflow collector and had found LiveNX and an open source collector that supported NBAR tagged Netflow and collecting data from Cisco's performance monitors...this was several years back and the choices were limited!

But from time to time I wanted to extract various slices of data for analysis, and as the basis for recommendations back to a customer (for example 'I see that your voice and jabber are being marked differently in and out of the WAN! Maybe that's why your call quality is poor...' or 'The loss and jitter for Site X are much higher than the rest of the sites! I think we should investigate further...')

So looking through the GUI in LiveNX is great (again the playback capability is amazing...especially with the semantic filtering that you can do i.e. 'flow.app=netflix' to just show the netflix traffic) but sometimes you just wanna look at the data itself outside of the tool...

...hence APIs!!!!

LiveNX APIs

LiveAction provides a great guide to using the APIs within LiveNX and it's expanded considerably since the 5.2.1 version I originally tested out in the lab.

Some familiarity with the reports that LiveNX can generate is essential to figuring out what you wanna look for within the data...

And here's the benefit of LiveNX...remember it's storing everything! As long as you've got space it's gonna store your Netflow data!

So how do we pull data from the LiveNX server? I'll skip the stuff covered in the LiveNX guide...

But I've found the useful bits are around what data reports are useful for analysis:

Let's start by pulling information about the device/router we wanna focus on (I usually start with a sample of the routers within an environment - good points of congestion and traffic concentration, and they have the best NBARv2/FNF/perf mon capabilities)


curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/devices/"

{
  "meta" : {
    "href" : "https://192.168.2.52:8093/v1/devices/",
    "http" : {
      "method" : "GET",
      "statusCode" : 200,
      "statusReason" : "OK"
    }
  },
  "devices" : [ {
    "href" : "https://192.168.2.52:8093/v1/devices/FTX183881DJ",
    "id" : "FTX183881DJ",
    "serial" : "FTX183881DJ",
    "address" : "192.168.2.1",
    "systemName" : "c881",
    "hostName" : "c881",
    "osVersionString" : "15.6(3)M5",
    "vendorProduct" : {
      "model" : "ciscoC881K9",
      "displayName" : "ciscoC881K9",
      "description" : "ciscoC881K9"
    },
    "siteIp" : [ "10.0.1.0/24", "192.168.2.0/24" ],
    "interfaces" : [ {
      "name" : "FastEthernet0"
    }, {
      "name" : "FastEthernet1"
    }, {
      "name" : "Vlan200"
    }, {
      "name" : "FastEthernet4",
      "inputCapacity" : 150000,
      "outputCapacity" : 25000,
      "wan" : "WAN",
      "serviceProvider" : "Spectrum Business"
    } ]
  } ]

}

So here we see the serial number of the device we wanna query, and we've proved that we can pull data using the API...(watch out for cut and paste between non plain text applications and your terminal screen...countless times I've copied a curl command from a Word doc where it had automagically replaced my plain quotes with curly open and close quotes...)
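If you'd rather pick the serial and WAN interface out of that response with a script than by eye, here's a minimal Python sketch. It assumes the response has the shape shown above; 'wan_interfaces' is just my own helper name, and the sample is a trimmed copy of the output above rather than a live call.

```python
import json

# Trimmed copy of the /v1/devices/ response shown above.
SAMPLE = """
{
  "devices": [ {
    "serial": "FTX183881DJ",
    "hostName": "c881",
    "interfaces": [
      {"name": "FastEthernet0"},
      {"name": "FastEthernet4", "inputCapacity": 150000,
       "outputCapacity": 25000, "wan": "WAN"}
    ]
  } ]
}
"""


def wan_interfaces(devices_json):
    """Return (serial, hostName, interface name) for every interface tagged as WAN."""
    found = []
    for device in json.loads(devices_json).get("devices", []):
        for interface in device.get("interfaces", []):
            if interface.get("wan") == "WAN":
                found.append((device["serial"], device["hostName"], interface["name"]))
    return found


print(wan_interfaces(SAMPLE))
```

The serial and interface name it returns are exactly the two values the report calls further down need.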

Summary flow information

Ok so now to get the total Netflow data outbound for a week from our WAN interface Fa4:


curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/79/runTimeSeries.csv?deviceSerial=FTX183881DJ&binDuration=5min&direction=outbound&startTime=1575849600000&endTime=1576231200000&interface=FastEthernet4"

It comes out in a .csv format which you can massage into a nice chart for the traffic like this...



Here we see my outbound VNC/RDP screen scrape traffic along with my YouTube, Netflix and Amazon Prime Video acknowledgements out of my Fa4 interface. Note: take some of the peaks with a pinch of salt...30Mbps outbound VNC isn't realistic in my mind...so I generally eliminate the absolute peaks from my analysis...

This would be the best starting point for your analysis - it tells you what traffic is seen, where the daily peaks are, what the busy hours for the traffic are, and how close you are coming to max'ing out the link capacity (it also often uncovers a lot of traffic a customer may be totally unaware of). In some cases you will see a tonne of 'unknown' traffic from the NBAR perspective...for me this means we need to create more custom NBAR entries that reclassify this traffic into something the customer can relate to.
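Here's a rough Python sketch of the 'eliminate the absolute peaks' step. Big assumption: I've invented a simple layout (a 'time' column followed by one bit-rate column per application) and made up the sample numbers, so check the header row of your own export and adjust before trusting anything it tells you.

```python
import csv
import io

# Hypothetical export: one timestamp column, then one bit-rate column per application.
# The real LiveNX column names and units may differ - check your own .csv first.
SAMPLE_CSV = """time,vnc,youtube
1575849600000,200000,90000
1575849900000,300000,95000
1575850200000,30000000,80000
1575850500000,250000,85000
"""


def peak_without_outliers(csv_text, drop_top=1):
    """Per application, return the highest rate left after discarding the top `drop_top` samples."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    peaks = {}
    for app in rows[0].keys():
        if app == "time":
            continue
        rates = sorted(float(row[app]) for row in rows)
        kept = rates[:-drop_top] if drop_top else rates
        peaks[app] = kept[-1]
    return peaks


print(peak_without_outliers(SAMPLE_CSV))
```

Dropping a fixed number of top samples is the crudest possible filter; a percentile would be the obvious next step once you have a full week of 5 minute bins.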

Sampling a week is a good start...sometimes to corroborate these numbers it's good to also pull a 'show ip nbar protocol-discovery' on the interface and look at the peaks (max-bit-rates) - they will give you insight into what QoS queue sizes will be suitable for the peak traffic - expect to compromise with the queues, and retune the queues as your customer's traffic grows and changes. QoS is never static in my experience! Check the QoS outbound queue drops with a 'show policy-map interface' command and make some recommendations based on the data you see.


We looked at outbound traffic (cos that's where the congestion will most likely happen - from LAN to WAN) but what about the volume of inbound traffic?

curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/8/runTimeSeries.csv?deviceSerial=FTX183881DJ&binDuration=5min&direction=inbound&startTime=`date --date='08:00 last Monday' +%s000`&endTime=`date --date='18:00 this Friday' +%s000`&interface=FastEthernet4"

Q: Hey Beards, you replaced the startTime/endTime numbers with execution of a command? Does it do the same thing?

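Absolutely! The API needs the long format epoch number (milliseconds), and generating it with the date utility makes life easier: '+%s' prints the epoch time in seconds and the trailing '000' pads it out to milliseconds. Note that '--date' is GNU date syntax, so this works as-is on Linux. Well spotted!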
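If you're not on a box with GNU date, the same number is easy to produce in Python. A minimal sketch ('epoch_millis' is my own helper name):

```python
from datetime import datetime, timezone


def epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


# 2019-12-09 00:00 UTC and 2019-12-13 10:00 UTC correspond to the
# startTime/endTime values used in the first report example.
start = datetime(2019, 12, 9, 0, 0, tzinfo=timezone.utc)
end = datetime(2019, 12, 13, 10, 0, tzinfo=timezone.utc)
print(epoch_millis(start), epoch_millis(end))
```

Being explicit about the timezone matters here: the date command uses the local timezone of whatever machine you run it on, so the same '08:00 last Monday' gives different numbers on different boxes.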

Now from this output we get a perspective on the volume of traffic, typically web traffic, coming into the site. In our case we see YouTube sending between 2 and 8Mbps to my lab PCs, 5-9Mbps for video-over-http, 3-10Mbps for Amazon instant video and about 300Kbps of VNC control traffic (mouse movements, etc.)

This is usually where a customer gets concerned about excessive use of social media consuming their valuable business WAN capacity. If you have alternate WAN circuits, you can steer non business critical traffic onto that alternate path to avoid impacting the business (if your Acceptable Usage Policy with the users allows use of social media!)


Now what else is useful?

Full flow information

Well looking at the marking of traffic for consistency is useful...look at reports of both inbound and outbound full flow information using report 79...some reports can be run as Aggregation reports while others can also be run as Time Series reports. Check what is possible and what parameters can be used by doing something like:


curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/79/"


Within the reports you may just want to narrow in on particular applications - in our case here we can narrow in on our Netflix traffic:


curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/79/runTimeSeries.csv?deviceSerial=FTX183881DJ&binDuration=5min&direction=inbound&flexSearch=flow.app=netflix&startTime=1575849600000&endTime=1576231200000&interface=FastEthernet4"

We're using LiveNX's semantic search feature to select just the Netflix flow information.
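Once you start varying the report number, direction and filter, hand-editing those long URLs gets error prone. Here's a small Python sketch that builds them from named parameters. The path and parameter names are copied from the curl examples above; 'report_url' and the hostname are placeholders of mine. One assumption to flag: urlencode percent-encodes the '=' inside the flexSearch value, which I'd expect the server to decode like any other web server, but test it against your own LiveNX.

```python
from urllib.parse import urlencode


def report_url(server, report_id, **params):
    """Build a runTimeSeries.csv URL for a LiveNX flow report."""
    base = f"https://{server}:8093/v1/reports/flow/{report_id}/runTimeSeries.csv"
    return f"{base}?{urlencode(params)}"


url = report_url(
    "livenx.example.com",  # placeholder hostname
    79,
    deviceSerial="FTX183881DJ",
    binDuration="5min",
    direction="inbound",
    flexSearch="flow.app=netflix",
    startTime=1575849600000,
    endTime=1576231200000,
    interface="FastEthernet4",
)
print(url)
```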


ART and RTP performance monitor information

Q: OK what about the collected performance monitor data? Isn't that available via the APIs too?

Yep, latency measurements from ART/MACE are only available as an Aggregation report (no Time Series equivalent report is possible but you could probably change your start/end times to get more granularity):

curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/27/runAggregation.csv?view=basic&flowType=avc&deviceSerial=FTX183881DJ&startTime=`date --date='08:00 last Monday' +%s000`&endTime=`date --date='18:00 this Friday' +%s000`&interface=FastEthernet4"

Good for highlighting CND client/network delays (taking a slow WAN circuit/path) and SND server side delays (where an application server is performing poorly). Remember this is timing between SYN and SYN/ACK packets in a TCP stream, so if you measure it from the DC end of the connection the numbers will reflect the WAN (client portion) of the delay, whereas if you measure it from the branch end (closest to the clients) the numbers will look unrealistically small for the CND portion but larger for the SND portion (because it's also lumping the WAN delay in with the actual server delay).

And the voice/video RTP traffic performance monitors showing loss and jitter can also be retrieved via API calls:

curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/63/runTimeSeries.csv?direction=inbound&deviceSerial=FTX183881DJ&startTime=`date --date='08:00 last Monday' +%s000`&endTime=`date --date='18:00 this Friday' +%s000`&interface=FastEthernet4"

curl -k -H "Authorization: Bearer <LiveNX-API-Key>" "https://<LiveNX-server>:8093/v1/reports/flow/63/runTimeSeries.csv?direction=outbound&deviceSerial=FTX183881DJ&startTime=`date --date='08:00 last Monday' +%s000`&endTime=`date --date='18:00 this Friday' +%s000`&interface=FastEthernet4"

This shows us measured losses within the RTP stream of voice/video packets, as well as the variance in the delay/arrival time of packets - these are absolutely key to good quality voice and video...although a cautionary note on trusting these figures absolutely...I've seen some wacky looking numbers from time to time!



So we've gone through using APIs to pull data out of a customer's LiveNX system...and passed on some guidance around what to look for in the data!

Hope you gained some insights and a few new tools for your nettie toolbelt!

Beards out...Ho ho ho to all you believers out there!     ? ; {)







Friday, December 20, 2019

Beards goes API...on DNAC...!

The life of a nettie has sure changed in the last 20+ years...

Especially in the last 5 or so years! And especially for me....

And these days you can't escape the 'Software Defined' nature of networking!

For example I spend half of my days digging into Cisco DNA Center and finding out about some of the data you can extract via a few well crafted API calls...and even some sneaky browser scrapes of API calls using our wonderful new tool - the web Developer Console!

Let's whet your appetite with some examples...

Cisco DNA Center

As a controller it's the collection and control point for the campus LAN network but it's been built around a series of data models and interlocking processes and capabilities under the hood.

For example we can extract DNAC's view of the inventory of all network devices it manages...

https://<DNAC-address>/api/v1/network-device

And you get back (either through your browser or via POSTMAN) the inventory as a series of devices (I've trimmed it down to only show you a couple of devices in this output and pulled out some of the other details you get back):


{
   "response":[
      {
         "type":"Cisco Catalyst 9300 Switch",
         "softwareType":"IOS-XE",
         "softwareVersion":"16.6.6",
         "hostname":"Core1",
         "family":"Switches and Hubs",
         "managementIpAddress":"172.31.x.x",
         "platformId":"C9300-24P",
         "series":"Cisco Catalyst 9300 Series Switches",
         "role":"CORE",
         "id":"3ff88560-fda8-4be6-92dc-89d3bb982ece"
      },
      {
         "type":"Cisco Catalyst 9300 Switch",
         "softwareType":"IOS-XE",
         "softwareVersion":"16.6.6",
         "hostname":"Edge1",
         "family":"Switches and Hubs",
         "managementIpAddress":"172.31.x.x",
         "platformId":"C9300-24P",
         "series":"Cisco Catalyst 9300 Series Switches",
         "role":"ACCESS",
         "id":"23f44036-e3e4-4bac-97f6-e0b7e707ca81"
      }
   ],
   "version":"1.0"
}

Q: But Beards, I didn't get anything back?!

Ahh, well here's the 'key' to API calls (groan..)

You don't want just anyone firing off API calls against your servers, so you need one of two things. Either you have a browser session already logged into the DNAC/API server (with an authenticated access 'token/key' stored within the browser), or you get an 'X-Auth-Token' from an 'authtoken' request using an encoded username/pwd, which then allows you to make the subsequent calls. Again POSTMAN is a great way of crafting and testing your API calls, and you'll get familiar with running an API call to get the token before running your queries - not gonna spend too much time discussing this here...
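For the curious, this is roughly what that token request looks like when built in Python. It's a sketch that only prepares the request without sending it. The token path is an assumption on my part (it has moved between DNAC releases, so check the API docs for yours), and the hostname and credentials are placeholders.

```python
import base64
import urllib.request


def build_token_request(dnac_address, username, password):
    """Prepare (but do not send) the POST that asks DNAC for an X-Auth-Token."""
    # 'Encoded username/pwd' here means HTTP Basic auth: base64 of "user:password".
    credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        # Assumed path - confirm it in the API docs for your DNAC release.
        url=f"https://{dnac_address}/api/system/v1/auth/token",
        method="POST",
        headers={"Authorization": f"Basic {credentials}"},
    )


request = build_token_request("dnac.example.com", "admin", "secret")
print(request.get_method(), request.full_url)
```

The reply to that request carries the token, which you then send in an 'X-Auth-Token' header on every later call.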

Q: Hey Beards, now I get some stuff back but why does my output look like a long string of mess rather than your pretty formatted output? 

Well let me introduce you to the world of JSON formatting tools (do a web search and find one you like) - usually they will convert your ugly long string into something a little bit more readable!
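If you'd rather not paste customer data into a random website, Python's standard library does the same job. A minimal sketch using a cut-down version of the inventory output:

```python
import json

# One long line, the way it typically arrives from the API.
raw = '{"response":[{"hostname":"Core1","role":"CORE"}],"version":"1.0"}'

# Parse it and write it back out with indentation.
pretty = json.dumps(json.loads(raw), indent=3)
print(pretty)
```

From the command line, piping the output through 'python -m json.tool' gives a similar result without writing any code.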

Note from the output you get back some easily understood information and other things that make less sense (but will become clearer and more useful as we progress...)

Now let's try another call - this time the physical-topology data from DNAC...

https://192.168.250.191/api/v1/topology/physical-topology

With this output we get back a list of the 'nodes' and the 'links' interconnecting those 'nodes'...

{
   "response":{
      "nodes":[
         {
            "label":"Edge1",
            "id":"23f44036-e3e4-4bac-97f6-e0b7e707ca81"
         },
         {
            "label":"Core1",
            "id":"3ff88560-fda8-4be6-92dc-89d3bb982ece"
         }
      ],
      "links":[
         {
            "source":"3ff88560-fda8-4be6-92dc-89d3bb982ece",
            "startPortName":"GigabitEthernet1/0/2",
            "startPortIpv4Address":"172.31.x.x",
            "startPortSpeed":"1000000",
            "target":"23f44036-e3e4-4bac-97f6-e0b7e707ca81",
            "endPortName":"GigabitEthernet1/0/23",
            "endPortIpv4Address":"172.31.x.x",
            "endPortSpeed":"1000000",
            "linkStatus":"up",
            "id":"318341"
         },
         {
            "source":"3ff88560-fda8-4be6-92dc-89d3bb982ece",
            "startPortName":"GigabitEthernet1/0/3",
            "startPortIpv4Address":"172.31.x.x",
            "startPortSpeed":"1000000",
            "target":"23f44036-e3e4-4bac-97f6-e0b7e707ca81",
            "endPortName":"GigabitEthernet1/0/24",
            "endPortIpv4Address":"172.31.x.x",
            "endPortSpeed":"1000000",
            "linkStatus":"up",
            "id":"318338"
         }
      ]
   },
   "version":"1.0"

}

And we can start to match up the device details from our inventory output using those cryptic 'id's...
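That matching is easy to script. Here's a minimal Python sketch that swaps the ids on each link for hostnames, using trimmed copies of the two outputs above ('describe_links' is my own helper name):

```python
# Trimmed copies of the inventory and physical-topology responses shown above.
inventory = [
    {"hostname": "Core1", "id": "3ff88560-fda8-4be6-92dc-89d3bb982ece"},
    {"hostname": "Edge1", "id": "23f44036-e3e4-4bac-97f6-e0b7e707ca81"},
]
links = [
    {"source": "3ff88560-fda8-4be6-92dc-89d3bb982ece",
     "startPortName": "GigabitEthernet1/0/2",
     "target": "23f44036-e3e4-4bac-97f6-e0b7e707ca81",
     "endPortName": "GigabitEthernet1/0/23"},
    {"source": "3ff88560-fda8-4be6-92dc-89d3bb982ece",
     "startPortName": "GigabitEthernet1/0/3",
     "target": "23f44036-e3e4-4bac-97f6-e0b7e707ca81",
     "endPortName": "GigabitEthernet1/0/24"},
]


def describe_links(inventory, links):
    """Replace the source/target ids on each link with hostnames from the inventory."""
    names = {device["id"]: device["hostname"] for device in inventory}
    described = []
    for link in links:
        # Fall back to the raw id if a node is missing from the inventory.
        source = names.get(link["source"], link["source"])
        target = names.get(link["target"], link["target"])
        described.append(
            f'{source} {link["startPortName"]} -> {target} {link["endPortName"]}'
        )
    return described


for line in describe_links(inventory, links):
    print(line)
```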

What else is possible...?! 

Take a look at the published APIs within Cisco DNA Center by going to the Developer Toolkit under the Platform tab...enjoy the ride!

Tuesday, December 10, 2019

Cisco DNA Center's Application Health: You can't do that?! Well yes you can!

So remember the Peachy one's look of dismay at the lab's shiny DNAC? Well things paid off...

I've been involved with Cisco DNA Center and things Cisco DNA/SDA related for several years...and watching from the wings I'd gotten all hot and bothered about the potential of one aspect of Cisco DNA Center....Assurance! And you don't need to be running SDA to leverage this capability!

Assurance is a set of capabilities within DNAC to monitor various aspects of the network to make sure the most important problems show up at the NOC, with as much information as possible (in some cases with a series of remediation recommendations to boot!)

The Assurance screens within DNAC show us the following:

Overall Health - a high level snapshot of the health of the entire managed network
Network Health - focused on the devices that make up the network
Client Health - the end user view of the network!
Application Health - here's where Beards gets excited....

I'll focus on Application Health and talk about the types of data used to build this view of the business network! No surprises - it's built around my pet topics - NBARv2, Netflow and performance monitors! On the wireless side it stretches out to AVC Netflow, wireless streaming telemetry from the WLCs and APs, and the ability to monitor the new AP1800S sensor's view of the client perspective! Extremely useful!!

Within DNAC we can 'provision' devices with what is called a Telemetry Profile - a canned set of template configurations for sending critical information to DNAC's Assurance capabilities.

Types of Network Telemetry Profiles

Maximal Visibility - only possible on ISR4K/ASR1K routers (turns on syslog and Application Visibility)
Optimal Visibility - for switches (turns on syslog)
Disable Visibility - no syslog or app visibility

It's at this point that Beards gets curious...we can guess what the configs may look like on a router but what's possible on the switches?!

So I've got these lovely NBARv2/FNF capable Cat 9k switches but I can't turn on App Visibility?

Well there's nothing to stop you from configuring something like our template NBARv2 and FNF configurations on the switches - so I did!


!
flow record FLOWREC
 match ipv4 version
 match ipv4 protocol
 match application name
 match connection client ipv4 address
 match connection server ipv4 address
 match connection server transport port
 match flow observation point
 collect flow direction
 collect timestamp absolute first
 collect timestamp absolute last
 collect connection initiator
 collect connection new-connections
 collect connection server counter packets long
 collect connection client counter packets long
 collect connection server counter bytes network long
 collect connection client counter bytes network long
!
!
flow exporter 172.31.51.191
 destination 172.31.51.191
 transport udp 6007
!
!
flow exporter FLOWEXP
 destination 172.31.51.191
 transport udp 6007
 option interface-table
 option application-table
!
!
flow monitor FLOWMON
 exporter FLOWEXP
 record FLOWREC

!
interface GigabitEthernet1/0/1
 description Edge1_to_AP3800-1
 switchport access vlan 58
 switchport mode access
 ip flow monitor FLOWMON input
 ip flow monitor FLOWMON output
 ip nbar protocol-discovery
!
interface GigabitEthernet1/0/2
 description Edge1_to_UCS1_PCI_eth2
 switchport access vlan 54
 switchport mode access
 ip flow monitor FLOWMON input
 ip flow monitor FLOWMON output
 ip nbar protocol-discovery

!

Note the older Netflow v5 configuration that DNAC put onto the switch - I left it there so DNAC wouldn't get upset about that part of the config being missing (not sure it would but no harm!)

Also note the NBAR and 'ip flow monitor' lines are put onto access or routed interfaces on the switch - can't put them on VLANs.


And while it's not officially supported yet and doesn't represent Application Health in the way it would from routers (no performance monitor data), you will see the application volume from your switches...



Here we see YouTube hammering away from our wired client connected to the lab's Cat 9K! CPU impact? 1-2% CPU load....

Experiment over....definitely not at the stage of recommending this but wanted to give some insight into what's possible already...

Not all there yet - but exciting possibilities!

Beards out!    ? : {)

Steering web traffic using Application Visibility & Control - or - Herding Lots of Odd Little Sheep!

Sometimes you get thrown a curveball and just have to see where it takes you...

Being a 'Netflow' guy you get some odd questions from your colleagues and customers but this ended up being a great opportunity to see a practical use for AVC, and explore how deep you have to dig to make things work sometimes...

Grab a shovel and join me on the journey!

As a precursor I'd been exploring custom NBAR entries and what you could do with them (especially recognizing and tagging a customer's application traffic so it appears in their Netflow data, whether it was port based, HTTP header based or even encrypted traffic - using the SSL cert common name!).

Also I'm leaning on years of HTML5 and Javascript experience...4000 lines of Javascript application anyone?!

But this curveball all started as a simple question...with what looked like a simple answer!


'Can you use Application Recognition to direct certain traffic down a specific link?'

Well in theory the answer is yes!


My mental gymnastics go into overdrive....

... PBR to steer marked traffic, NBAR type features to tag traffic with a distinct DSCP value, Bob's your proverbial mother's brother! How difficult can it be...

Of course it's at this point you have to ask some questions...my list is short.

'Why?' - cos sometimes there's a cleaner solution that can be employed
'What traffic?' - this became the technical scary bit...

The 'Why' turned out to be a common requirement in certain places in the world...

And the 'What traffic' makes the 'Why' obvious...

The traffic is all web traffic going to the BBC website...and the reason...certain countries have filtered Internet connections where going to such a public newsfeed is impossible over 'controlled' links - only certain sanctioned Internet connections are allowed to get to sites like the BBC or Facebook... so we have to steer our BBC web traffic to an unfettered Internet link while the rest would follow the filtered Internet connection.

But let's construct this in reverse order...


PBR

This is the easy bit.

On my LAN side interface I want a PBR statement that directs certain traffic to a different next hop than the one in the routing table.

!
interface Vlan200
 description inside VLAN interface
 ip policy route-map PBR_GLOBAL_INET
!
route-map PBR_GLOBAL_INET permit 10
 match ip address 102
 set ip next-hop 172.21.0.2
!
access-list 102 permit ip any any dscp af43
!

So any DSCP AF43 traffic is directed to the 172.21.0.2 next hop


Marking the traffic to be PBR'ed

Again kinda easy...back to our inbound LAN side interface and have a QoS marking policy.

!
interface Vlan200
 service-policy input GLOBAL_INET_MARK
!
policy-map GLOBAL_INET_MARK
 class GLOBAL_INET_DEST
  set dscp af43
! 

⚙Note: always check your logic against the order of operations for your router...a quick 'show cef interface xxx' will uncover the order in which things are done, to ensure that marking happens on your inbound interface before traffic is PBRed...

But what does our class-map need to look like?


Application Recognition

This turned out to be the tricky piece and for the purposes of explaining I'll shoot off at a tangent for a while...

Network guys are simple dudes in general...an 'application' is defined by the ports and protocols it uses (1990s thinking)...'web applications' are defined by the HTTP headers we can recognize (2010s thinking)...

But the reality is more complex and needs the nettie to think and explore like a programmer...so pick up some new tools here....

Dig out your favourite browser and get ready to point it to www.bbc.co.uk (good place to start!)

Well is that actually where your traffic is going? Pull up your command prompt/PowerShell or whatever and do 'nslookup www.bbc.co.uk'. From where I am that resolves out to two addresses for 'www.bbc.net.uk'. So putting just www.bbc.co.uk into my AVC part might not necessarily have caught all the traffic! What if a user typed 'www.bbc.com' instead? That's different again!

Quiz time...if my browser pulls up the 'www.bbc.co.uk' page, how many separate connections are used for that one page?

  • One? It's just a single page...
  • Maybe 4-5 but they are all within *.bbc.*? 
  • The most common answer may be 'I have no idea!'

⚙So let's use a nifty little tool that comes with all the popular browsers - the Web Developer Console! 

Under Chrome it's found under View -> Developer -> Developer Tools.  Click on the Network tab (it's the most revealing for us!) - of course other browsers have their Developer tools hidden in other menu options but find it for your favourite browser

🚩Great for diagnosing 'network' delay issues with web apps - note the milliseconds taken to download certain parts of the page...saved my bacon a time or two!

Now open that 'www.bbc.co.uk' page....and watch the hundreds of connections within the Web Console - html pages, images, javascript code, CSS for the layout of the pages, adverts from other sites (I got an insurance advert at the top of my BBC webpage?!).

So our class-map needs to use the Application Visibility & Control feature 'match protocol' to match as many of these as possible!

It's like a custom NBAR entry but defined within a QoS class-map - NBARv2 would simply allow the Netflow records to have the right application ID attached, whereas the 'match protocol' allows us to use the NBARv2 application recognition directly within QoS!

Pretty neat!

So let's see my rudimentary answer within the class-map and look at the positives and potential pitfalls with it (for my lab it demonstrated the principles of what was needed even if it didn't catch all possible traffic!)

!
class-map match-any GLOBAL_INET_DEST
 match protocol http referer "*.bbc.co*"
 match protocol http host "*www.bbc.co.uk*"
 match protocol http host "*static.bbci.co.uk*"
 match protocol http host "www.bbc.com*"
 match protocol http host "*.bbc.co*"
!

It's not a perfect answer for such a complex website (and we're only looking at one page!)

But it's a start...

The 'match protocol http host' entries are straightforward...click on any of the Console's Network tab entries and see the HTTP header information that we can pattern match with our protocol statements (at the time I think I hit around 80% of the content from the page with the 'http host' entries!).

But what's the 'http referer' entry? Well say our webpage calls some Google Ads - the HTTP host for these requests may be something like 'https://googleads4.g.doubleclick.net' - we don't wanna include all these odd things in our match protocol but those entries usually have an HTTP Request referer entry that points back to where the originating webpage is (in our case mostly 'www.bbc.com').

That's improved our success rate on directing traffic to about 80-90% with a 5 line match protocol class-map. No guarantees it will work like this for your customer situation but use the new tools you have to do the best job you can... 
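To check the logic of the whole chain (classify, mark, then PBR) before touching a router, here's a small Python model of it. It's only a sketch: the patterns and the 172.21.0.2 next hop are copied from the configs above, but Python's shell-style globbing is a stand-in for NBAR's own pattern matching, which may not behave identically, and the function names are mine.

```python
from fnmatch import fnmatchcase

# Patterns copied from the GLOBAL_INET_DEST class-map above.
HOST_PATTERNS = ["*www.bbc.co.uk*", "*static.bbci.co.uk*", "www.bbc.com*", "*.bbc.co*"]
REFERER_PATTERNS = ["*.bbc.co*"]

# Next hop from the PBR_GLOBAL_INET route-map above.
UNFILTERED_NEXT_HOP = "172.21.0.2"


def matches_class(host, referer=""):
    """match-any: a hit on any host pattern or any referer pattern is enough."""
    host, referer = host.lower(), referer.lower()
    return (any(fnmatchcase(host, pattern) for pattern in HOST_PATTERNS)
            or any(fnmatchcase(referer, pattern) for pattern in REFERER_PATTERNS))


def next_hop(host, referer="", default_hop="routing table"):
    """Mark matching requests AF43 first, then route on the marking."""
    dscp = "af43" if matches_class(host, referer) else "default"
    return UNFILTERED_NEXT_HOP if dscp == "af43" else default_hop


print(next_hop("www.bbc.co.uk"))
print(next_hop("googleads4.g.doubleclick.net", referer="https://www.bbc.com/"))
print(next_hop("example.org"))
```

Feeding it the host and referer values you see in the browser's Network tab is a quick way to spot which requests a new pattern would miss.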

I hope I opened your eyes about web traffic and gave you some tips around how to deal with it via NBARv2's close cousin 'match protocol'...that's the Control part of AVC!


Beards signing off...I'll come back to how to make custom NBAR entries in a future blog!

? : {)







