Test-Driven Networks with pyATS

Look, NetDevOps is one of those terms that gets thrown around way too much at conferences. Half the time it’s just vendors trying to sell you something by sticking “DevOps” on the end of network automation. But there’s actually something useful buried in there if you can get past all the marketing rubbish.


I remember when I first heard about test-driven networks. Thought it was just another buzzword. Then I saw what happened when a customer deployed changes without proper testing – three hours of downtime because nobody checked if the new OSPF configuration actually worked. That’s when it clicked for me.

Test-driven networks aren’t about fancy tools or complicated processes. They’re about not looking like an idiot when your changes break everything.

What NetDevOps Actually Means

Forget what the vendors tell you. NetDevOps is just taking the bits that work in software development and applying them to networks. Most engineers mess this up because they focus on buying tools instead of changing how they work.

[Image: workflow for a test-driven network]

Here’s what actually matters:

Everything Goes in Git: Your configs, scripts, documentation – everything. Not just backups, but proper version control. I’ve walked into networks where they tracked changes in Excel spreadsheets. Don’t be that person.

Test Before You Break Things: Before you touch production, test it. Not just “ping and pray” but actual systematic testing.

# This is what a basic test looks like - don't worry about the details yet
from pyats import aetest

class TestConnectivity(aetest.Testcase):

    @aetest.test
    def test_interface_status(self, testbed):
        # Look up the device by its testbed name and open a session
        device = testbed.devices['router1']
        device.connect()

        # Parse the CLI output into a dictionary keyed by interface name
        interfaces = device.parse('show interfaces')

        critical_interfaces = ['GigabitEthernet1', 'GigabitEthernet2']

        for interface in critical_interfaces:
            # The Genie IOS-XE parser reports this state under 'line_protocol'
            status = interfaces[interface].get('line_protocol')
            assert status == 'up', f"Interface {interface} is {status}"

That’s pyATS checking your interfaces automatically. Instead of logging into each device and running show commands yourself, the test does it and tells you if something’s wrong.

No More Cowboy Changes: Every change goes through a proper process. Review, test, deploy. No more “quick fixes” at 3 AM that break everything.

[Image: network diagram of a test-driven network using GitHub or Jenkins]

How Test-Driven Networks Work in Real Life

I worked with a retail company that had about 800 stores. Every time they made routing changes at head office, they’d spend the next week taking angry calls from store managers whose tills couldn’t process payments.

They had no way to check if changes worked across all stores. Just deploy and hope.

We built them a set of tests that checked every store every 15 minutes:

Can the store reach the internet?
Can it talk to head office?
Are the routing protocols working?

Now when they make changes, they know within minutes if any stores have problems. Fix them before anyone notices.

The tests were dead simple:

Ping 8.8.8.8 from each store
Ping the head office gateway
Check OSPF neighbours are up

Nothing fancy. Just automated checking of the basics that matter.
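Those three checks boil down to a small pass/fail function. This is a minimal sketch, not the code we deployed: `store_failures` and `ospf_neighbor_up` are hypothetical names, and it assumes you have already collected the ping results and OSPF neighbour states for a store (with pyATS or anything else).

```python
def ospf_neighbor_up(state):
    """Treat FULL (and 2WAY between two DROTHERs) as a healthy adjacency."""
    state = state.upper()
    return state.startswith("FULL") or state == "2WAY/DROTHER"


def store_failures(internet_ok, head_office_ok, ospf_states):
    """Return a list of human-readable failures for one store.

    ospf_states maps neighbour ID -> state string such as 'FULL/DR'.
    All three inputs are assumed to be collected elsewhere.
    """
    failures = []
    if not internet_ok:
        failures.append("cannot reach 8.8.8.8")
    if not head_office_ok:
        failures.append("cannot reach head office gateway")
    if not ospf_states:
        failures.append("no OSPF neighbours found")
    for neighbor, state in sorted(ospf_states.items()):
        if not ospf_neighbor_up(state):
            failures.append(f"OSPF neighbour {neighbor} is {state}")
    return failures
```

An empty list means the store is healthy. Anything else is the text you would put straight into an alert.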

What You Actually Test

Don’t just ping things and call it a test-driven network. You need to check multiple layers:

Basic Connectivity: Can devices reach what they need to reach? But not just ping – check the applications actually work.

Protocol Status: OSPF neighbours, BGP sessions, spanning tree – all the protocols that keep your network running.

Performance: Latency, packet loss, throughput. Catch performance problems before users complain.

Configuration Compliance: Every device configured according to your standards. No surprises six months later when you discover someone turned off security features.
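Configuration compliance is the easiest layer to sketch. The example below assumes you have the running config as text (for instance from `device.execute('show running-config')` in pyATS). The `REQUIRED_LINES` standard and the function name are made up for illustration, so substitute your own baseline.

```python
# Hypothetical baseline: replace with your own standard
REQUIRED_LINES = [
    "service password-encryption",
    "no ip http server",
]


def missing_config_lines(running_config, required_lines=REQUIRED_LINES):
    """Return the required lines that do not appear in the running config."""
    # Compare whole lines (ignoring indentation) rather than substrings,
    # so 'ip http server' does not satisfy 'no ip http server'
    present = {line.strip() for line in running_config.splitlines()}
    return [line for line in required_lines if line.strip() not in present]
```

Matching whole lines is deliberately strict. It will not understand equivalent commands written differently, so treat it as a starting point.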

[Image: network monitoring dashboard showing tests for OSPF, BGP and EIGRP]

Real Examples That Actually Worked

Bank Network Compliance

Had a bank that needed to prove their network met compliance requirements. Hundreds of branches, all with specific security rules.

We built tests that checked everything automatically:

Branch connectivity to head office
MPLS VPN working properly
Security policies applied correctly
QoS configured right

Tests ran every hour. When auditors showed up, we had months of reports proving everything was compliant.

Here’s a simple QoS check we did:

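from genie.metaparser.util.exceptions import SchemaEmptyParserError

def check_qos_policy(device, interface):
    """Return (ok, message) for the QoS policy attached to an interface."""
    try:
        qos_info = device.parse(f'show policy-map interface {interface}')
    except SchemaEmptyParserError:
        # Genie raises this when the command returns nothing to parse
        return False, "No QoS policy found"

    if not qos_info:
        return False, "No QoS policy found"

    # Coarse check: flatten the parsed output and look for the VOICE class
    # and a priority setting anywhere in it
    flattened = str(qos_info)
    if 'VOICE' in flattened and 'priority' in flattened:
        return True, "QoS looks good"
    return False, "Voice priority missing"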

Instead of manually checking QoS on hundreds of devices, the tests did it automatically and flagged anything wrong.

Each branch needed consistent QoS config:

! IOS-XE QoS for branches
policy-map BRANCH_QOS
 class VOICE
  priority percent 30
 class VIDEO  
  bandwidth percent 20
 class DATA
  bandwidth percent 40
 class class-default
  bandwidth percent 10

interface GigabitEthernet1
 service-policy output BRANCH_QOS

! IOS-XR version
policy-map BRANCH_QOS
 class VOICE
  priority level 1
  police rate percent 30
 class VIDEO
  bandwidth percent 20
 class DATA  
  bandwidth percent 40
 class class-default
  bandwidth percent 10

interface GigabitEthernet0/0/0/1
 service-policy output BRANCH_QOS
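You can also check the config text itself before it ever reaches a device. This is a rough sketch that only understands the IOS-XE syntax above (`priority percent` and `bandwidth percent`) and assumes a single policy-map in the text. It does not parse the IOS-XR `priority level` plus `police rate` form. The function names are mine.

```python
import re

PERCENT_LINE = re.compile(r"(priority|bandwidth) percent (\d+)$")


def parse_policy_map(config_text):
    """Map class name -> (keyword, percent) for one IOS-XE policy-map."""
    classes = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("class "):
            current = line.split()[1]
        elif line.startswith(("policy-map ", "interface ")):
            current = None  # left the class context
        elif current:
            match = PERCENT_LINE.match(line)
            if match:
                classes[current] = (match.group(1), int(match.group(2)))
    return classes


def qos_problems(config_text):
    """Return a list of problems found in the policy-map text."""
    classes = parse_policy_map(config_text)
    problems = []
    if classes.get("VOICE", ("", 0))[0] != "priority":
        problems.append("VOICE class has no priority percent")
    total = sum(percent for _, percent in classes.values())
    if total > 100:
        problems.append(f"classes allocate {total}% of bandwidth")
    return problems
```

Run against the IOS-XE block above, this finds four classes adding up to 100 percent with VOICE as the priority class, so the list of problems comes back empty.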

Manufacturing Network That Couldn’t Go Down

Manufacturing company with production systems that couldn’t tolerate downtime. They needed to know failover would work without triggering a real failure and breaking production.

The tests checked:

Backup routes exist in routing table
Backup interfaces are up and ready
Spanning tree has proper backup ports
Failover times meet requirements

When a link actually failed, everything failed over in under 200ms. Without the tests, they’d never have known if failover worked until they needed it.
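The backup route check is the simplest of those to sketch. The dictionary shape here is a simplification I have made up for illustration (prefix mapped to a list of next hops), not the raw pyATS parser output. It also assumes the backup shows up as a second next hop in the table. A floating static route only appears once the primary has gone, so it needs a different check.

```python
def prefixes_without_backup(routes, critical_prefixes, min_paths=2):
    """Return critical prefixes that have fewer than min_paths distinct next hops.

    routes maps prefix -> list of next-hop addresses (hypothetical shape).
    A prefix missing from routes counts as having no paths at all.
    """
    at_risk = []
    for prefix in critical_prefixes:
        next_hops = routes.get(prefix, [])
        if len(set(next_hops)) < min_paths:
            at_risk.append(prefix)
    return at_risk
```

Anything this returns is a prefix that would go dark on a single link failure, which is what you want to know before the link fails.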

[Image: network failover that can be monitored by test-driven networks]

ISP Customer Validation

ISP with thousands of customers. Can’t manually check every connection.

Tests validated each customer automatically:

BGP sessions established
QoS policies applied
Bandwidth limits enforced

Tests ran every 30 minutes for critical customers. When problems were found, tickets got created automatically with all the diagnostic info needed.

Found problems before customers rang to complain.
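The ticket creation step is mostly about bundling failed checks into something a human can act on. Here is a minimal sketch of that idea. The `CheckResult` fields and the ticket dictionary layout are assumptions, since the real payload depends on whatever ticketing system you use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CheckResult:
    customer: str
    check: str
    passed: bool
    detail: str = ""


def build_ticket(customer, results, now=None):
    """Return a ticket dict for a customer's failed checks, or None if all passed."""
    failed = [r for r in results if r.customer == customer and not r.passed]
    if not failed:
        return None
    now = now or datetime.now(timezone.utc)
    return {
        "title": f"{customer}: {len(failed)} validation check(s) failed",
        "opened": now.isoformat(),
        # Diagnostic detail travels with the ticket so nobody has to re-run checks
        "diagnostics": [f"{r.check}: {r.detail}" for r in failed],
    }
```

Returning `None` when everything passes keeps the calling code simple: only create a ticket if you get one back.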

[Image: test-driven networks in use for customer service status]

Getting Started Without Going Mad

Don’t try to test everything on day one. Start small.

Pick your most critical paths:

Internet connectivity from data centres
Connectivity between main sites
Access to important applications

Build simple tests:

Can data centre reach internet?
Can branches reach data centre?
Can users reach application servers?

Basic ping tests catch most problems and give you confidence.
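A ping test needs a pass/fail decision, not a screen of exclamation marks. This sketch assumes you ran the ping on a Cisco IOS-style device (for example with `device.execute('ping 8.8.8.8')` in pyATS) and have the raw text output. The function names and the threshold are my own.

```python
import re

SUCCESS_RATE = re.compile(r"Success rate is (\d+) percent \((\d+)/(\d+)\)")


def ping_success_rate(output):
    """Return the success percentage from IOS-style ping output, or None."""
    match = SUCCESS_RATE.search(output)
    return int(match.group(1)) if match else None


def ping_passed(output, minimum_percent=100):
    """True if the output reports at least minimum_percent success."""
    rate = ping_success_rate(output)
    return rate is not None and rate >= minimum_percent
```

The first packet is often lost while ARP resolves, so you may want `minimum_percent=80` rather than insisting on a perfect score.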

Mistakes Everyone Makes

Testing Everything Every Minute: Don’t do this. You’ll overwhelm your monitoring and create more noise than signal.

Think about what needs frequent checking:

Interface status: every 5 minutes
Routing protocols: every 15 minutes
Performance tests: every few hours

Testing everything constantly just creates problems.
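Those intervals are easy to encode so the scheduler only runs what is due. A minimal sketch, assuming 180 minutes as a stand-in for “every few hours” and hypothetical check names:

```python
# Intervals in minutes; 180 is an assumed value for "every few hours"
INTERVALS_MINUTES = {
    "interface_status": 5,
    "routing_protocols": 15,
    "performance": 180,
}


def due_checks(minutes_since_last_run):
    """Return the names of checks whose interval has elapsed, sorted by name.

    minutes_since_last_run maps check name -> minutes since it last ran.
    A check that has never run (missing from the mapping) is treated as due.
    """
    return sorted(
        name
        for name, interval in INTERVALS_MINUTES.items()
        if minutes_since_last_run.get(name, interval) >= interval
    )
```

Call it from whatever triggers your runs (cron, Jenkins, a loop) and only execute the names it returns.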

Tests That Don’t Actually Test Anything: I’ve seen companies that pinged devices every few minutes and thought they had test-driven networks. When DNS failed, tests kept passing because devices were reachable, but users couldn’t access anything.

Make sure your tests actually validate what matters.
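For the DNS case, the fix is to test what the user actually depends on as well as reachability. This sketch runs on the machine executing the tests, not on a network device, and the function names are hypothetical. The resolver is passed in as a parameter so you can swap it out.

```python
import socket


def dns_resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        return False


def user_path_ok(device_reachable, hostname, resolver=socket.getaddrinfo):
    """A path only counts as healthy if the device answers AND names resolve."""
    return device_reachable and dns_resolves(hostname, resolver)
```

With a check like this, a reachable device with broken DNS fails the test, which is exactly what the ping-only setup missed.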

Ignoring the Results: Tests are useless if you don’t act on them. Use the data to fix problems before they affect users.

[Image: failed test on a customer network]

Tools You Actually Need

You don’t need expensive commercial tools. Open source works fine for most things.

pyATS for Network Testing: Built specifically for network testing. It can connect to devices from multiple vendors, parse output into structured data, compare network state over time, and generate proper reports.

Integration with Monitoring: Tests should integrate with your existing monitoring. When tests fail, alerts get generated with enough context to actually fix the problem.

Version Control: All test code goes in Git. Changes to tests get reviewed like changes to network configs.

How to Know It’s Working

Track these numbers:

Mean Time to Detection: How fast do you spot problems? Should be minutes, not hours.

Mean Time to Resolution: How fast can you fix things? Good tests provide enough info to speed up troubleshooting.

Change Success Rate: What percentage of changes work first time? Should improve dramatically.

Unplanned Downtime: How often do users get surprised by network problems? Should be rare.
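These numbers are simple arithmetic once you record timestamps. A minimal sketch, assuming you log when each incident started, when it was detected and when it was resolved. Here resolution time is measured from detection, and some teams measure it from the start of the incident instead.

```python
from datetime import datetime


def mean_minutes(pairs):
    """Average gap in minutes between (earlier, later) datetime pairs."""
    gaps = [(later - earlier).total_seconds() / 60 for earlier, later in pairs]
    return sum(gaps) / len(gaps) if gaps else 0.0


def change_success_rate(changes):
    """Percentage of changes that worked first time.

    changes is a list of booleans, True meaning the change succeeded.
    """
    return 100.0 * sum(changes) / len(changes) if changes else 0.0
```

Feed `mean_minutes` (started, detected) pairs for time to detection and (detected, resolved) pairs for time to resolution.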

Working with Automation

Test-driven networks and automation work together. Automation does the repetitive stuff; tests validate that it worked.

When automation deploys a new branch office config, tests verify:

Devices are reachable
Routing protocols work
Security policies applied
Applications connect properly

Without testing, automation can break things faster than humans ever could.
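The safety net is a deploy step that refuses to leave a broken change in place. This is a sketch of the pattern only: `deploy`, `rollback` and the individual checks are callables you would supply, and nothing here talks to a real device.

```python
def deploy_with_verification(deploy, checks, rollback):
    """Run deploy(), then every check; roll back if any check fails.

    deploy and rollback are callables taking no arguments.
    checks maps a check name -> callable returning True on success.
    Returns (success, failed_check_names).
    """
    deploy()
    failed = [name for name, check in checks.items() if not check()]
    if failed:
        rollback()
    return (not failed, failed)
```

Running every check before rolling back, rather than stopping at the first failure, gives you the full list of what broke for the post-mortem.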

The Culture Bit

This isn’t just about tools. You need to change how your team thinks.

Proactive Instead of Reactive: Traditional networking is firefighting. Something breaks, you fix it. Test-driven networks are fire prevention.

Failure is Information: Tests will find problems. Don’t shoot the messenger. Use failures to improve your network.

Use the Data: Tests generate tons of data about network behaviour. Use it to make better decisions about capacity, architecture, operations.

Advanced Stuff for Later

Once you’ve got basics working:

Chaos Engineering: Deliberately break things to test resilience. Sounds scary but controlled failure testing is safer than discovering problems during real outages.

Predictive Testing: Use historical data to predict when problems might occur.

Intent-Based Networking: Define what you want the network to do, then test that it actually does it.

Start Tomorrow

Don’t wait for the perfect solution:

Pick one critical path
Write a simple test to validate it works
Run the test regularly
Expand gradually

Start with a single test. Once you see value, you’ll want to test everything.

Test-driven networks change how you think about networking. Instead of hoping things work, you know they work because you test continuously. Instead of reacting to problems, you prevent them.

Users expect reliable networks. Business expects predictable operations. Test-driven networks deliver both while making your job less stressful.

It won’t happen overnight, but every step makes your job easier and your network more reliable.

In the next post we’ll look at the pyATS ecosystem and how all the pieces fit together. You’ll see the architecture that makes pyATS powerful for network testing.

Learn more about network automation best practices at Cisco DevNet

Cisco pyATS Blog 2: Introduction to NetDevOps and Test-Driven Networks