Back in 2012 I was hired by a telecom company as a primary/senior Quality Assurance Engineer. My main responsibility was testing software quality for their web development projects, but a need came up within my first few months: build a tool to test call completion across different carriers. The complexity here is well beyond web automation. Initially, this started as a simple shell script kicked off by a Jenkins CI job. In time, I turned it into a full-fledged web application, with a Vue.js frontend and a backend written in Grails. While I won’t go through the specific code, I will show how someone can conceptually set up their own framework for testing call completion.
SIP Handshake
There are several ways to test call completion, but perhaps the easiest is to verify that the SIP handshake completes without error. Much like a web application, a SIP call returns response codes: 200 for success, and the 4xx/5xx ranges for failures. This wasn’t my first approach, but it became my finalized methodology.
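The response-code idea can be sketched as a small classifier. This is a minimal illustration, not code from the original tool; the function name is mine, and the boundaries follow RFC 3261’s response classes (1xx provisional, 2xx success, 3xx redirection, 4xx/5xx/6xx failure):

```python
def classify_sip_response(code: int) -> str:
    """Map a SIP final response code to a test verdict (illustrative sketch)."""
    if 100 <= code < 200:
        return "provisional"   # e.g. 180 Ringing: not final, keep waiting
    if 200 <= code < 300:
        return "pass"          # 200 OK: the handshake completed
    if 300 <= code < 400:
        return "redirect"      # 3xx: the call was redirected elsewhere
    if 400 <= code < 700:
        return "fail"          # 4xx/5xx/6xx: client, server, or global failure
    raise ValueError(f"not a SIP response code: {code}")
```

A test harness only needs to distinguish "pass" from "fail"; provisional responses are part of a normal call setup and should not be treated as errors.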
Creating a SIP call for testing purposes requires considerable setup. A soft PBX like Asterisk or FreeSWITCH is required to set up the DNIS/phone number routing. Whatever solution you use, it needs to be solid: if the method used to broker a call is flaky, the test strategy will not be useful in spotting external call completion problems.
SIP call flows can be set up to test a variety of conditions, including:
- Validating a SIP proxy is working
- Validating an outbound carrier is completing calls
- Validating inbound carriers are completing calls
- Validating Cisco telecom devices (e.g., CUBEs) are operational
- Validating geographical outages
Automating SIP Calls
If you can manage setting up automated calls through Asterisk or FreeSWITCH, or even a code-based library, that would be the best methodology. I’ve had various levels of success and failure with those approaches. In the end, I opted for command line utilities: I would spawn a command line utility from code (Groovy or Python) to run a SIP call and validate the result through logic patterns.
The command line utility I chose is a tool called SIPp. SIPp is intended for load testing internal hardware to certify it for production use; however, given specific parameters, it will place a single call. What makes SIPp very useful is that it can be fed XML for specific testing scenarios. The XML itself defines the expected SIP handshake. These XML scenarios can also insert pauses and use PCAP files to play back audio during a SIP call.
SIPp
As mentioned, SIPp is a command line utility that places SIP calls and validates them against an expected handshake defined in XML scenarios.
An example XML scenario for SIPp can be found HERE. That linked page has a variety of examples for different scenarios. Once the SIPp handshake is working, logic can be set up around the dialer.
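To make the scenario idea concrete, here is a hedged sketch that writes a trimmed scenario file from Python. The XML mirrors the structure of the stock UAC scenario that ships with SIPp (send INVITE, tolerate provisional responses, expect 200, ACK, pause, BYE), but the SIP message bodies are deliberately elided; in practice you would export the real template with `sipp -sd uac` and adapt it:

```python
from pathlib import Path

# Skeleton of a SIPp UAC scenario; message headers are elided on purpose.
# Export the complete template with `sipp -sd uac` and edit from there.
SCENARIO = """<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE scenario SYSTEM "sipp.dtd">
<scenario name="Basic call completion check">
  <send retrans="500"><![CDATA[
    INVITE sip:[service]@[remote_ip]:[remote_port] SIP/2.0
    ...headers elided; take them from the exported uac.xml...
  ]]></send>
  <recv response="100" optional="true"/>
  <recv response="180" optional="true"/>
  <recv response="200" rtd="true"/>
  <send><![CDATA[ ACK ...elided... ]]></send>
  <pause milliseconds="3000"/>
  <send retrans="500"><![CDATA[ BYE ...elided... ]]></send>
  <recv response="200" crlf="true"/>
</scenario>
"""

def write_scenario(path: str) -> Path:
    """Write the scenario file that the -sf flag will point at."""
    p = Path(path)
    p.write_text(SCENARIO)
    return p
```

The `<recv response="200" rtd="true"/>` line is where the handshake is certified: if the expected 200 never arrives, the scenario (and the call) fails.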
When constructing the call, in whatever language you’re using, you can dynamically replace key parameters like so:
// Grails/Groovy code example: spawn sipp as an external process
def sipCall = "sipp ${scenarioFile} -s ${phoneNumber} ${server} -m 1 -i ${localIpAddress} -timeout 30s -timeout_error -cid_str ${randomIdentifier}".execute()
sipCall.waitFor()  // sipp exits 0 when the call completed successfully
The above is a simple example. Complexity can be added to send values back into the XML scenario, but that’s outside the scope of this brief example.
The scenario file is passed in dynamically in the above example. The actual value could be something like:
"-sf ${rootpath}/scenarios/uac.xml"
The -sf flag tells sipp which scenario file to use. ${rootpath} is something I create dynamically as the web app is exploded on a web server after deployment. The XML file itself can be exported from sipp initially and then modified to suit one’s needs.
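Since the article mentions spawning from Groovy or Python, here is a hedged Python counterpart to the Groovy one-liner above. It builds the same flag set as an argv list (safer than a shell string); the function name, paths, and numbers are illustrative assumptions:

```python
import random
import string

def build_sipp_command(rootpath, phone_number, server, local_ip,
                       scenario="uac.xml"):
    """Assemble the sipp argv for a single-call completion test (sketch)."""
    # Random Call-ID suffix so this test call can be found in carrier logs.
    call_id = "".join(random.choices(string.ascii_lowercase + string.digits, k=12))
    return [
        "sipp",
        "-sf", f"{rootpath}/scenarios/{scenario}",  # scenario file to run
        "-s", phone_number,        # dialed number (SIPp's "service" field)
        server,                    # remote proxy/carrier under test
        "-m", "1",                 # stop after exactly one call
        "-i", local_ip,            # bind to this local address
        "-timeout", "30s",         # give up after 30 seconds...
        "-timeout_error",          # ...and treat the timeout as an error
        "-cid_str", call_id,       # custom Call-ID string for correlation
    ]
```

The list can then be handed to `subprocess.run(...)` and the call judged by the exit code, since sipp exits 0 when all calls were processed successfully.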
Automation Logic
Placing SIP calls requires considerable logic. It’s not just “call sipp and see if the call completes.” Calls have a lot of complexity and points of failure. How often do we email people about a failed call? Should thresholds be carrier driven (i.e., overseas carriers given more failure opportunities than local or large carriers)? How many times do we re-run a test before marking it a failure? Should we block the tests from running if a check shows an internal system is down? Is it useful to group servers into geographical areas and notify if an entire geographical region is failing?
The answers to those questions determine the scope and complexity of the logic. Note that in my example, the localIpAddress is dynamic: maybe you want to run your test harness on machines with redundancy, so the local IP may change from time to time. If so, you’ll need to dynamically ingest it as a SIPp parameter.
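One possible answer to the retry and carrier-driven questions above can be sketched as a small wrapper: re-run a call a carrier-specific number of times before declaring failure. The function, field names, and default of two retries are my assumptions, not the article’s actual logic:

```python
def run_with_retries(place_call, carrier, retries_by_carrier, default_retries=2):
    """Retry a flaky call before failing it.

    place_call: zero-arg callable returning True when the call completes.
    retries_by_carrier: per-carrier retry budget, e.g. {"overseas": 4},
    letting lossier routes get more chances than local carriers.
    """
    attempts_allowed = retries_by_carrier.get(carrier, default_retries) + 1
    for attempt in range(1, attempts_allowed + 1):
        if place_call():
            return {"carrier": carrier, "passed": True, "attempts": attempt}
    return {"carrier": carrier, "passed": False, "attempts": attempts_allowed}
```

In a real harness, `place_call` would wrap the sipp invocation and check its exit code, and the returned record would feed the alerting and reporting layers.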
Data Capture
What data needs to be displayed in a front-end UI? Will you use something like Elasticsearch to parse log data, or a database of results to populate a web app UI? For my needs, I chose the latter. I use Elasticsearch to monitor the health of the test harness, but for results I use a frontend written in Vue. My frontend is a dynamic list of results for each carrier, number, Cisco device, proxy, server or other element under test, with each broken into sections on the page. Along with the most recent pass/fail is some analysis of failures over time (failure % over the past hour, past 3 hours, past 6 hours, past 12 hours and past 24 hours).
Knowing what you want to display will determine the logic necessary in code.
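The rolling failure-rate figures mentioned above reduce to a simple aggregation over timestamped results. This is an illustrative sketch, not the article’s code; the tuple layout and function name are assumptions:

```python
from datetime import datetime, timedelta

# The windows from the article: failure % over 1, 3, 6, 12 and 24 hours.
WINDOWS_HOURS = (1, 3, 6, 12, 24)

def failure_rates(results, now=None):
    """Compute failure % per window from (timestamp, passed) tuples."""
    now = now or datetime.utcnow()
    results = list(results)
    rates = {}
    for hours in WINDOWS_HOURS:
        cutoff = now - timedelta(hours=hours)
        window = [passed for ts, passed in results if ts >= cutoff]
        failures = sum(1 for passed in window if not passed)
        # None when no tests ran in the window, to distinguish from 0% failures.
        rates[f"{hours}h"] = round(100.0 * failures / len(window), 1) if window else None
    return rates
```

Whether this runs as a database query or in application code, the shape of the answer (one percentage per window, per element under test) is what drives the UI layout.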
Audio Quality
Do you want to get an idea of the call audio quality? Typically this is best handled by third-party software like VoIPmonitor. However, since you are already placing a call to test these carriers, you could feasibly record the call audio and run it through a comparison algorithm like PESQ. This is doable if the tests run sequentially and you rebuild the audio from a packet capture. If multiple tests run in separate, simultaneous threads, however, the complexity becomes prohibitive: a simple packet capture will pick up data from multiple simultaneous audio streams.
Conclusion
Your end result should fit your setup and organization. Maybe you don’t require a web app, but rather a script with great logging that feeds into an ELK stack for dashboard review. Or maybe you prefer a simple one-off script that runs once in a while, not a full-blown monitoring system that kicks off tests every few minutes. Start small. Start with something manageable. In time, add on as requirements grow. But know that call quality testing is something that can be done in house, and relatively easily. It just requires a bit of time and effort.