Testing your application is a necessary part of the software development process. Each application has a set of behaviors that it is expected to perform consistently. As developers make code changes, established behaviors must be verified before the application is updated in the production environment. This verification of prior behaviors has traditionally been referred to as regression testing, in contrast with acceptance testing, which is conducted on new features. In both cases, an individual runs a set of tests and matches the results against expected outcomes. Any discrepancies are logged as bugs for developers to fix.

Regression testing in particular can be very resource-intensive when performed manually by your QA team. As an application matures, the number of regression tests grows, sometimes resulting in hundreds (or thousands) of behaviors that need to be verified. It becomes impractical to run these manually before each release. Usually, the team adapts by postponing regression testing, and the large release it gates, until the end of a sprint.

Of course, the downside to this delay is that it increases the time between code completion and feature release to users. It also means bugs are reported relatively long after the developer made the code changes. To address a bug, the developer first has to refresh their memory of the changes they made at that time.

A solution is to reduce the time it takes to conduct regression testing through automation. On a modern, UI-driven internet application, much of this manual testing can be replaced by automated tests that simulate user interactions. These tests can be parallelized and completed far faster than a human could run them. In this post, we will explore a few mechanisms available for automating regression testing and how you can measure your testing effectiveness.

Unit Testing

Reducing the overhead of regression testing starts with unit tests. These are the code-level tests that developers create to exercise individual functions or classes within a software component. A unit test provides a written contract that the code must satisfy: it specifies discrete inputs and an expected output in the form of an assertion.
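
To make this concrete, here is a minimal sketch of what such a test could look like in Python with pytest. The function under test (a hypothetical calculate_discount) and its rules are invented for illustration; the point is the shape of the test: discrete inputs and an assertion on the expected output.

    # discount.py -- a hypothetical function under test
    def calculate_discount(order_total, is_member):
        """Members get 10% off orders of $100 or more; everyone else pays full price."""
        if is_member and order_total >= 100:
            return round(order_total * 0.90, 2)
        return order_total

    # test_discount.py -- the unit test: discrete inputs, an asserted expected output
    from discount import calculate_discount

    def test_member_discount_applied():
        assert calculate_discount(200.00, is_member=True) == 180.00

    def test_no_discount_below_threshold():
        assert calculate_discount(50.00, is_member=True) == 50.00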

Unit tests can be executed by each developer in their development environment before checking code into a shared repository. This catches logic errors created by that developer’s own changes. To ensure that one developer’s changes remain compatible with every other developer’s subsequent changes, we use continuous integration. Continuous Integration (CI) is a practice that requires developers to frequently integrate code into a shared repository. A CI server monitors the shared repository for changes and kicks off a “build” when one arrives: it executes the code’s build process, deploys it into a test environment, and then runs the set of unit tests. The output of the unit test run is published to the development team. Popular CI servers include Jenkins and Bamboo.
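
For a Python project, the "run the set of unit tests" step of that build can be as simple as invoking the test runner and letting its exit code determine whether the build passes. A minimal sketch, assuming pytest as the runner and a tests/ directory (both are assumptions for illustration):

    # ci_test_step.py -- the test stage a CI job (Jenkins, Bamboo, etc.) might execute
    import subprocess
    import sys

    result = subprocess.run(
        [sys.executable, "-m", "pytest", "tests/", "--junitxml=test-report.xml"]
    )
    # Publishing test-report.xml is one way the CI server can display results to the team.
    sys.exit(result.returncode)  # a non-zero exit marks the build as failed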

As it relates to reducing the overhead of regression testing, a comprehensive set of unit tests can go a long way. They catch simple logic errors before they ever reach the QA team. Developers should write unit tests as they code, checking them in alongside the changes they cover rather than postponing them until after release. You should also measure unit test coverage to understand how many logical code paths are exercised. A tool like Clover will examine your code base and provide a report of test coverage. How much coverage to target is debatable; more is usually better, but 100% is not realistic. Depending on the maturity of your test program and development team, somewhere between 50% and 90% should work as a target. If your team is just starting out with unit testing, increasing coverage can be a continuous improvement exercise.
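
Clover targets Java code bases; for a Python project, the same idea can be sketched with the coverage package, including enforcing whatever floor your team agrees on. The 70% threshold below is just an illustrative value from the 50-90% range above, and "myapp" is a placeholder package name.

    # run_coverage.py -- run the unit tests under coverage and enforce a minimum
    import sys
    import coverage
    import pytest

    cov = coverage.Coverage(source=["myapp"])  # "myapp" is a placeholder package
    cov.start()
    test_exit_code = pytest.main(["tests/"])   # run the suite in-process
    cov.stop()
    cov.save()

    total = cov.report()                       # prints per-file coverage, returns total %
    if total < 70.0:                           # illustrative target; pick your own
        print(f"Coverage {total:.1f}% is below the 70% target")
        sys.exit(1)
    sys.exit(test_exit_code)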

It is also important that unit test failures surfaced by CI builds are addressed in a timely manner. Some development shops prevent any new commits until a broken build is fixed. This is one way to ensure that unit test failures are fixed quickly, but it blocks work for all developers. On the other hand, if unit test failures are ignored, their count will quickly grow until the continuous integration process is no longer useful. Each team can establish its own standard for fixing unit test failures, based on its culture and maturity. I think a reasonable guideline is to fix unit test failures by the end of the day.

Functional Testing

Functional testing generally involves testing the interactions between the application’s user interface and its back-end logic. As with unit testing, inputs are sent to the application and expected outputs are verified. Testing of user interfaces spans all devices that the application supports – web, mobile, desktop, etc.

Because the user interface device is the test medium, this type of testing is initially performed manually by QA engineers. For acceptance testing, manual testing makes sense, but for large regression test suites it can be unwieldy. Fortunately, functional testing can be automated with a tool that simulates interactions with the particular UI device, directly making inputs and capturing responses. Test frameworks exist for each device type. For the web, a test tool can simulate actions in a browser; the most popular open-source tool is Selenium, which supports most modern browsers. For mobile devices, ideally you would use a test framework that supports “cross-platform” automated testing, exposing a single API for interactions with iOS, Android, and mobile web. A popular open-source tool for this is Appium (created at Zoosk, incidentally).
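
To make the pattern concrete, here is a minimal sketch of a browser test written in Python against Selenium. The page URL, element IDs, and expected title are hypothetical; the point is the pattern of driving the browser, making inputs, and asserting on the response.

    # test_login.py -- a hypothetical browser-driven functional test using Selenium
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    def test_login_shows_dashboard():
        driver = webdriver.Chrome()  # assumes a local Chrome + driver setup
        try:
            driver.get("https://example.com/login")            # hypothetical URL
            driver.find_element(By.ID, "username").send_keys("qa_user")
            driver.find_element(By.ID, "password").send_keys("not-a-real-password")
            driver.find_element(By.ID, "submit").click()
            assert "Dashboard" in driver.title                 # the expected outcome
        finally:
            driver.quit()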

Maintaining the automated functional tests is usually the responsibility of a dedicated role on your QA team: the test automation engineer. Requirements for a test automation engineer are more advanced than for a manual black-box tester. An automation engineer needs the ability to code, as they will write test scripts within the automation framework. It is also a good idea for an automation engineer to possess some devops skills, as they will likely own the test environment. Their role is to create new automated tests as functionality is added to your applications and to update tests as business logic changes.

Automating functional tests will significantly reduce the overhead of each regression test pass. Because they interact through a user interface, these tests still take a nontrivial amount of time to run; a set of a few hundred Selenium web tests, for example, can take a few hours to execute. Even so, it is feasible to run them several times a day, or at least nightly. The output should be distributed to the team for investigation. Usually, a QA engineer will check test failures manually first and then file bugs for verified code issues.

Service Interface Testing

Stand-alone services do not have a user interface. They are typically fronted by a RESTful API that facilitates interaction with a back-end application. Services usually encapsulate a set of related application functions, involving substantial business logic and interactions with data stores. These types of interfaces are easy to test through automation: each API endpoint is called individually with standard inputs, and the test then checks the API’s response for an expected value. A test automation engineer scripts these tests using the documentation describing the service interface. Open-source tools that can automate service interface testing include SoapUI and PyRestTest.
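
As an example, tests against a hypothetical GET /users/{id} endpoint might look like the following in Python with the requests library; tools like SoapUI and PyRestTest provide the same capability without hand-written code. The base URL, endpoint, and response fields are assumptions for illustration.

    # test_users_api.py -- hypothetical service interface tests
    import requests

    BASE_URL = "https://api.example.com"   # placeholder service URL

    def test_get_user_returns_expected_fields():
        response = requests.get(f"{BASE_URL}/users/42", timeout=10)
        assert response.status_code == 200
        body = response.json()
        assert body["id"] == 42            # expected value per the service documentation
        assert "email" in body

    def test_unknown_user_returns_404():
        response = requests.get(f"{BASE_URL}/users/999999", timeout=10)
        assert response.status_code == 404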

By moving your application functionality into stand-alone services with open interfaces, you will be able to automate more of your regression testing. This reduces the amount of behavior that must be verified through UI interactions, and automated regression tests against an API generally run faster than tests driven through a user interface. I highlighted the advantages of a service-based architecture in a prior post.

Measuring your Test Automation Program

Creating and maintaining your test automation suites represents a major resource investment. As with any allocation of resources, it is important to track the cost and benefit associated with that investment. You should collect a set of metrics that represent the amount of effort put into the test automation program and its relative success. Your QA manager should own this data collection, summarizing the data periodically and presenting it to the team. These metrics will generate insights and provide feedback for further changes to the automation program as it evolves.

Here is a list of sample metrics that are useful to collect:

  • Time spent on automated test creation. Track the amount of time each test automation engineer spends creating new automated tests. This is usually required when functionality is added to the application. These numbers can be aggregated on a per-sprint basis.
  • Time spent updating automated tests. Track the time spent updating existing automated tests. Updates to existing tests are necessary when the business logic for an existing feature changes.
  • Bugs identified by automation. When automated tests are run, review the output. If a test failure results in filing a bug for a developer to fix, record this event. Aggregate the number of bugs caught by automation on a per-sprint basis. These bugs represent the primary benefit of test automation.
  • Bugs missed by automation. After code is pushed to production, issues will be reported by users. If an issue report results in logging a bug to a developer, make note of the bug. This represents a failure that automation missed. Examine why testing didn’t catch the bug.
  • Time spent on manual regression testing. Hopefully, as more testing is automated, this time will go down.

These metrics should be reviewed periodically. Ideally, you will see the number of bugs caught by automated testing increase and the amount of manual testing decrease.

Expectations for Automation Test Coverage

Over time, your metrics will give you a strong sense of the effectiveness of your test automation program. Pay particular attention to the number and types of bugs that are and are not caught by your automated functional regression testing. In my experience, automation of functional testing will not catch every bug, so some set of manual regression tests for critical functionality in your application is still advisable. These can cover major functions like user registration, search, product pages, and the shopping cart. Your product managers should be able to help identify the areas of the application that would have major business impact if they didn’t work. For these, your QA team can craft a short list of functional tests to run manually before each major release. While often redundant with the automated tests, they provide a good counterbalance to an evolving test automation program. Once you have more confidence in your automated testing, you can cut back on these manual tests.

Also, as you collect data on your test cycles, your QA manager should share that data with other engineering leads and product managers. It can be summarized monthly or quarterly and presented at group meetings. Sharing this type of data will generate productive conversations about the state of testing and what improvements can be made. It should also help answer questions about the investment being made in the automated test program and the benefit it delivers.