7 tenets of software testing

Back in the January 2004 issue of MSDN Magazine, Don Box introduced four fundamental tenets for developing service-based or connected systems.

  • Boundaries are explicit
  • Services are autonomous
  • Services share schema and contract, not class
  • Service compatibility is determined based on policy

That inspired me, in 2005, to develop my own list of guiding principles for software testing. These tenets documented some of the key lessons I had learned over years of delivering software at companies like Microsoft. The original seven testing-focused tenets were:

  • You can’t test everything so you have to focus on what is important.
  • If you are going to run a test more than once, it should be automated.
  • Test the product continuously as you build it.
  • Base your decisions on data and metrics, not intuition and opinion.
  • To build it, you have to break it.
  • Apart from Test-Driven Development, a developer should never test their own software.
  • A test is successful when the software under test fails.

You can’t test everything, you have to

FOCUS ON WHAT’S IMPORTANT

Unit testing is good. Test-Driven Development (TDD) is great, and it is something that every developer should do. However, like most development techniques, TDD is not a silver bullet. TDD is primarily focused on defining how a class should work, implementing that class, and then verifying that the implementation performs as expected. This post isn’t about TDD or unit testing; it is about application-level testing. I mention TDD only because it is the “exception to the rule”, and the “rule” is the true subject of this post.

A couple who are good friends of my wife and me recently had their first child. The child’s father is an orthopaedic surgeon who, during his years as an emergency ward doctor, delivered several babies. Before the birth I asked him whether, being qualified and experienced, he could arrange to deliver the baby himself if he wanted to. He answered pretty much as I expected: he would never consider delivering the baby himself, as he had too much emotional investment in the patient, his wife, and the event itself.

What the heck does this have to do with testing, I hear you ask? Just as a surgeon won’t operate on friends or family unless it is an emergency, a developer shouldn’t test their own code. The reason is clear: a developer simply has too much emotional attachment to their own code to test it objectively.

Development and testing are two diametrically opposed disciplines. Development is all about construction, and testing is all about demolition. Effective testing requires a specific mindset and approach in which you try to uncover developer mistakes, find holes in their assumptions, and expose flaws in their logic. Most people, myself included, are simply unable to place themselves and their own code under such scrutiny and still remain objective.

Let’s say that a developer has to write some code that calculates a sales commission, where the commission is normally 5% but rises to 7% for sales over ten thousand dollars, and they implement the following code:

// 5% for sales under $10,000
if (SalesAmount < 10000.00)
{
    Commission = SalesAmount * 0.05;
}
// 7% otherwise, which includes a sale of exactly $10,000
else
{
    Commission = SalesAmount * 0.07;
}

The developer has assumed that a sale of exactly $10,000 should earn 7% commission. If they are also testing this code, they might write tests similar to the following:

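[Test]
public void VerifyLowerCommission()
{
    // Just below the $10,000 threshold: expect the 5% rate.
    // The third argument is a tolerance, because the values are doubles.
    Assert.AreEqual(499.9995, CalculateCommission(9999.99), 0.00001);
}

[Test]
public void VerifyHigherCommission()
{
    // Just above the $10,000 threshold: expect the 7% rate.
    Assert.AreEqual(700.0007, CalculateCommission(10000.01), 0.00001);
}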

The problem with these tests is that, even though they achieve 100% code coverage, the developer has based them on the same assumptions and thought processes they used when writing the code itself. In this contrived example, let’s assume the requirement meant exactly what it said: the 7% rate applies only to sales over $10,000, so a sale of exactly $10,000 should earn 5%. Even though these test cases would pass, the calculation is actually wrong. This type of bug would probably manifest itself infrequently, as it would require a sale of exactly $10,000 to cause a problem and would otherwise remain dormant.

Having someone impartial write the tests for the code significantly increases the chance of finding that type of issue. This helps because they will form their own ideas about how things should work, and challenge the developer’s assumptions.
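As a rough illustration, here is the kind of boundary check an impartial tester might add when working from the requirement rather than from the code. This is only a sketch under a few assumptions: it wraps the developer’s logic in a CalculateCommission function, uses decimal instead of double to sidestep floating-point rounding, and replaces NUnit with a small hypothetical Passes helper so that it runs as a plain console program. It reads the requirement as “7% only for sales over $10,000”.

```csharp
using System;

public static class CommissionCalculator
{
    // The developer's logic from the post, wrapped in a function.
    // decimal is an assumption made here to avoid floating-point rounding.
    public static decimal CalculateCommission(decimal salesAmount)
    {
        if (salesAmount < 10000.00m)
        {
            return salesAmount * 0.05m;
        }
        return salesAmount * 0.07m;
    }

    // Hypothetical helper: true when the result matches the expectation.
    public static bool Passes(decimal sale, decimal expected)
    {
        return CalculateCommission(sale) == expected;
    }

    public static void Main()
    {
        // Check just under, exactly on, and just over the boundary.
        Report("just under", Passes(9999.99m, 499.9995m));
        // Requirement as worded: 7% only for sales OVER $10,000,
        // so a sale of exactly $10,000 should earn 5%, which is $500.
        Report("on boundary", Passes(10000.00m, 500.00m));
        Report("just over", Passes(10000.01m, 700.0007m));
    }

    private static void Report(string name, bool passed)
    {
        Console.WriteLine(name + ": " + (passed ? "PASS" : "FAIL"));
    }
}
```

Run against the developer’s implementation, the two original checks report PASS and the new “on boundary” check reports FAIL, which is exactly the dormant bug the developer’s own tests could not see.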

Of course, the title of this post is meant to get you thinking. Developers should test, but they should test someone else’s code, after first checking their own and then passing it to their friendly tester to really put it through its paces.

If you are going to run a test more than once,

It should be automated

Tenet: If you are going to run a test more than once, it should be automated.

Automation explained

Test automation means different things to different people in different contexts. When I talk about automated testing, I am referring to a method of executing a test without the human intervention it would otherwise require. In practical terms that may mean an NUnit test, a GUI test using a commercial testing tool, or a test written using an application’s internal scripting language. The technology is not the key concern; what matters is that the test can be run 100% without any human involvement.

Why automate?

The primary reason to automate tests is time. As a tester, you always need more time.

Automation can provide immense reductions in the time required to execute tests. My first introduction to automated testing, in 1997, condensed 5 days of manual testing effort (40 hours, assuming 8-hour working days) into 1 hour of automated execution, a 97.5% reduction in execution time.

Think about that for a moment. By automating our tests, we achieved the equivalent of adding 40 additional testers to the team, for a fraction of the price. In addition, we took the drudgery out of my team’s workday, increasing morale. More importantly, automation lets your testers perform more ad hoc testing, which is much more effective than performing the same manual test over and over again.

So that’s it then, should we just retrench all our testers, and use automation instead? Well, no.

Although Microsoft retrenched 62 Longhorn testers, citing automation as the cause, test automation is generally used to increase the test coverage that can be achieved with a given schedule and resources, not to reduce testing head count.

When implemented correctly, with enough hardware, automation allows you to execute all your tests at least once a day, every day. When combined with a daily build, you have one of the most powerful testing tools on the planet, if not the most powerful.

Build Verification or smoke testing

Every night after a successful compile, the daily build should automatically be packaged, deployed into a test environment, and smoke tested with an automated build verification test (BVT). The term “smoke test” comes from the idea of quickly plugging in some electronics and checking that no smoke comes out. While ideally you would run all your tests, the smoke test is typically a carefully selected subset of your full automated suite that can be run in about 10-20 minutes.

Automated regression testing

*regression n. “[To] relapse to a less perfect or developed state.”*

The goal of regression testing is to ensure that the quality of an application doesn’t decline as features are added. It faces a significant challenge: when an application needs it most, there are usually less time and fewer resources available to repeat all the tests that were executed when the product first shipped.

Without automation, the typical approach is to perform localised regression testing, which is limited to directly testing the area around the changes.

With an automation suite in hand, it is simply a matter of executing all the tests that were developed previously. This allows the maintenance programmers to make frequent releases, and allows the quality of the application to improve over time.

This is particularly important in light of the trend that Fred Brooks describes in his classic work, The Mythical Man-Month, where fixing a defect has a 20-50% chance of introducing another.

The largest automation project that I have personally worked on was a huge effort in which my client ran their $1M+ investment in tests on a rack of 25 dedicated machines that pounded away relentlessly, shortening their regression testing cycle by 75%, and that cycle was partially automated to begin with!

With any form of testing, you have to focus on what is important. Blindly trying to automate everything just to reach an arbitrary automation goal is contrary to that tenet. Does that mean that this tenet is wrong, then? No, I don’t think so. There is a significant difference between “should be automated” and “must be automated”. In my experience, most projects need a heck of a lot more automation than they have. If this tenet were called “you must automate what you can”, the whole point of it would have been lost. Personally, the highest I have ever achieved on a project was 95% automated, and that lasted for all of 1 day before we added more tests.

The #1 priority for a tester on any project should always be finding and logging issues. However, investing in the right amount of automation can make that a whole lot more effective.

A test is only successful when

the software fails

Tenet: A test is successful when the software fails.

As I discussed in tenet five, all software has bugs, and the goal of testing is to find them.

One of the ironies of testing is that when a test case runs without error we give it a big green tick, and say that it has passed. By doing this, we are incorrectly reinforcing that a test is successful if it doesn’t find any bugs. If we want to find defects and improve the quality of our software, we want our tests to fail.

This is a subtle but very important difference in testing approach. A good analogy is the tests performed by a doctor. When a doctor returns with test results for your sore leg, the last thing you want to hear is: “That pain in your leg, the tests didn’t find anything, so you are fine, just ignore it.”

There is a danger when we change our expectation away from “All tests must pass all the time”, to “We want tests to fail”. The danger is that our expectation will instead become: “Most of our tests should fail and it’s ok to have tests failing for weeks on end.”

Ideally, your goal should be to have a high fidelity test system, with a core set of automated regression tests, that suffers an occasional failure as developers evolve the application over time. In addition to the regression tests, you should be adding tests for things like known defects, and new tests to expand the test coverage. Ideally, you should expect any new test to fail the first few times it is run. When the issue you found is resolved, the test should execute without failure and then stay that way.

It is also important that your test suite has a high signal-to-noise ratio. As the number of tests increases, so will the amount of analysis that you need to perform when there are test failures.

Now if only I could charge for tests like a doctor does …