Working, as opposed to not broken
One of the interesting debates sparked by the discipline of Test-Driven Development, one of the practices of Extreme Programming, centers on whether the style of unit testing it recommends really has anything to do with "testing" as traditionally understood. For instance, Ian Smith at PARC wonders about the distinction between testing that is performed to find (or repair, or prevent) defects, and tests written to (implicitly) document the software's design (unit tests) and specifications (acceptance tests).
I'm more inclined to use a different distinction: one between "smoke tests", or indeed much of the post-development testing that goes under names such as "beta testing", and the more focussed kind of testing favored by XP.
"Smoke test" is a term used in the manufacture of electronic equipment: if you turn the thing on, and smoke escapes from it, you know something is wrong. In software, a smoke test is a set of operations exercising the major functions of the system; if the system crashes or malfunctions badly at some point, further testing is usually not warranted - it goes back into development for debugging.
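To make the idea concrete, here is a minimal sketch of a software smoke test in Python. The `smoke_test` helper and the operation names are hypothetical, invented for illustration; the point is only that such a test asks whether anything crashes, not whether any result is correct.

```python
def smoke_test(operations):
    """Run each (name, callable) pair in order and stop at the first crash.

    Returns None if nothing crashed, otherwise a short description of the
    first failure. No return value is ever checked for correctness.
    """
    for name, operation in operations:
        try:
            operation()
        except Exception as exc:
            return f"smoke escaped at '{name}': {exc!r}"
    return None


# Hypothetical "major functions" of a system, stubbed out for the sketch.
def start_up():
    pass


def open_document():
    raise RuntimeError("file format not recognised")


report = smoke_test([("start up", start_up), ("open document", open_document)])
```

Here `report` describes the crash in "open document", and nothing after that operation would be run. Had every operation returned quietly, the smoke test would pass, even if each of them computed the wrong thing.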
In my experience, much beta testing, and in fact much of all post-development manual testing, is barely one step above a smoke test. The traditional advice to users when "beta" quality software is released is: "Bang on it, and report back if it breaks". Manual testing most often reveals defects accidentally, as it were; it serves not to verify that a specific feature of the system is working as designed, but to check that the system as a whole isn't badly broken. (I understand that skilled, professional testers can write test plans that really validate functionality. I've just never worked anywhere that had them. Some teams must, though.)
This, by the way, is why a post-development test "phase" always takes longer than expected. The first things it finds are the really huge, really obvious defects - when the system is doing something entirely different from what it should (such as crashing). Then the smaller defects take longer to spot, because you have to look harder to see whether the system is doing what it's supposed to, or something slightly different. And so on - "there's always one more bug".
An XP-style automated test is supposed to test just one thing, just one feature. It should fail only if that feature is broken, or (of course) if a feature it depends on is broken. In the latter case, there should be a "smaller" test which also fails. Potentially, there could be several such levels of failing tests, corresponding to levels of features built atop each other. Ultimately, this should end with a unit test "close" to the defect, and in fact pinpointing its location, so that correcting the defect is almost instant. These tests only pass if the feature they test is really working - not just "not badly broken".
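The layering described above can be sketched with Python's standard `unittest` module. The functions `parse_price` and `order_total` are hypothetical, made up for this illustration: `order_total` is a feature built atop `parse_price`, and each has its own focused test.

```python
import unittest


def parse_price(text):
    """Convert a string such as "$4.50" to an integer number of cents."""
    dollars, _, cents = text.lstrip("$").partition(".")
    # Pad so that "4.5" means 50 cents and "4" means 0 cents.
    return int(dollars) * 100 + int(cents.ljust(2, "0"))


def order_total(prices):
    """Total of an order in cents; depends on parse_price."""
    return sum(parse_price(p) for p in prices)


class ParsePriceTest(unittest.TestCase):
    """The "smaller" test: fails only if price parsing itself is broken."""

    def test_dollars_and_cents(self):
        self.assertEqual(parse_price("$4.50"), 450)


class OrderTotalTest(unittest.TestCase):
    """Fails if totalling is broken, or if parse_price beneath it is broken."""

    def test_sums_all_items(self):
        self.assertEqual(order_total(["$4.50", "$1.25"]), 575)
```

Saved in a file, these can be run with `python -m unittest`. If someone breaks `parse_price`, both tests fail, and the smaller one points straight at the defect; if only `OrderTotalTest` fails, the problem is in the totalling. Either way, each test passes only when its feature returns exactly the expected value, which is a much stronger claim than "it didn't crash".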
This distinction has interesting consequences, which I'll explore in more detail in the next entry on testing. In particular, it plays well with another important distinction, this time between two contexts of testing (maintenance of code with too few tests, vs. greenfield development in the TDD style).