In my last post, I argued that tests are meant to express the relationship between specific parts of the code, not to repeat knowledge of interfaces and contracts. In my experience, the most valuable tests are those that exercise those interfaces and contracts indirectly, through the particular architecture implicit in their design.
The growth of agile testing is a recent phenomenon, and it now offers a good opportunity to talk about good practices, philosophy and development methodologies in the context of Agile testing. In particular, the Rails community is doing exceptional work in bringing tests to the forefront of the Agile discussion in the Web development community.
However, the success of testing lends itself to a lot of misunderstanding among novice developers and among developers who are not so used to TDD and BDD. Moreover, the equally recent multiplication of testing frameworks has resulted in a lot of bad code, as frameworks compete with each other by offering new features that, in some cases, are actively detrimental to the health of test suites.
In some ways, this is the same discussion as the one about the real difference between TDD and BDD, but I think the particulars of the subject deserve a little more emphasis. To sum up the argument, the point is that you should never use tests as a replacement for good architectural practices.
That may sound simple and obvious, but it is easy to find examples where testing frameworks not only fail to abide by that principle but actively encourage bad behavior. Taking Shoulda as an example, it’s very common to see code like this in projects that use it:
class UserTest < ActiveRecord::TestCase
  should_have_named_scope('recent(5)').finding(:limit => 5)
end
This kind of code doesn’t prove anything about the architecture of the class. The code above has several problems:
It’s redundant, because the first three clauses can and will be tested through their use in other parts of the code, viz., the controllers;
It’s brittle, because it’s too tied to the class implementation details;
It’s little more than sanity testing to see if the developer remembered to properly declare some model stuff;
It’s exposing orthogonal implementation issues, like the fact that the application is using a database-backed persistence engine, in the case of the index matcher.
Overall, the tests above are almost completely useless. There may be some justification for the named scope test, but it’s still redundant.
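As a contrast, here is a minimal plain-Ruby sketch (no Rails, no database) of a test that reaches a "recent" query only through the code that consumes it. `User`, `UserStore` and `Dashboard` are hypothetical names invented for this illustration; they are not part of Shoulda or of any project mentioned in this post.

```ruby
require "minitest/autorun"

# Hypothetical stand-ins, invented for this sketch.
User = Struct.new(:name, :created_at)

class UserStore
  def initialize(users)
    @users = users
  end

  # Plays the role of the `recent` named scope: newest users first.
  def recent(limit)
    @users.sort_by(&:created_at).reverse.first(limit)
  end
end

# A consumer of the store, standing in for a controller action.
class Dashboard
  def initialize(store)
    @store = store
  end

  def newest_member_names
    @store.recent(2).map(&:name)
  end
end

class DashboardTest < Minitest::Test
  def test_lists_the_two_newest_members_newest_first
    store = UserStore.new([
      User.new("ana", 1),
      User.new("bob", 3),
      User.new("cy", 2)
    ])

    assert_equal %w[bob cy], Dashboard.new(store).newest_member_names
  end
end
```

The test says nothing about how `recent` is declared. It would keep passing if the query moved to another class, and it would fail if the ordering or the limit ever stopped matching what the consumer expects.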
Worse yet, there are examples like the Remarkable matcher named should_have_before_save_callback, which is actually detrimental. A test that exposes so much of the inner functionality of a business object has absolutely no justification to exist in the first place. It’s a complete deviation from what TDD represents.
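To make the alternative concrete, the sketch below tests what a save-time hook achieves instead of asserting that the hook exists. `Account`, its `save` method and the e-mail normalization rule are all hypothetical, written in plain Ruby for illustration; this is not Remarkable or ActiveRecord code.

```ruby
require "minitest/autorun"

# Hypothetical business object, invented for this sketch.
class Account
  attr_reader :email

  def initialize(email)
    @email = email
  end

  # Stands in for persisting the record.
  def save
    normalize_email # the "before save" step, an implementation detail
    true
  end

  private

  def normalize_email
    @email = @email.strip.downcase
  end
end

class AccountTest < Minitest::Test
  # Asserts the observable result, not the presence of a callback.
  def test_saved_email_is_normalized
    account = Account.new("  Ana@Example.COM ")
    account.save

    assert_equal "ana@example.com", account.email
  end
end
```

If the normalization later moves into the constructor or into a separate object, this test does not need to change, which is the property a callback matcher gives up.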
Tests, once again, are about interoperability between parts of the code. They are part of an architectural discourse that tries to remain focused not on implementation details but on the growth of the code base. The goal, as always, is to write the smallest body of tests (axioms, if you will) that gives a proper indication of the validity of a given body of code. Simplicity, in other words, which I believe should be an explicit goal of good architectures.
I like most of what Joel Spolsky and Jeff Atwood write, but the last conversation between the two of them in their regular podcast shows a blatant lack of knowledge about what tests and TDD really are.
At the core of their arguments is the idea that high code coverage through tests–Jeff Atwood mentions the 95%-plus range–makes the maintenance of the tests themselves time consuming, considering the proportion of the tests that need to be changed when the code changes. A secondary argument is that tests are more suited for legacy code, except for the kind of new code that has natural rigidity, as, for example, the specification for a compiler.
The answer to the second argument is simple: all code is legacy. Code that becomes production code is instantly made legacy, and the argument that there is some difference between “older” and “newer” code is dubious at best.
Reading the transcript of their dialog, it is possible to identify a confused notion of what tests really are, especially when both talk about the relationship between testing and architecture, something that in the agile context is commonly referred to as TDD or BDD.
That confusion, the idea that tests are meant to cover method or class interfaces, is extremely common even among practitioners of agile testing methods, whether they propose tests as design tools, as is the case with TDD and BDD adopters, or simply use tests as post-coding tools to verify code behavior in an automated way.
I can sympathize with the argument that 100% code coverage is usually unnecessary. In fact, 100% code coverage never means that your code, and by extension your architecture, is without flaws.
First, because real 100% code coverage is impossible to achieve for any meaningful body of code. Dependencies make that a given. Second, because no matter how many tests you have, cyclomatic complexity will always get you at the most inappropriate times. No matter how much white- or black-box testing you’re doing, the number of execution paths that complete coverage would have to exercise grows exponentially with the number of branches in your code.
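A small sketch of that gap between covered lines and covered paths, using a hypothetical `share_per_payer` helper invented for this illustration. Two tests are enough to run every line and take each conditional both ways, yet one of the four paths through the method still fails.

```ruby
require "minitest/autorun"

# Hypothetical helper, invented for this sketch: split a bill evenly
# among whoever is paying. Two independent conditionals give
# 2 ** 2 = 4 possible paths through the method.
def share_per_payer(total, owner_pays:, guest_pays:)
  payers = 0
  payers += 1 if owner_pays
  payers += 1 if guest_pays
  total / payers
end

class SharePerPayerTest < Minitest::Test
  # Together these two tests run every line and take each
  # conditional both ways.
  def test_owner_pays_alone
    assert_equal 100, share_per_payer(100, owner_pays: true, guest_pays: false)
  end

  def test_guest_pays_alone
    assert_equal 100, share_per_payer(100, owner_pays: false, guest_pays: true)
  end

  # The path neither of them took is the one that fails.
  def test_nobody_paying_is_the_uncovered_path
    assert_raises(ZeroDivisionError) do
      share_per_payer(100, owner_pays: false, guest_pays: false)
    end
  end
end
```

With n independent conditionals there are 2 ** n paths, so 20 of them already mean 1,048,576 combinations. A coverage report that shows every line as executed says nothing about how many of those combinations were actually tried.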
There is also another factor, a variation on the 80/20 rule: most of the benefit you will ever get from testing comes from the most complex parts of your code, but the real gain comes from catching the tiny deviations that blindside you on a lazy Tuesday. In this case, the more coverage you have, the easier it will be to introduce new tests.
And that’s the real reason why Spolsky and Atwood’s argument fails: tests are not about interfaces, APIs or contracts. They’re about the relationship between the different pieces of your code. In that distinction lies the root of one of the biggest debates raging in the agile testing community: what the real difference between TDD and BDD is.
My answer is centered around a small reinterpretation of what TDD is. Instead of seeing it as Test-Driven Development, I see it as Test-Driven Design.
If you’re using tests as a way to guide your design, that means you’re worried more about knowing how the pieces fit together than about how they work, as mentioned above.
From the transcript: “But the real problem with unit tests as I’ve discovered is that the type of changes that you tend to make as code evolves tend to break a constant percentage of your unit tests. Sometimes you will make a change to your code that, somehow, breaks 10% of your unit tests.”
Of course you can make changes that will break 10% of your tests, but in my experience that will only happen if your tests are brittle and if your design is already compromised. In that case, you can throw away the tests because they’re not helping anyone.
A couple of weeks ago, I made a substantial change in a system I wrote. I had to change a middleware protocol engine from DRb (distributed Ruby) to JSON over HTTP. This particular code is 100% covered.
Because of the protocol change, a considerable part of the code was touched in some way. But only three or four new tests had to be written to deal with representation changes–something that will also be of use in future protocol additions–and none of the existing tests was modified. Code was moved around, changed to new classes, but, all in all, the tests remained the same.
The explanation for what happened is simple: while there are a few tests dealing with specific interfaces, most of them are concerned with the relationship between the parts of the application: how data leaves one part of the application in one format and is reinterpreted in a different format suitable for another part, how a given AST is reorganized to suit the language generator in a different part of the application, and so on.
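The sketch below illustrates the kind of test I mean. It is not the code of the system described above: `MarshalCodec`, `JsonCodec`, `Sender` and `Receiver` are hypothetical names, with Marshal standing in for the DRb-era representation (DRb serializes objects with Marshal) and JSON for the newer one. The test only states that what one side packs, the other side understands.

```ruby
require "json"
require "minitest/autorun"

# Hypothetical codecs, invented for this sketch.
module MarshalCodec
  def self.dump(message)
    Marshal.dump(message)
  end

  def self.load(payload)
    Marshal.load(payload)
  end
end

module JsonCodec
  def self.dump(message)
    JSON.generate(message)
  end

  def self.load(payload)
    JSON.parse(payload)
  end
end

# One part of the application: turns a command into a payload.
class Sender
  def initialize(codec)
    @codec = codec
  end

  def pack(command, args)
    @codec.dump("command" => command, "args" => args)
  end
end

# Another part: reinterprets the payload in the shape it needs.
class Receiver
  def initialize(codec)
    @codec = codec
  end

  def unpack(payload)
    message = @codec.load(payload)
    [message.fetch("command"), message.fetch("args")]
  end
end

class WireRelationshipTest < Minitest::Test
  # The same relationship is checked for every representation.
  [MarshalCodec, JsonCodec].each do |codec|
    define_method("test_receiver_understands_sender_over_#{codec.name}") do
      payload = Sender.new(codec).pack("resize", [640, 480])

      assert_equal ["resize", [640, 480]], Receiver.new(codec).unpack(payload)
    end
  end
end
```

Adding a third representation means adding one codec to the list. The assertion itself, which describes how the two parts relate, stays exactly as it is.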
Jeff continues to say:
Yeah, it’s a balancing act. And I don’t want to come out and say I’m against [unit] testing, because I’m really not. Anything that improves quality is good. But there’s multiple axes you’re working on here; quality is just one axis. And I find, sadly, to be completely honest with everybody listening, quality really doesn’t matter that much, in the big scheme of things…
This is something that made me rethink the entire context of the discussion. I’m really surprised that somebody who considers Peopleware and The Mythical Man-Month basic references for programmers would say something like that. Both books have entire discussions about quality being the focus of robust code that can be delivered in less time and that can add more value to business and users. Saying that quality is just one axis is the same as saying that good enough is enough, even if you have to throw it away later and start all over again because you couldn’t be bothered to design your architecture in a better way.
To sum up, TDD, or testing in general, is not an end in itself. But the argument that using tests is an ideological waste of time fails when one considers how tests can help to safeguard architectural decisions.
Joel is well known for his pragmatic approach to bug fixing. Tests are a programmatic way to ensure that a given set of conditions won’t trigger the same flaw in your application again. That’s the business value, in hours saved, that Joel and Jeff are talking about.
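In practice that usually takes the form of a regression test that pins the exact input from a bug report. The function and the bug below are imaginary, made up only to show the shape of such a test.

```ruby
require "minitest/autorun"

# Hypothetical function and bug, invented for this sketch: an earlier
# version used `text.to_i`, which reads "1,250" as 1.
def parse_quantity(text)
  Integer(text.delete(","), 10)
end

class ParseQuantityRegressionTest < Minitest::Test
  # Pins the exact input from the (imaginary) bug report.
  def test_thousands_separator_is_not_truncated
    assert_equal 1250, parse_quantity("1,250")
  end
end
```

Once this test exists, the same flaw cannot come back unnoticed, and nobody has to spend time rediscovering it by hand.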
At the end of the day, pragmatism is what really counts. And tests, when done right, are some of the most pragmatic tools a programmer has in his arsenal.