Testability Metrics, Revisited

The response to my earlier post on testability metrics has been, well, a little negative. I think that the strident tone of the conversation (for which I bear much of the blame) has artificially widened the gap of opinion between me and some people I respect.

So I am going to start again, and fill in the gaps from the earlier post. Along the way, I will answer the interesting questions raised in the comments.

Java is naturally untestable, and that ain't news

Let's start at the very beginning, with the idea that Java is naturally untestable. What does the word natural mean here? It means that if you use Java naturally, taking advantage of the various features of the language when they are appropriate to the task at hand, you will end up with untestable code. This is not news to anybody, least of all experienced Java programmers. Here are a few quotes from the Java community. First, from the Testability Explorer home page:

The metrics are a calculation of the skill of the development team in making their classes testable.

From the Spring Framework mission statement:

Testability is essential, and a framework such as Spring should help make your code easier to test.

And from the Guice homepage:

You will still need to write factories in some cases, but your code will not depend directly on them. Your code will be easier to change, unit test and reuse in other contexts.

If I can paraphrase these quotes: Easily testable code doesn't just happen naturally in Java. A team must develop skill in testability. In fact, whole frameworks have been created that count testability among their primary objectives.

Java can be tested

Of course, many Java programmers can and do write testable code. As a Java programmer, I learned to avoid certain idioms that are not testable, or to rely on a framework to work around them for me. But this (in my view) isn't natural. I have made the point elsewhere that a language should be richly consistent. If "testability" is a core feature of the language, then testability should be combinable with all the other core features. There should not be a laundry list of special cases of the form: "Don't do X or you can't have Y."

In many other languages, testability is almost free!

The Testability Explorer defines testability in terms of two things that you should avoid:

unmockable complexity
global mutable state

As in the earlier post, I will focus exclusively on the first of these: unmockable complexity. (Global mutable state is a discussion for another day.) Also, I am not going to worry about the complexity part. I use cyclomatic complexity measures in both Java and Ruby code, and find them to be decent predictors of code that a human reviewer would find objectionably complex.

What remains, then, is the definition of unmockable. The Testability Explorer considers code unmockable if it cannot be overridden or injected, which in practice means (some subset of) constructors, statics, and privates.

By this definition, Ruby has no unmockable complexity:

Ruby's "statics" are polymorphic to begin with, so they can be overridden just like instance methods. (In fact, they are instance methods.)
Ruby constructors can be overridden or simply rewritten.
Ruby's reflection model will let you publicize private members when needed in a test.
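
Here is a minimal sketch of those three points. The Mailer class and everything in it are made up for illustration; they do not come from any real project, and this is only one of several ways to do each trick.

```ruby
# Hypothetical class, for illustration only.
class Mailer
  # A "static" in Java terms. In Ruby it is just a method on the Mailer object.
  def self.default_host
    "smtp.example.com"
  end

  def initialize
    @host = self.class.default_host
  end

  def deliver(message)
    "#{format_line(message)} via #{@host}"
  end

  private

  def format_line(message)
    "[mail] #{message}"
  end
end

# 1. Class methods are polymorphic, so a subclass overrides them like any method.
class TestMailer < Mailer
  def self.default_host
    "localhost"
  end
end

# 2. A constructor can be overridden in a subclass (or rewritten by reopening the class).
class OfflineMailer < Mailer
  def initialize
    @host = "nowhere"
  end
end

# 3. Reflection reaches private methods when a test needs them.
private_result = Mailer.new.send(:format_line, "hi")

via_class_method = TestMailer.new.deliver("hi")
via_constructor  = OfflineMailer.new.deliver("hi")
```

None of this needs a framework or any change to Mailer itself, which is the sense in which I am calling Ruby's complexity mockable.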

Several commenters asked for a patch so they could try out LOW CEREMONY mode for themselves. Easily done: set the unmockable complexity number to zero across the board. I did it by refactoring MethodInvocation.computeMetric.

What do the testability metrics really tell us?

When I figured out how Testability Explorer calculated its metrics, I knew immediately what it would "prove" -- it is one of those metrics whose outcome is driven by its assumptions. If unmockable complexity really is a primary enemy of testing, then the Testability Explorer will quickly "prove" that you should either

Use a Dependency Injection (DI) framework (in Java)
Use a low-ceremony language that makes mocking trivial

If you want to introduce Spring or Guice to a Java shop that still doesn't believe in DI, Testability Explorer is your friend. It will probably "prove" that your code base isn't testable. Or you can use it like I did in the original blog post to "prove" that Ruby projects are intrinsically testable.

What does the Testability Explorer actually prove? Not as much as one might hope. With the right tool support, the unmockable features of Java become mockable. And even a project that Testability Explorer colors green may still be hard to test. This is a problem with metrics in general. Once you understand what they measure, it is all too easy to game the metric, often unintentionally. My colleague Jason Rudolph is giving a talk called How to Fail with 100% Test Coverage that explores this problem.

At Relevance we include two metrics related to testability in our continuous integration builds: rcov test coverage and flog score. Test coverage doesn't measure ease of testing, but insisting on high coverage makes it painful to have complex code. Flog measures method complexity. Unlike Testability Explorer, we treat all complexity as a testability issue.

Java has better metrics, nya nya

Java beats Ruby in support for collecting both of these kinds of metrics. If you want to heckle a Rubyist in a language war, here's some ammunition:

Ruby's coverage tool only provides line coverage, while Java has tools like EMMA that can give you branch coverage within a single line.
Java-oriented complexity tools (like PMD) are much more elaborate than flog, and can provide all kinds of specific advice above and beyond a simple numeric score.
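
To make the first point concrete, here is a small sketch of how line coverage can hide an untested branch. The shipping_cost method is hypothetical and exists only for this example.

```ruby
# Hypothetical method, for illustration only.
def shipping_cost(weight)
  weight > 10 ? 25 : 5
end

# This one call executes every line of shipping_cost, so a line-coverage
# tool would report the method as fully covered...
light = shipping_cost(2)

# ...even though the heavy-package side of the ternary never ran.
# A branch-coverage tool would flag the missing case, which only a
# second call like this one exercises:
heavy = shipping_cost(11)
```

With only the first call in the test suite, a line-based report would look perfect and the 25 branch would be completely untested.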

A few more responses to commenters

To the heroic defenders of Ant: I didn't offer any of my own opinions on Ant. I merely reported the numbers from Testability Explorer, and the conclusions that project implies with its color scheme. The only reason I even picked Ant was that it came first on the page. :-)

To the attackers of JRuby: I agree that JRuby is hard to test. I have already looked into it, contributed code, blogged about it, and given conference talks on the subject. One interesting positive aspect of JRuby, which in my opinion dwarfs the negative metrics, is that the JRuby committers are almost always online in the JRuby IRC channel. They will answer your questions, and they are working to make the project better.

Finally: I am not a Ruby zealot. In ad hominem attacks, please refer to me as a Lisp weenie wannabe.