Testability Metrics, Revisited

Posted by Stuart Halloway on May 22, 2008

The response to my earlier post on testability metrics has been, well, a little negative. I think that the strident tone of the conversation (for which I bear much of the blame) has artificially widened the gap of opinion between me and some people I respect.

So I am going to start again, and fill in the gaps from the earlier post. Along the way, I will answer the interesting questions raised in the comments.

Java is naturally untestable, and that ain't news

Let's start at the very beginning, with the idea that Java is naturally untestable. What does the word natural mean here? That if you use Java naturally, taking advantage of the various features of the language when they are appropriate to the task at hand, you will end up with untestable code. This is not news to anybody, least of all experienced Java programmers. Here are a few quotes from the Java community. First, on the Testability Explorer home page:

The metrics are a calculation of the skill of the development team in making their classes testable.

From the Spring Framework mission statement:

Testability is essential, and a framework such as Spring should help make your code easier to test.

And from the Guice homepage:

You will still need to write factories in some cases, but your code will not depend directly on them. Your code will be easier to change, unit test and reuse in other contexts.

If I can paraphrase these quotes: Easily testable code doesn't just happen naturally in Java. A team must develop skill in testability. In fact whole frameworks have been created, counting testability among their primary objectives.

Java can be tested

Of course, many Java programmers can and do write testable code. As a Java programmer, I learned to avoid certain idioms that are not testable, or to use a framework to do it for me. But this (in my view) isn't natural. I have made the point elsewhere that a language should be richly consistent. If "testability" is a core feature of the language, then testability should be combinable with all the other core features. There should not be a laundry list of special cases of the form: "Don't do X or you can't have Y."

In many other languages, testability is almost free!

The Testability Explorer defines testability in terms of two things that you should avoid:

  • unmockable complexity
  • global mutable state

As in the earlier post, I will focus exclusively on the first of these: unmockable complexity. (Global mutable state is a discussion for another day.) Also, I am not going to worry about the complexity part. I use cyclomatic complexity measures in both Java and Ruby code, and find them to be decent predictors of code that a human reviewer would find objectionably complex.

What remains, then, is the definition of unmockable. The Testability Explorer considers code unmockable if it cannot be overridden or injected. In practice that covers (some subset of) constructors, statics, and privates.
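To make "cannot be overridden or injected" concrete, here is a minimal Java sketch. Every name in it (RateSource, LiveRateSource, HardWiredConverter, InjectedConverter) is hypothetical and chosen for illustration; none of it comes from Testability Explorer itself. The first converter builds its collaborator inside the class, so a test has no way to swap it out. The second accepts the collaborator through its constructor, so a test can hand it a stub.

```java
class ConverterDemo {
    public static void main(String[] args) {
        // RateSource has a single method, so a lambda can serve as a stub.
        InjectedConverter converter = new InjectedConverter(currency -> 2.0);
        System.out.println(converter.toUsd(10.0, "EUR")); // prints 20.0
    }
}

// Hypothetical collaborator interface.
interface RateSource {
    double rateFor(String currency);
}

// Stand-in for a collaborator that would need the network in production.
class LiveRateSource implements RateSource {
    public double rateFor(String currency) {
        throw new IllegalStateException("no network available in a unit test");
    }
}

// Hard to test: the collaborator is created inside the class,
// so a test has no seam through which to substitute a fake.
class HardWiredConverter {
    private final RateSource rates = new LiveRateSource();

    double toUsd(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

// Easier to test: the collaborator is injected by the caller.
class InjectedConverter {
    private final RateSource rates;

    InjectedConverter(RateSource rates) {
        this.rates = rates;
    }

    double toUsd(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}
```

The two converters contain identical logic. Only the way the dependency arrives differs, and that difference is what decides whether the complexity behind it counts as mockable.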

By this definition, Ruby has no unmockable complexity:

  • Ruby's "statics" are polymorphic to begin with, so they can be overridden just like instance methods. (In fact, they are instance methods.)
  • Ruby constructors can be overridden or simply rewritten.
  • Ruby's reflection model will let you publicize private members when needed in a test.
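For contrast with the last bullet, reaching a private member from a Java test goes through java.lang.reflect. The sketch below uses a hypothetical Invoice class; the reflection calls themselves (getDeclaredMethod, setAccessible, invoke) are standard JDK API.

```java
import java.lang.reflect.Method;

class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        Invoice invoice = new Invoice();
        // Look up the private method by name and parameter types.
        Method m = Invoice.class.getDeclaredMethod("applyDiscount", int.class);
        // Without this call, invoke() throws IllegalAccessException.
        m.setAccessible(true);
        int result = (Integer) m.invoke(invoice, 200);
        System.out.println(result); // prints 180
    }
}

// Hypothetical class with logic hidden behind a private method.
class Invoice {
    private int applyDiscount(int cents) {
        return cents - cents / 10;
    }
}
```

It works, but the method name is a string, the arguments are boxed, and the compiler checks none of it.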

Several commenters asked for a patch so they could try out LOW CEREMONY mode for themselves. Easily done: set the unmockable complexity number to zero across the board. I did it by refactoring MethodInvocation.computeMetric.

What do the testability metrics really tell us?

When I figured out how Testability Explorer calculated its metrics, I knew immediately what it would "prove" -- it is one of those metrics whose outcome is driven by its assumptions. If unmockable complexity really is a primary enemy of testing, then the Testability Explorer will quickly "prove" that you should either

  1. Use a Dependency Injection (DI) framework (in Java)
  2. Use a low-ceremony language that makes mocking trivial

If you want to introduce Spring or Guice to a Java shop that still doesn't believe in DI, Testability Explorer is your friend. It will probably "prove" that your code base isn't testable. Or you can use it like I did in the original blog post to "prove" that Ruby projects are intrinsically testable.

What does the Testability Explorer actually prove? Not as much as one might hope. With the right tool support, the unmockable features of Java become mockable. And even a green project may still be hard to test. This is a problem with metrics in general. Once you understand what they measure, it is all too easy to game the metric, often unintentionally. My colleague Jason Rudolph is giving a talk called How to Fail with 100% Test Coverage that explores this problem.

At Relevance we include two metrics related to testability in our continuous integration builds: rcov test coverage and flog score. Test coverage doesn't measure ease of testing, but insisting on high coverage makes it painful to have complex code. Flog measures method complexity. Unlike Testability Explorer, we treat all complexity as a testability issue.

Java has better metrics, nya nya

Java beats Ruby in support for collecting both of these kinds of metrics. If you want to heckle a Rubyist in a language war, here's some ammunition:

  • Ruby's coverage tool only provides line coverage, while Java has tools like EMMA that can give you branch coverage within a single line.
  • Java-oriented complexity tools (like pmd) are much more elaborate than flog, and can provide all kinds of specific advice above and beyond a simple numeric score.
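The line-versus-branch distinction in the first bullet is easy to see in a small example. This is a sketch with a hypothetical shipping method, not output from any particular coverage tool: one line of source holds two conditions and two outcomes, and a single test call touches the line without exercising all of them.

```java
class CoverageDemo {
    // One line of source, but two conditions and two possible results.
    static String shipping(int total, boolean member) {
        return (member || total >= 50) ? "free" : "paid";
    }

    public static void main(String[] args) {
        // This single call executes every line of shipping(), so a
        // line-coverage report would show the method as fully covered...
        System.out.println(shipping(10, true));   // prints free
        // ...even though the total check and the "paid" result never ran.
        // Branch-level coverage would flag that; these calls close the gap.
        System.out.println(shipping(10, false));  // prints paid
        System.out.println(shipping(50, false));  // prints free
    }
}
```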

A few more responses to commenters

To the heroic defenders of Ant: I didn't offer any of my own opinions on Ant. I merely reported the numbers from Testability Explorer, and the conclusions that project implies with its color scheme. The only reason I even picked Ant was that it came first on the page. :-)

To the attackers of JRuby: I agree that JRuby is hard to test. I have already looked into it, contributed code, blogged about it, and given conference talks on the subject. One interesting positive aspect of JRuby, which in my opinion dwarfs the negative metrics, is that the JRuby committers are almost always online in the JRuby IRC. They will answer your questions, and they are working to make the project better.

Finally: I am not a Ruby zealot. In ad hominem attacks, please refer to me as a Lisp weenie wannabe.

Comments
  1. Gabriel C, May 22, 2008 @ 09:12 PM

    I think you’re limiting the scope of “testability” too much. Any software that stops and produces some effect can be tested…

  2. Jonathan Allrn, May 22, 2008 @ 09:27 PM

    What is with all this obsession with mocks?

    Formal testing has to be against real code, not a bunch of stubs. Mocks may be a useful fall-back, but they don’t replace real end-to-end testing against the real program.

  3. Andres Almiray, May 24, 2008 @ 01:12 AM

    Stu, thanks for expanding on the subject. Still, I’m not truly convinced that Java the language is untestable by itself; rather, the conventions and “standards” used by developers make it hard (real hard) to test. If all your classes and fields were public, how much trouble would it be to mock and inject everything? Once we delve into encapsulation the code gets more verbose, but it is still testable. So perhaps the problem with Java is that it takes too much code to test a simple thing (here I completely agree with the low ceremony argument), so that you have to resort to DI testing.

    Another example would be Ruby’s message system: you just send a message to an object hoping it understands it, with no need to define the method in a binding contract (interfaces), and most of the time the object responds as expected. Java can’t do that (easily), especially when you try to reach a non-public method; reflection tricks come into play and it gets messy.

    Granted, I’m a Groovy developer, but the following also applies to JRuby and perhaps Jython: testing Java code with Groovy is much easier than testing it with plain Java. That is not only because Groovy has a lower ceremony factor than Java and integrates well with it; GPath, operator overloading, the MOP, property access, and dynamic proxies also help.

    That being said, testing Java with Java is not always an easy task, and testing Java with [Groovy|JRuby|Jython] relieves you of a lot of the burden. I believe that is roughly the message you tried to send, rather than radically switching to NON-Java as other posters angrily suggested.