Prediction is very difficult....

Niels Bohr: "Prediction is very difficult, especially about the future."

During New Year's week I was watching TV and happened to switch over to the History Channel. It was showing something about the Bible code. More specifically, the title of the show was The Bible Code: Predicting Armageddon. Don't ask me why the History Channel decided to show something apocalyptic during New Year's week. I guess most people are happy to associate impending doom with the New Year instead of blooming hope.

A slight tangent before I get to the gist of this post....

For those unfamiliar with the Bible Code, it's a book postulating that the Hebrew Bible contains hidden messages in the form of a code (the Bible code) that hold predictions about the future. The exact details of the postulated cipher can be found on its Wikipedia page.
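
As an aside, the cipher the book describes is based on equidistant letter sequences (ELS): pick a starting letter in the text and read every nth letter after it. Here is a toy sketch of that idea in Python; the sample text and the start/skip values are made-up placeholders, not anything from the book.

```python
# A toy sketch of the equidistant letter sequence (ELS) idea: starting at
# some index, read every `skip`-th letter. All values here are made up.

def els(text: str, start: int, skip: int, length: int) -> str:
    """Read `length` letters from `text`, beginning at `start` and
    stepping `skip` letters at a time (spaces/punctuation ignored)."""
    letters = [c for c in text.lower() if c.isalpha()]
    return "".join(letters[start + i * skip] for i in range(length))

sample = "whoever searches long enough will surely find some pattern here"
print(els(sample, start=0, skip=7, length=5))
```

With a long enough text and the freedom to choose any start and skip, stumbling on short "hidden" words is almost inevitable, which is exactly what makes the code so seductive to a biased searcher.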

Anyway, by using the Bible code, the authors claim to be able to find records of all the major historical events that have transpired, including the two World Wars, the Holocaust, the assassinations of prominent figures, etc. They concluded that there was "strong statistical evidence" that such encodings could not simply be random.

Interesting. So the Bible code actually encodes all the events that have happened. Could it then be deciphered so we could use it to predict events that have yet to transpire? Sure. But there's a catch: we won't actually know how to look for those predictions. It's easy to look for things that have already happened because we have clues and keywords to search for in the code. But for predicting the future, we have no idea what to look for. Catch-22.

And that, to me, is a prime example of confirmation bias. The Wikipedia article illustrates this nicely with the 2-4-6 problem: told that the triple 2-4-6 satisfies a hidden rule, people keep proposing triples that fit the rule they have guessed instead of triples that could falsify it. We only look for what we seek to discover in the first place, and we conveniently ignore what we don't want to discover (or don't yet know about). We conduct experiments and case studies, but all too often we interpret the results to suit what we wanted to verify.

All right, back to the gist of this post. I wrote this post with a focus on TDD: test-driven development. TDD is one of the more controversial practices in agile software development today, and it is also one of the most misunderstood.

In Aim, Fire, Kent Beck says:

"Test-first coding isn't testing."

It's more about design. Writing tests first forces the developer to think about the design of the different units. Each unit should be designed so that it can be unit tested easily, preferably in isolation from other units.
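
To make that design angle concrete, here is a minimal test-first sketch in Python (the `orders` module and `order_total` function are hypothetical names, not anyone's real API). The test is written before the unit exists, which forces a decision about the unit's interface up-front.

```python
import unittest

# Test-first: this test is written before `orders.order_total` exists.
# Writing it first forces a design decision -- the total is computed by a
# pure function with an explicit interface, testable with no database or UI.

class TestOrderTotal(unittest.TestCase):
    def test_total_is_price_times_quantity(self):
        from orders import order_total  # fails ("red") until the unit is written
        self.assertEqual(order_total(unit_price=5, quantity=3), 15)

if __name__ == "__main__":
    unittest.main()
```

Only after the test exists do you write the simplest `order_total` that makes it pass ("green"), and then refactor.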

I'll be honest and say that the first time I heard about TDD, I didn't grasp this fundamental concept. Instead, I too thought that it was all about writing your tests up-front. And, initially, I wasn't very keen on the idea. I believe that adequate testing is definitely useful, but I wasn't really convinced that we needed to test first. Wasn't it just as useful to write the tests slightly later, after the initial design, so that they actually have a chance of, erm..., passing?

So I used to read papers studying the success of TDD with my own confirmation bias. I always looked out for little things the authors had missed that could invalidate their claims about the success of TDD. They weren't hard to find, since it was impossible to do a foolproof study of TDD in any actual environment.

But here's the interesting part. Now that I am more in favor of TDD, those little things still make me skeptical about how useful TDD is (especially when the authors forget that TDD isn't just about testing first!). The case studies aren't conclusive enough to help me predict whether using TDD is a requirement for good software. N.B. Evaluations on small projects aren't particularly helpful either, because a small project is likely to succeed even without a proper process.

Sure, TDD's proponents are still enamored of it. But the views of its opponents (maybe that is too strong a word) cannot be ignored either.

Some of the most important things about writing software include delivering the product to the client on time, ensuring that the product has good quality, ensuring that it fulfills the requirements, and ensuring that the code stays maintainable for subsequent releases.

And right now, we don't have strong evidence that TDD is essential to accomplishing those tasks. There are teams that do not do TDD (whether for design or testing) and yet produce exceptional code. There are teams that start off being gung-ho about TDD and stop doing it halfway because they run into problems. So what does following TDD actually tell us?

And it's not just about TDD. What about things such as refactoring, pair programming, and all the other pillars of agile development? Or what about all the latest trends in software development, such as SOA and cloud computing?

We still don't have a good way to evaluate such things other than to try them out. Trying something out isn't a bad thing, but some of these practices cost time and money and could be prohibitively expensive to try on a whim. And while some would justify it as paying the cost up-front instead of later, during the maintenance stage, no one actually knows for sure whether the cost is worth it. And after trying it out, unless we do proper experiments we can't measure the actual merit of that technique. Without proper data, we are inclined to make skewed predictions about our ability to replicate in our other projects the success we had in one.

And when you cannot actually verify those claims, you run into the danger of herd mentality, religious debates, and zealotry. And when something new comes along, you either obstinately stick to your old practices or apostatize and switch over to the newer paradigm.

There needs to be more research on how to effectively measure[1] the effects of a software development technique. It could be extremely hard to do, or even impossible. But without proper studies, we only have our gut instincts to rely on, and that is no better than flipping a coin and letting it decide which software practices to follow....

Now, I like all the agile development practices. I find that they make me feel more productive, and they give me better confidence that I am writing good code. But is that enough of a measurement of how useful a practice is?


Here are some of the TDD papers that I have read that some might find interesting:

[1] HCI actually has a good set of evaluation techniques used to evaluate user interfaces. Perhaps we need to develop a similar set of evaluation techniques for Software Engineering.
