Archive for April, 2006

Producing and maintaining high-quality code

Wednesday, April 26th, 2006

So much bad and ugly code is being spewed out every day all over this planet that entire websites are dedicated to the preservation of the worst specimens. In this post, I will show you simple ways to produce and maintain high-quality code because, firstly, this is one way of achieving Internet stardom that you probably want to avoid and secondly and more importantly, you know how painful it is to have to maintain bad code.

I will start with a disclaimer, in Douglas Adams’s words:

The problem with designing something completely foolproof is to underestimate the ingenuity of a complete fool.

You can follow all the tips in this post and still end up with bad code, but you would have to be a very crafty fool!

1. Understand the problem

If this sounds like a duh-ism, that is because it is one. If you do not understand what the code is supposed to do, you will not get it right. Before you start coding, whether you face a blank sheet or whether you are maintaining code, sit down and think: what problem am I trying to solve here?

2. Write unit-tests

I could almost have numbered it 1bis. Unit tests show and validate how the code is supposed to be used. Write the tests first if possible — unless you can get your hands on a time-machine, this is not an option when doing some maintenance. The quality of your interfaces will improve dramatically because you will design them from their user’s point of view.
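To make the test-first idea concrete, here is a minimal sketch using Python's standard unittest module. The function slugify and the behaviour expected of it are hypothetical, invented purely for illustration. The point is only that the tests read as a description of how a caller wants to use the code.

```python
import unittest


def slugify(title):
    """Turn a post title into a URL-friendly slug (hypothetical example)."""
    # Keep letters and digits, treat everything else as a word separator.
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return "-".join(cleaned.split())


class SlugifyTest(unittest.TestCase):
    # Each test documents one expectation a caller may rely on.

    def test_lowercases_and_joins_words_with_dashes(self):
        self.assertEqual(slugify("Write Less Code"), "write-less-code")

    def test_ignores_punctuation(self):
        self.assertEqual(slugify("Performance is overrated!"),
                         "performance-is-overrated")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() runs the three checks. When working test-first, the three test methods would exist before the body of slugify does.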

3. Talk to your cardboard friend

How many times have you been stuck on a problem, decided to seek help from a nearby colleague, explained the situation and bingo! The solution became obvious. Your colleague had not even had time to utter a word and probably did not even understand the question. Articulating the problem solved it. My nearby colleague calls this the “cardboard friend” because people drawn on cardboard tend to have the property of listening without interrupting.

4. Enforce a coding standard

I previously wrote about this but just in case you missed it: you need a coding standard to make your code homogeneous. Reading code is hard enough; you do not need the distraction of multiple coding styles. Once you have a coding standard, use tools to enforce it. In Java, Checkstyle rocks and if you use Eclipse, be sure to check out this plugin. There are similar tools for most languages: use them! At first, configure them to be a complete and utter pain, and then relax the rules you believe are damaging.

5. Use code analysis tools

Use any tool you can find to study and analyze your code. If the language you are using is compiled, turn on all warnings on the compiler: use -Wall when you use gcc, open the preferences in Eclipse and activate most warnings for the Java compiler (be pragmatic). Use any lint-clone you can put your hands on for your language. Here are the tools I recommend for Java: JDepend to measure the quality of my abstractions and to avoid cyclical dependencies, PMD to identify potential bugs and potential optimizations, FindBugs to (coincidentally) find bugs. I should mention Checkstyle again since it can also detect bug patterns. The situation is simple: the more code analysis tools you run on your code the more you will be able to fix problems before they bite you.

6. Use a continuous integration system

I will unleash my secret finishing move on the next guy who tells me: “but it works on my machine”. A continuous integration system is a referee. If the ref says the build fails because it is broken or because some test fails, it’s because it is. So, go fix it. The ref is always right even when “it works on my machine”. How does it do that? At regular intervals, it tries to build the latest version of your code and if anything goes wrong, it fires notifications (generally emails) to whoever is concerned. Receiving these notifications is not such bad news: you have identified a problem in your code. Use your continuous integration system to run the code analysis tools and fail the build if these tools report problems. You can use CruiseControl, AntHill, DamageControl or write your own with cron+make+sendmail, but please use a continuous integration system.
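For the write-your-own route, the referee's core loop can be sketched in a few lines of Python. This is only an illustration under assumptions: the names run_step and integrate, the step list and the notify callback are all hypothetical, and a real setup would still need something like cron to trigger it and a mailer behind notify.

```python
import subprocess


def run_step(command):
    """Run one build step; return (succeeded, combined output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def integrate(steps, notify, run=run_step):
    """Run every step in order and stop at the first failure.

    steps  -- list of (name, command) pairs, e.g. ("tests", ["make", "test"])
    notify -- callable receiving a message when the build breaks
    run    -- callable executing a command (injectable, which helps testing)
    """
    for name, command in steps:
        succeeded, output = run(command)
        if not succeeded:
            # The ref has spoken: report the failing step and stop.
            notify("Build broken at step '%s':\n%s" % (name, output))
            return False
    return True
```

A scheduled job could call integrate with steps such as updating the working copy, building, running the tests and running the analysis tools, so that a failure in any of them breaks the build.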

7. Write less code

Aspire to write code like a Zen master writes a koan:

Monk: Does a dog have Buddha nature or not?
Zhaozhou: Wú.

There are two principles to assist you in your quest for minimalist code: KISS and DRY. “Keep It Simple Stupid”: do what is required and no more. Focus on a clean and simple implementation. “Don’t Repeat Yourself”: avoid duplicating code. Refactor mercilessly. This will decrease the size of your code-base, which is a Good-Thing™ because the less code you write, the fewer bugs you write.
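As a small illustration of DRY, here is a before-and-after sketch in Python. The functions and the validation rule are hypothetical, chosen only to show the shape of the refactoring: the duplicated check moves into one helper, so there is a single place left to fix or change.

```python
# Before: the same validation rule is written out twice.
def register_user_repetitive(name, email):
    if not name or not name.strip():
        raise ValueError("name must not be blank")
    if not email or not email.strip():
        raise ValueError("email must not be blank")
    return {"name": name.strip(), "email": email.strip()}


# After: the rule lives in one place.
def require_text(label, value):
    """Return the stripped value, or raise if it is missing or blank."""
    if not value or not value.strip():
        raise ValueError("%s must not be blank" % label)
    return value.strip()


def register_user(name, email):
    return {"name": require_text("name", name),
            "email": require_text("email", email)}
```

Both versions behave the same for callers. The second simply has less code in which a bug can hide.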

8. Performance is overrated

I will probably have a contract on my head after this point, but that’s what blogging is about, no? Writing fast code is a two-step process:

  1. get the code to work,
  2. stop.

In case this was unclear: do not optimize the code. In the vast majority of cases, the code you are about to optimize is not a bottleneck. Use a profiler on a system running realistic data and identify the bottlenecks. Only then can you start working out how to eliminate them. Very often you will not have to change a single interface, only the implementation will require changes.
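Here is a minimal sketch of that workflow using Python's standard cProfile and pstats modules. The two deduplication functions and the profile helper are hypothetical examples. They are meant to show that the profiler report points at the slow implementation, and that the fix changes the implementation while leaving the interface alone.

```python
import cProfile
import io
import pstats


def slow_unique(items):
    """Deduplicate while keeping order; the list lookup makes it quadratic."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def fast_unique(items):
    """Same interface and result, with a set for constant-time lookups."""
    seen = set()
    ordered = []
    for item in items:
        if item not in seen:
            seen.add(item)
            ordered.append(item)
    return ordered


def profile(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # five costliest entries
    return result, buffer.getvalue()
```

Running profile(slow_unique, data) on realistic data gives a report to read before deciding whether fast_unique is worth writing at all.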

9. Read, read, read

Read books about software development. You must read the GoF book about Design Patterns and put it to good use. You must pick up Martin Fowler’s Refactoring and Code Complete by Steve McConnell. There are countless others. One of the first books to really open my eyes on developing good code was Writing Solid Code by Steve Maguire.
Read books about other programming languages too: you will frequently discover programming concepts that you might be able to port to your environment.

10. Use common sense

Common sense is your best friend to beat bad code and it’s free! Common sense is what dictates the previous nine points.

I hope you found these tips interesting and entertaining too! I am looking forward to your comments. I leave you with Zhaozhou to conclude this post: Wú!


Behaviour Driven Development with RSpec

Monday, April 24th, 2006

Back in February, I told you about Behaviour Driven Development using RSpec. Things have moved in very interesting directions since then.

First, and this is probably the most important, BDD is getting quite some attention, most notably at Canada on Rails. You should check out the Ruby on Rails podcast episode from the 17th of April, where Dave Astels makes an appearance alongside Tim Bray.

On the usage front, RSpec’s DSL has seen some tremendous progress. It reads like plain English. I have not decided yet whether I find it too wordy or simply astonishing! What do you think of this example?

Finally, RSpec now has a good-looking content-packed website complete with documentation, tutorial, examples and all.

There is simply no barrier to its adoption; it just needs traction within the development community. This is where you and I come into the picture. Let’s make RSpec a success, let’s all give it a go: gem install rspec.

You will be amazed at how much RSpec will impact the quality of your code because it changes your perspective on unit testing.

Comments are welcome and encouraged! That’s all folks, see you next time!

Quantum Computing and “Singes Dactylographes”

Thursday, April 20th, 2006

As a popular science anorak, I listen to several scientific podcasts geared towards enthusiasts. This morning, while catching up on the Science Friday Podcast, I had the great pleasure to listen to Ira Flatow interview Seth Lloyd on quantum computing. The dialogue is witty, thought-provoking and easily accessible. Seth Lloyd, who jokingly describes himself as a quantum mechanic, excels at popularizing this complex and fascinating topic. I hope not to betray his ideas in what follows.

In man-made quantum computers, we coax atoms and elementary particles to perform specific computations. Seth Lloyd’s argument is that the universe is a quantum computer. Indeed, every atom and elementary particle carries bits of information that can flip following collisions. In effect, the universe is already computing. This reminded me of an excellent book by David Deutsch: “The Fabric of Reality”. If my memory serves me well, he too considers the universe (or should I say multiverse) to be a quantum computer. What made me think of this book is Lloyd’s reference to Borel’s “singes dactylographes” towards the end of the interview.

In 1913, Émile Borel, a French mathematician, illustrated how unlikely the occurrence of a particular event is (to be precise: how unlikely it is to observe a noticeable difference from what is the most likely outcome of statistical mechanics) by comparing it with the likelihood that a million monkeys using a million typewriters ten hours a day for a year would produce an exact copy of each and every book in the world’s most famous libraries. I should point out that this quote is also attributed to Aldous Huxley, the English writer (he would have been only 19 in 1913, so I tend to favour Borel as the originator). Check out the Parable of the Monkeys for a rather long list of related quotes.
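To get a feel for the orders of magnitude, here is a back-of-the-envelope sketch in Python. It is not Borel's own calculation. It assumes a hypothetical typewriter with 26 equally likely keys and independent keystrokes, and it ignores spaces, punctuation and capital letters.

```python
from fractions import Fraction


def chance_of_typing(text, keys=26):
    """Probability that len(text) uniformly random keystrokes spell text."""
    return Fraction(1, keys) ** len(text)


def expected_attempts(text, keys=26):
    """Average number of independent attempts before one succeeds."""
    return 1 / chance_of_typing(text, keys)
```

Under these assumptions, expected_attempts("hamlet") is 26 to the power of 6, a little over 300 million tries for the six letters of the title alone. Every extra character multiplies that figure by 26, which is why whole libraries are out of reach.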

Seth Lloyd points out that if the monkeys were to use a quantum computer, the result would be far more likely:

if you take a computer, we have a computer because the universe is computing and it’s a quantum computer, and if you take a bunch of monkeys, here the monkeys are these little tiny quantum fluctuations that tell the universe to do this or that. […] These little accidents program the universe and it’s this process of programming the universe with quantum fluctuations that gives rise to the computation we see around us which produces all sorts of complexity, and structure, and beautiful things, and horrible things, and most of all amazing things.

While listening to this, I could not stop thinking we have already run the monkey experiment. It turns out that monkeys do write the collected works of Shakespeare rather quickly. What takes a lot of time is waiting for the monkeys to self-assemble. Indeed, it took roughly 14,000 million years for the quantum computer we are part of to come up with a proto-monkey. It took a further 5 million years for this proto-monkey to evolve into William Shakespeare and approximately 36 years for Shakespeare to write Hamlet. Typewriters turn out to be the hardest thing to get as they only appeared 300 years later.

Rounding up, 100% of the time is spent waiting for the monkey, 0% waiting for the books to be written.

The Panda Principle

Wednesday, April 5th, 2006

A few years back, I read “The Collapse of Chaos: Discovering Simplicity in a Complex World” by Jack Cohen and Ian Stewart. As a popular science addict, I thoroughly enjoyed this book. I particularly remember it for introducing me to Stephen Jay Gould’s Panda Principle. All of us have seen the Panda Principle in action, yet few of us know of it. The following lines will, I hope, shed some light on this principle. I will start by explaining how it came to be discovered. Then I will look at a couple of typical examples to show that it reaches beyond biology. Finally I will explain why the Panda Principle is relevant in the context of this blog, because for a couple of paragraphs, you will wonder.

The Panda’s Thumb: Survival of the Fit-ish

In “On the Origin of Species”, Charles Darwin popularized the concept of natural selection. Here it is, summed up by the man himself:

I have called this principle, by which each slight variation, if useful, is preserved, by the term of Natural Selection.

Contrary to popular belief, Darwin did not coin the phrase “Survival of the Fittest”; Herbert Spencer did when he applied Darwin’s theory to human societies. This phrase suggests that only the crème de la crème thrives, the rest being condemned to extinction.

Enter the panda.

Check out this thumb! How could such a poorly formed appendage be the result of millions of years of natural selection? Should it not be the best possible thumb to manipulate bamboo shoots? Stephen Jay Gould answers this question by observing that pandas might have evolved better thumbs but their current thumb is good enough (and sufficiently widespread among the panda population) to prevent improved thumbs from being a selective advantage. He named this the Panda Principle. As Cohen and Stewart put it:

Evolution leads to the occupation of niches by locally optimal creatures, not by globally optimal ones.

In effect, once locally optimal creatures get a foothold they prevent globally optimal ones from occupying the same niche.
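Programmers meet the same effect in optimization code. The following Python sketch is a plain greedy hill climb over a made-up fitness landscape (the function name and the numbers are hypothetical, and this is an analogy, not a model of evolution). It shows how a search that only accepts improvements settles on the nearest hill, whatever taller hills exist elsewhere.

```python
def hill_climb(fitness, start):
    """Move to the fitter neighbour until no neighbour is fitter."""
    position = start
    while True:
        neighbours = [p for p in (position - 1, position + 1)
                      if 0 <= p < len(fitness)]
        if not neighbours:
            return position  # single-point landscape: nowhere to go
        best = max(neighbours, key=lambda p: fitness[p])
        if fitness[best] <= fitness[position]:
            return position  # no single step improves: a local optimum
        position = best


# Hypothetical landscape: a small hill at index 2, a taller one at index 7.
LANDSCAPE = [1, 3, 5, 4, 2, 3, 6, 9, 7]
```

Starting from index 0, the climb stops at index 2 even though index 7 is fitter, because every single step away from index 2 leads downhill. Only a start on the far side of the valley, such as index 5, reaches the taller hill.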

The Panda Principle in action

The Panda Principle manifests itself in a wide range of domains. Stephen Jay Gould uses the example of keyboard layouts. Christopher Sholes, an early typewriter inventor, realized that his original design suffered from one problem: fast typists would frequently jam the type-bars. He had to slow them down. He came up with the QWERTY layout by trial and error in 1873. It is fair to say that QWERTY is a slow-ptimized layout! That hardly makes it the most efficient design for today’s usage where RSI sufferers’ fingers are the only thing that can actually jam.

In the 1930s, Dr. August Dvorak found that virtually any random layout leads to faster typing speed than the QWERTY layout: Sholes had worked hard to reach this (global) minimum. Dvorak invented the Dvorak layout. I am sure the amazing and unlikely coincidence is not lost on you. Throughout his life Dvorak fought unsuccessfully to get his layout adopted. He died a bitter man in 1975. Sadly, the man had a point: in every single comparative study, the Dvorak layout proved faster, more comfortable and less error-prone. There is still a community of Dvorak layout users, most of them programmers who have decided to use the fittest layout. But by all accounts, the QWERTY specimen is here to stay. It might well be the least fit specimen in the keyboard-layout-space but it will not be replaced any time soon. A perfect example of the Panda Principle in action.

At first, I thought the Betamax v. VHS war was another manifestation of the principle. I had it all typed and it looked promising: “despite Betamax being superior to VHS in technical terms (and being the Simpsons’ favourite videotape format) it lost the war”. Then I discovered that the original Betamax tapes were only one hour long, whereas VHS would stretch up to 3 hours. In hindsight, this is a huge argument against Betamax’s fitness. As it turned out, in the videotape space, fitness is strongly correlated with the record-ability of Hollywood movies. Hardly a good example to illustrate the Panda Principle.

Panda sighting #1: Smalltalk

Now, you must be wondering where I am going with this principle and what the link with this blog is. Well, I have recently stumbled upon two manifestations, or so I believe, of the Panda Principle in the realm of software development.

The first one was a couple of weeks ago when I decided to pick up a Smalltalk book from Stéphane Ducasse’s collection and learn the language. Smalltalk is a fascinating and beautiful language. It was destined to be the next best thing. Then came Java. We all know what happened next.
The more I read about or play with Smalltalk, the more I wonder how Java won the match. It seems to me that Smalltalk had everything Java does and much more (yes, I am once again talking about metaprogramming, but there are so many other aspects). The two most common explanations for Java’s dominance are:

  • Java was an incremental step from C++, whereas Smalltalk was influenced by the far less popular Simula and Sketchpad. It was therefore easier for Java to gather momentum as it appealed to a larger audience.
  • Sun made the JDK available for free. Smalltalk development kits were expensive and more often than not incompatible.

Both arguments seem to forget that at the beginning, Sun envisaged Applets as being Java’s spearhead. Ironically, Java only really took off when people started using it on the server side.

What surprises me is the impressive list of contributions the Smalltalk community made to the Java community. Take Eclipse, arguably one of the most successful open-source Java projects. Eclipse was originally derived from IBM’s VisualAge Micro Edition. VisualAge, up to the Micro Edition release, was Smalltalk based. As Joe Winchester puts it:

From a technology point of view Eclipse has a lot of Smalltalk DNA in it.

Take the JUnit framework, written by Erich Gamma and Kent Beck. It has inspired countless testing frameworks in virtually any programming language. Well, JUnit actually had a precursor: SUnit, also written by Kent Beck.

Martin Fowler’s classic “Refactoring: Improving the Design of Existing Code”, although not a pure Java book, is on the bookshelf of most Java programmers (go buy it if it is not!). The foundations of this book lie in Smalltalk, as Fowler acknowledges in the preface.

It is hard to conceive that a language so ahead of its time and yet so simple failed to overcome the challenge of what seems a less evolved language. Moreover, Smalltalk had been preparing since 1971 for the advent of OOP’s widespread acceptance. Java’s inception, on the other hand, only dates from 1990. When C++’s dominance came to pass, how did the fittest lose to the slightly fitter? Is this another manifestation of the Panda Principle?

Panda sighting #2: Mac OS X

I have always been curious about Operating Systems and more particularly OSes for personal computers. I still remember my Pentium-75 triple-booting OS/2 Warp, Windows 95 and Linux. For quite a few years I have been a proponent of Linux and FreeBSD. I rarely use Windows: I get rashes and spots when exposed to it for more than a couple of minutes.

But lately, through the Ruby and Io communities, I have been frequenting a bad bunch: Mac users. I heard so much about the Mac that I decided to have a closer look. I looked a bit too close. Here I am, the proud owner of a brand new iMac. I have been using it for less than a week and I have no doubt: this is the best machine I have ever had the pleasure to work with. My wife used to complain about my excessive usage of computers. Now, we are almost fighting about whose turn it is to use the Mac. I thought it would take me a while to get used to the Mac idiosyncrasies. After only a couple of hours, I feel pretty comfortable with the interface. Not only is it gorgeous, it is also far more comfortable and consistent than most systems I have used in the past.

A typical example of consistency is the key bindings. No matter what application I use, I know command-F will let me search whatever this application manipulates. Compare that with Windows: ctrl-F is used by most applications but not all. See for yourself, try it in IE. It actually opens a side panel to let you search the web. If you look in the main menu, you will discover that ctrl-F is indeed the binding for searching the current page… and sometimes it does (let me know if there is a Cartesian explanation for this infuriating behaviour). The problem also exists in Gnome and KDE. It tends to be related to non-native applications like X-based ones.

But let’s go back to the Panda Principle. Nobody can dispute Windows’ dominance in terms of market share. It is however hard to fathom how Windows could be considered the fittest OS for personal computers. I am talking fitness in the market space, not fitness in my own little OS fanatic space. Since I work in computers, I tend to get called by other people to fix their computers. I am pretty sure this resonates with your own experience. Are these people masochistic? How much pain can they endure? That repeatedly? Microsoft is losing the trust of its own customers. This has been going on for quite a few years, and yet only a few people switch to alternatives. And there are many viable alternatives! I think this is a typical manifestation of the Panda Principle. Windows might not be the fittest but the sheer number of users virtually guarantees that it will keep its place.

That is fine by me. I am quite happy with my very own globally optimal machine!

A bit of help to spread the word on Digg or Reddit would be very much appreciated!