Archive for the 'Software development' Category

Reinventing the wheel and associated pains

Tuesday, October 17th, 2006

Have you ever visited Starke County, Indiana? I haven’t. I wouldn’t even have heard of it if it wasn’t for the little misadventure I’m about to relate. What lead me to Starke County is a rather bad habit that afflicts most of us, developers. No, not the lousy t-shirts… not the heavy coffee drinking… not the sleepless nights spent hacking… do we have that many bad habits? The one I want to single out today is our tendency to reinvent the wheel. I don’t really know why but we all believe we can do better than whatever framework, library or system currently exists. The pursuit of perfection is a laudable ambition but all too often, it is the way to pain, because, as this post will highlight, we underestimate the effort required to get a decent wheel.

Spinnin’ wheel

I’ll start by contradicting what I just wrote: reinventing the wheel is a good thing. Take web-frameworks. They come in all shapes, sizes and colors. The variety is so abundant there has to be one that fits your needs perfectly — finding it is the topic of another post. Why would you want to come up with your own implementation? As James Brown famously said please, please, please, please to which, far lesser known fact, he added not another flamin’ web-framework!. He was being polite. Yet, when Rails came out, it shook the web-framework arena with its simplicity and elegance. You might disagree with me regarding Rails’ qualities, but you cannot dispute the seismic impact it has had on web development and on the popularity of dynamic languages. If this does not convince you, take a look at Seaside, KPAX or Django. Obviously reinventing the wheel has some advantages, at the very least it fosters ideas and progress. Heck, Linux started as a reinvention of existing UNIX systems.

Wheel of misfortune

But there are contexts when reinventing the wheel is a very, very bad idea. The worst context I can think of is in a corporate environment when the wheel is not your core business. Why? Well, there are several reasons:

  1. you are going to spend time and resources building this new wheel,
  2. you are going to spend time and resources testing this new wheel,
  3. you are going to spend time and resources documenting this new wheel,
  4. you are going to spend time and resources maintaining this new wheel.

See a recurring pattern? To be fair, the first point is rarely underestimated. What tends to happen though is that the promised benefits seem to justify the building cost because the others are underestimated.

The testing cost is always neglected. Think about it: how can you ensure your code receives at least the same exhaustive testing as the one it is supposed to better. Consider the constant QA bashing endured worldwide and around the clock by Struts, Tapestry or Spring. Sure, you will still find the occasional bug. You will lose time and patience working around the issue, or even better, you will fix it and submit a patch. Will someone do this for your code?

The documentation cost… well… what documentation? Up-to-date documentation is the unicorn of software development: a beast of legends. Mind you, if you have a good unit-test coverage, it can serve as documentation and it is guaranteed to be in sync with the code. This still falls short of the documentation of the wheel you are replacing: books, articles or blogs. What’s your plan?

The maintenance cost is directly related to the previous two. The more tested your wheel is, the least it is likely to break down and require maintenance. If it does, the more documentation there is, the easiest it will be for a third party to resolve the issue. Let’s face it, chances are the people implementing your new wheel today will be gone tomorrow, there are so many wheels to reinvent elsewhere!

Back to Starke County

You are probably wondering what is the link with Starke County, so here is my little anecdote. It all started by a bug report sent by the sales team. They were demonstrating the product to some potential customers in New-York. Naturally, they had set the system’s time-zone to EDT/EST in accordance with the local time. The time on the command-line interface to the system was accurate but, somehow, the time on the web front-end was off by one hour. The issue was logged and we got cracking straight away.

To give you a bird’s eye view of the concerned parts of the system, the command line interface is implemented in C/C++, whereas the web front-end is Java based. It turned out the system uses its own home-made time-zone database for reasons I have yet to fathom. This database maps time-zones set by the administrator to UNIX or Java time-zones depending on the interface. We quickly spotted that EDT_EST was mapped to US/Indiana-Starke.

We ran simple tests on various platforms and where surprised to discover that the latest version of the JVM was giving the erroneous off-by-one answer. Hmm… weird. At that point we started to suspect something fishy. We even checked out Google earth to find out where the elusive Starke, unsuccessfully. Eventually, I stumbled upon this:

[…] other entries represent smaller regions like Starke County, Indiana, which switched from central to eastern time in 1991 and switched back in 2006.

It turned out the time-zone database on the Linux system was not up to date but the JVM one was. Java was right all along. Whoever populated the time-zone mapping database made a very poor choice when they picked US/Indiana-Starke as a representative of EDT_EST, especially when other candidates would be America/New_York, US/Eastern, EST5EDT… but what do I know.

Better still, the time-zone mapping table should have gotten the boot. There are wheels you should not even think about replacing. Without our home-made “improved” wheel, the choice of US/Indiana-Starke or America/New_York or anything else for that matter would have been left to the end-user, untouched by the system. Granted, the 25,000 potential users living in Starke County would still have suffered the bug, but the 20,000,000 potential users in New-York State would not have been affected — and I’m not even counting the other millions of people leaving on eastern time.

Where the wheel comes full circle

I hope you will remember this little anecdote next time you are thinking about reinventing the wheel. If you do, slow down and weight your options carefully. Is there an existing framework, library or system that already provides the functionality? To an acceptable degree? If not, is your employer fully aware of the risk and cost associated with your proposal? Make it clear! If you are self-employed, are you willing to take the risk and pay the cost? Are the benefits worth it? You get the picture: think it through. If you reach the conclusion that it is all worth it, you’re either mad or incredibly lucky, because you will get to fulfill your inner-geek deepest desire: inventing your very own wheel!

technorati tags:,

Advertisements

Producing and maintaining high-quality code

Wednesday, April 26th, 2006

So much bad and ugly code is being spewed out every day all over this planet that entire websites are dedicated to the preservation of the worst specimens. In this post, I will show you simple ways to produce and maintain high-quality code because, firstly, this is one way of achieving Internet stardom that you probably want to avoid and secondly and more importantly, you know how painful it is to have to maintain bad code.

I will start by a disclaimer, using Douglas Adams words:

The problem with designing something completely foolproof is to underestimate the ingenuity of a complete fool.

You can follow all the tips in this post and still end up with bad code, but you would have to be a very crafty fool!

1. Understand the problem

If this sounds like a duh-ism, that is because it is one. If you do not understand what the code is supposed to do, you will not get it right. Before you start coding, whether you face a blank sheet or whether you are maintaining code, sit down and think: what problem am I trying to solve here.

2. Write unit-tests

I could almost have numbered it 1bis. Unit tests show and validate how the code is supposed to be used. Write the tests first if possible — unless you can get your hands on a time-machine, this is not an option when doing some maintenance. The quality of your interfaces will improve dramatically because you will design them from their user’s point of view.

3. Talk to your cardboard friend

How many times were you stuck on a problem, decided to seek help from a nearby colleague, explain the situation and bingo! the solution became obvious. Your colleague hasn’t even had time to pronounce one word, he probably didn’t even understand the question. Articulating the problem solved it. My nearby colleague calls this the “cardboard friend” because people drawn on cardboard tend to have the property of listening without interrupting.

4. Enforce a coding standard

I previously wrote about this but just in case you missed it: you need a coding standard to make your code homogeneous because reading code is hard enough, you do not need the distraction of multiple coding styles. Once you have a coding standard, use tools to enforce it. In Java, checkstyle rocks and if you use eclipse, be sure to check out this plugin. There are similar tools in most languages, use them! At first configure them to be a complete and utter pain and then relax the rules you believe are damaging.

5. Use code analysis tools

Use any tool you can find to study and analyze your code. If the language you are using is compiled, turn on all warnings on the compiler: use -Wall when you use gcc, open the preferences in Eclipse and activate most warnings for the Java compiler (be pragmatic). Use any lint-clone you can put your hands on for your language. Here are the tools I recommend for Java: JDepend to measure the quality of my abstractions and to avoid cyclical dependencies, PMD to identify potential bugs and potential optimizations, FindBugs to (coincidentally) find bugs. I should mention Checkstyle again since it can also detect bug patterns. The situation is simple: the more code analysis tools you run on your code the more you will be able to fix problems before they bite you.

6. Use a continuous integration system

I will unleash my secret finishing move on the next guy who tells me: “but it works on my machine”. A continuous integration system is a referee. If the ref says the build fails because it is broken or because some test fails, it’s because it is. So, go fix it. The ref is always right even when “it works on my machine”. How does it do that? At regular intervals, it tries to build the latest version of your code and if anything goes wrong, it fires notifications — generally emails — to whoever is concerned. Receiving these notifications is not such a bad news: you have identified a problem in your code. Use your continuous integration system to run the code analysis tools and fail the build if these tools report problems. You can use CruiseControl, AntHill, DamageControl or write your own with cron+make+sendmail, but please use a continuous integration system.

7. Write less code

Aspire at writing code like a Zen master writes a Koan:

Monk: Does a dog have Buddha nature or not?
Zhaozhou: Wú.

There are two principles to assist you in your quest for minimalist code: KISS and DRY. “Keep It Simple Stupid”: do what is required and no more. Focus on a clean and simple implementation. “Don’t Repeat Yourself”: avoid duplicating code. Refactor mercilessly. This will decrease the size of your code-base, which is a Good-Thing™ because the less code you write, the less bugs you write.

8. Performance is overrated

I will probably have a contract on my head after this point, that’s what blogging is about, no? Writing fast code is a two step process:

  1. get the code to work,
  2. stop.

In case this was unclear: do not optimize the code. In the vast majority of cases, the code you are about to optimize is not a bottleneck. Use a profiler on a system running realistic data and identify the bottlenecks. Only then can you start working out how to eliminate them. Very often you will not have to change a single interface, only the implementation will require changes.

9. Read, read, read

Read books about software development. You must read the GoF book about Design Patterns and put it to good use. You must pick up Martin Fowler’s Refactoring, Code Complete by Steve McConnell. There are countless others. One of the first book to really open my eyes on developing good code was Writing Solid Code by Steve Maguire.
Read books about other programming languages too, you will frequently discover programming concepts that you might be able to port to your environment.

10. Use common sense

Common sense is your best friend to beat bad code and it’s free! Common sense is what dictates the previous nine points.

I hope you found these tips interesting and entertaining too! I am looking forward to your comments. I leave you with Zhaozhou to conclude this post: Wú!.

technorati tags:, ,

Behaviour Driven Development with RSpec

Monday, April 24th, 2006

Back in February, I told you about Behaviour Driven Development using RSpec. Things have moved in very interesting directions since then.

First, and this is probably the most important, BDD is getting quite some attention, most notably at Canada on Rails. You should check out the Ruby on Rails podcast episode from the 17th of April, where Dave Astels makes an appearance alongside Tim Bray.

On the usage front, RSpec’s DSL has seen some tremendous progress. It reads like plain English. I have not decided yet whether I find it too wordy or simply astonishing! What do you think of this example.

Finally, RSpec now has a good-looking content-packed website complete with documentation, tutorial, examples and all.

There is simply no barrier to its adoption, it just needs traction within the development community. This is where you and me come into the picture. Let’s make RSpec a success, let’s all give it a go: gem install rspec.

You will be amazed at how much RSpec will impact the quality of your code because it changes your perspective on unit testing.

Comments are welcomed and encouraged! That’s all folks, see you next time!

Development best practices: coding standards and the “20 lines” rule

Wednesday, March 8th, 2006

Defining a coding standard

A coding standard is a set of conventions regulating how code must be written. These conventions usually cover formatting, naming and common idioms. Choosing them can be a painful process as it frequently leads to endless and passionate discussions between developers (how many hours have been lost arguing the positioning of curly-braces in Algol derived languages). Yes, us developers have an acute sense of aesthetics when it comes to our code — probably only rivalled in intensity by our legendary lack of aesthetics in the clothing department. In my experience, the best way to select a set of conventions is to have one experienced programmer act as a dictator. After all, no coding standards has ever pleased everyone.

Enforcing the standard

Coding standards are like speed limits: they are A Good Thing™ but they are useless unless they are respected. There are several ways to enforce the rules. Code reviews are probably the least efficient (don’t get me wrong: having code reviews is a very valuable practice, but not to enforce a coding standard). Using a code formatting tool when code is checked in the source control management system is more efficient. However, these tools rarely cover naming conventions and common idioms. Most IDEs can be configured to warn when the conventions are not followed and format the code on the fly. But the most efficient way is to use a dedicated tool and integrate it in the build system — particularly when using continuous integration. The best example I have come across is Checkstyle. Not only does it integrate easily in a build system but it can also be used as a plug-in in most Java IDEs. One stone, two birds and no more escaping the coding standard!

Benefits

At the very least, adhering to a coding standard allows a developer to read a piece of code written by another developer while focusing only on the content, because she is familiar with the form. If you think this is mildly important during the development phase of a project, think about the maintenance phase. Undoubtedly, enhanced readability leads to better maintainability. What is less obvious is that a coding standard can also improve the design of a piece of software. In my experience, no rule has a greater impact on design than what I call the “20 lines” rule.

The “20 lines” rule

It goes like this: No method body shall be longer than 20 lines. Period. Yes, that’s all: no big formula and no esoteric concept, just a simple little rule that absolutely anyone can understand. Its power lies in its simplicity.

You see, 20 lines of code is more than enough to express an idea concisely and it is also about the right amount of information the eye and the brain can scan and comprehend without having to do too much double-takes — not that I have data to back this up, but 20 lines happens to snugly fit in most screens/windows used while coding, thus reading the code does not require interrupting the train of thought by scrolling. If the only impact of this rule was enhanced readability, you would accuse me of false advertisement.

Most benefits from this rule actually cascades from the side effects of breaking down a large method in a set of smaller methods. The most dramatic effect is the reduction of the cyclomatic complexity of the code. Here comes the esoteric concept! Crudely, the cyclomatic complexity of a piece of code is a measure of the number of possible execution paths through it. Less execution paths means less execution paths to test. Bingo! We have just simplified our unit tests. The emergence of smaller methods improves abstraction and reusability. Indeed, very frequently these methods can be invoked in other parts of the code base, thus reducing redundant code.

I am certainly forgetting other effects but I hope you get my point: the “20 lines” rule is a really low hanging fruit. I frequently realize how far reaching its usage as had on the code I have written. Finally, the “20 lines” rule could have been Fight Club’s ninth rule: it does not accept exceptions, as exceptions to this rule are precisely when this rule should be applied.

Give it a go, you will be surprised by how quick you’ll start reaping the rewards!

Behaviour Driven Development

Monday, February 6th, 2006

A few days back, I was listening to Dave Astels talk on Behaviour Driven Development on the Agile Toolkit Podcast. It struck a chord!

As a brief overview, Behaviour Driven Development (BDD) derives from Dave Astels experience with Test Driven Development (TDD). In TDD you write your tests before you write your code. Dave argues that this has been misunderstood (see A new look at TDD) and that TDD isn’t about testing the code but about specifying its behaviour. He suggests that changing the vocabulary might improve the quality of TDD, and coins BDD as a way forward.

This resonates with my experience so well! How many times have I seen developers misusing unit tests. Here is an example of what I mean: there is an idiom in Java that says one unit test class per class in the system, and one test method per method in the tested class (Eclipse’s “JUnit test case” wizard does that – note that I am not trying to bash Eclipse here!). This has to be one of the main culprits for the bad usage of JUnit. A TestCase should be about one case, not all use cases! So there should be multiple test cases per class to test. With BDD, the shift in vocabulary tends to force the developer to split the test case in multiple specifications contexts. The tests are not expressed in terms of assertions (prefixed with assert) but in terms of duty (prefixed with should). This helps reinforce the purpose of writing tests first.

Some might say that BDD is only a thin layer on top of TDD. This might be true but I think it is missing the point. BDD is not a paradigm shift, it is an evolution of TDD based on experience. BDD is a pragmatic refactoring of TDD.

I would recommend having a look at RSpec, Dave’s BDD framework in Ruby.