Development best practices: coding standards and the “20 lines” rule

Wednesday, March 8th, 2006

Defining a coding standard

A coding standard is a set of conventions regulating how code must be written. These conventions usually cover formatting, naming and common idioms. Choosing them can be a painful process, as it frequently leads to endless and passionate discussions between developers (how many hours have been lost arguing over the placement of curly braces in Algol-derived languages?). Yes, we developers have an acute sense of aesthetics when it comes to our code, probably rivalled in intensity only by our legendary lack of aesthetics in the clothing department. In my experience, the best way to select a set of conventions is to have one experienced programmer act as a dictator. After all, no coding standard has ever pleased everyone.

Enforcing the standard

Coding standards are like speed limits: they are A Good Thing™ but they are useless unless they are respected. There are several ways to enforce the rules. Code reviews are probably the least efficient (don’t get me wrong: having code reviews is a very valuable practice, but not to enforce a coding standard). Using a code formatting tool when code is checked in the source control management system is more efficient. However, these tools rarely cover naming conventions and common idioms. Most IDEs can be configured to warn when the conventions are not followed and format the code on the fly. But the most efficient way is to use a dedicated tool and integrate it in the build system — particularly when using continuous integration. The best example I have come across is Checkstyle. Not only does it integrate easily in a build system but it can also be used as a plug-in in most Java IDEs. One stone, two birds and no more escaping the coding standard!

Benefits

At the very least, adhering to a coding standard allows a developer to read a piece of code written by another developer while focusing only on the content, because she is familiar with the form. If you think this is mildly important during the development phase of a project, think about the maintenance phase. Undoubtedly, enhanced readability leads to better maintainability. What is less obvious is that a coding standard can also improve the design of a piece of software. In my experience, no rule has a greater impact on design than what I call the “20 lines” rule.

The “20 lines” rule

It goes like this: No method body shall be longer than 20 lines. Period. Yes, that’s all: no big formula and no esoteric concept, just a simple little rule that absolutely anyone can understand. Its power lies in its simplicity.

You see, 20 lines of code is more than enough to express an idea concisely. It is also about the right amount of information for the eye and the brain to scan and comprehend without too many double-takes. Not that I have data to back this up, but 20 lines happens to fit snugly in most screens and windows used while coding, so reading the code does not require interrupting the train of thought by scrolling. If the only impact of this rule were enhanced readability, you would accuse me of false advertising.

Most benefits of this rule actually cascade from the side effects of breaking down a large method into a set of smaller methods. The most dramatic effect is the reduction of the cyclomatic complexity of each method. Here comes the esoteric concept! Crudely, the cyclomatic complexity of a piece of code is a measure of the number of possible execution paths through it. Fewer execution paths in a method means fewer execution paths to test for that method. Bingo! We have just simplified our unit tests. The emergence of smaller methods also improves abstraction and reusability. Indeed, very frequently these methods can be invoked in other parts of the code base, thus reducing redundant code.
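
As a rough illustration of that decomposition, here is a minimal Java sketch. Everything in it is invented for the example: the class name OrderPricing, the helper names, the discount and tax rates and the shipping thresholds are all hypothetical. The point is only that the top-level method has no branching of its own, while each helper holds one or two decisions.

```java
class OrderPricing {

    // Top-level method: three straight-line calls, no branching of its own.
    static long total(long subtotalCents, boolean member, String country) {
        long discounted = applyDiscount(subtotalCents, member);
        long shipping = shippingCents(discounted, country);
        return addTax(discounted + shipping, country);
    }

    // One decision: members get 10% off (hypothetical rate).
    static long applyDiscount(long cents, boolean member) {
        return member ? cents * 90 / 100 : cents;
    }

    // Two decisions: free above a threshold, otherwise priced by destination.
    static long shippingCents(long cents, String country) {
        if (cents >= 10_000) {
            return 0;
        }
        return "IE".equals(country) ? 500 : 1_500;
    }

    // One decision: a made-up 23% tax applied to domestic ("IE") orders only.
    static long addTax(long cents, String country) {
        return "IE".equals(country) ? cents * 123 / 100 : cents;
    }

    public static void main(String[] args) {
        System.out.println(total(20_000, true, "IE"));  // prints 22140
        System.out.println(total(5_000, false, "US"));  // prints 6500
    }
}
```

Each helper can be covered by two or three unit tests of its own. The decisions have not disappeared, they have only been localized, so whether the overall testing effort shrinks depends on how independent the pieces really are.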

I am certainly forgetting other effects but I hope you get my point: the “20 lines” rule is really low-hanging fruit. I frequently realize how far-reaching an effect its usage has had on the code I have written. Finally, the “20 lines” rule could have been Fight Club’s ninth rule: it does not accept exceptions, as the supposed exceptions to this rule are precisely the cases where it should be applied.

Give it a go, you will be surprised by how quickly you’ll start reaping the rewards!

40 Responses to “Development best practices: coding standards and the “20 lines” rule”

  1. Prashanth Says:

    you have made no reference to commenting/documenting the code. Document your data structures and teach him how to hack your program, Document your code and teach him how to read your program :-)

  2. ozone Says:

    Prashanth, I would follow the XP school of thought: the code should be self-explanatory, and comments are usually a sign that the code needs improvement.

  3. dennis Says:

    I see comments as being helpful for two things: 1) Clarifying the *intent* of the code, rather than how it works, which can help you find bugs when those two don’t match, and 2) Explaining the context that each routine fits into. Good code can make clear how it works, but not how it fits into the larger picture.

    Now, of course you can see the larger picture if you start at the right point and work your way downward through all the method calls, but maybe you have a particular section you’re interested in, and don’t want to master the entire codebase just to understand that one piece.

    For a great example of comments like this, look at the SQLite source code.

  4. Greg Says:

    In general, a good idea, but my experience shows that people who say “there are no exceptions allowed” are usually wrong.

    In this case, I’d say that there are numerically intensive algorithms that require more than 20 lines of code, and that breaking these up into

        for each row of matrix
            call method one
            call method two
            call method three

    done solely to make a 20 line rule happy isn’t going to be very useful. For instance, consider Numerical Recipes’ singular value decomposition functions.

    If you haven’t seen these, then what you’re saying is “I haven’t seen anything I didn’t think could be broken into 20-line methods, so therefore there must not be anything.”

    One last thing: those of us who’ve been around a while may remember that some of the old Unix code was written in a this-function-must-fit-on-a-24×80-crt-screen mode, and as a consequence was in places very difficult to read.

    I agree that it is good to express ideas simply and in small, easily understood chunks. But there are ideas in the world where the _idea_ is complex, and it is more important to do the _right_ thing than to blindly follow rules. Saying “there should be very few exceptions” is a good idea; saying “there are never any exceptions” betrays a lack of experience.

    Best wishes,

    -greg

  5. ozone Says:

    Greg, I tend to agree with your analysis. The problem with saying “there should be very few exceptions” is that most developers end up believing that the code they are writing is one of these exceptions.

    I’ll take an example from a comment on reddit: “try to apply this rule in a java method which opens a file and a connection to a database — just the exception handling there will take 19 lines of code”. This guy would obviously believe he has just found one of these rare exceptions, whereas his method is typically the kind of method that this rule targets!

    This reminds me of a French expression: “l’exception fait la règle” (loosely translated: one exception makes a rule), which implies that no matter what the rule, there is always an exception.

    Thanks for the feedback!
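
The reddit example discussed in this exchange, one method that opens both a file and a database connection, can be split roughly as in the Java sketch below. It is only an illustration under stated assumptions: ImportJob, RowSink and InMemorySink are hypothetical names, and the in-memory sink merely stands in for the database side so that the sketch runs without a JDBC driver. Each resource gets its own short method with its own cleanup, and the top-level method just composes them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ImportJob {

    // Hypothetical stand-in for the database side of the example.
    interface RowSink extends AutoCloseable {
        void insert(String row);

        @Override
        void close(); // narrowed so this sketch needs no checked exceptions
    }

    // In-memory sink used only to make the sketch runnable.
    static class InMemorySink implements RowSink {
        final List<String> rows = new ArrayList<>();
        boolean closed;

        public void insert(String row) {
            rows.add(row);
        }

        public void close() {
            closed = true;
        }
    }

    // Top-level method: composes the two resource-handling steps.
    static int importRows(Reader source, RowSink sink) {
        return storeRows(readRows(source), sink);
    }

    // File side: owns the reader and its error handling, skips blank lines.
    static List<String> readRows(Reader source) {
        List<String> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    rows.add(line.trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException("could not read source", e);
        }
        return rows;
    }

    // Database side: owns the sink and closes it when done.
    static int storeRows(List<String> rows, RowSink sink) {
        try (RowSink open = sink) {
            for (String row : rows) {
                open.insert(row);
            }
        }
        return rows.size();
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        int stored = importRows(new StringReader("alpha\n\nbeta\n"), sink);
        System.out.println(stored + " rows stored, sink closed: " + sink.closed);
        // prints "2 rows stored, sink closed: true"
    }
}
```

A real version would have to decide how read and write failures are reported to the caller; wrapping IOException in an unchecked exception is just the shortest choice for a sketch.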


  6. I agree with the XP philosophy, but comments need not be only to explain the code. The code is usually the solution, comments should explain the problem.

  7. Greg Says:

    I agree that exceptions certainly present opportunity for abuse, and it’s difficult to know where to draw the line.

    At times I’ve been the “standards dictator” you refer to in your blog, and I’ve generally followed a philosophy that says: doing something outside of the standards must be substantially more effort than following them.

    There are several ways to accomplish this, but I focused on two. First, make them easy to follow. This might involve using popular, easily remembered, or truly convenient standards. It might involve providing or making tools that automatically bring code into conformance, or editors that automagically help people write conformant code.

    However, the other end of the spectrum is also important: make not following the standards pretty inconvenient. For instance, let’s look again at the 20 line rule. If you say to developers that you’ll require, say, a two page memo describing reasons for the exception, alternatives considered, risks and benefits of the exception, etc. then you’ll find that the vast majority of time, people will just follow the standard, and the times such a memo crosses your desk, you’ll have someone pretty well convinced the exception is necessary.

    Finally, the fact that Java and its database access methodology require half a page of exception handling might be a problem with the way Java does things (it’s pretty verbose sometimes), but one wonders: why open a file _and_ a database connection in the same method? That doesn’t seem irreducibly complex to me. So I’m on your side there.

    Thanks for an interesting conversation…

    Best wishes,

    -greg

    p.s. Nice use of WordPress, by the way.

  8. Prashanth Says:

    “Good comments don’t repeat the code or explain it. They clarify its intent. Comments should explain, at a higher level of abstraction than the code, what you’re trying to do.” (Code Complete, McConnell)

    Yes, good code needn’t be commented thoroughly, but comments help in giving a deeper insight into the code. Also, the 20 line rule cannot be held fast across languages. Compare a 4G language (read: Python) and C ;-)

    Good article. Kudos.

  9. ozone Says:

    Thanks Prashanth,

    I don’t fundamentally disagree with you regarding comments. I tend to track them and refactor the code so as to make them superfluous. As Martin Fowler and Kent Beck put it — in Martin Fowler’s excellent book: “Refactoring: Improving the Design of Existing Code” — while discussing code smells: comments often are used as a deodorant.

    I have successfully used the “20 lines” rule in C, C++, Java and Ruby. In functional programming (Lisp, Scheme, Haskell, etc.), 20 lines is probably too much.

  10. Prashanth Says:

    20 lines might be a little on the lower side for a language like C for most professional projects. C++, Java, etc may do fine. But I agree a 20 line function is a visual treat :-)

  11. PinkyDead Says:

    Standards Suck!!!

    Well, they don’t actually – but the big problem with standards is that people forget why they are there.

    I love the idea of being able to look at two source files written by different developers and getting the same feeling of style continuity from both source. And there is nothing worse than looking at two pieces of code that only differ in the spacing around operators.

    The core purpose of standards has to be (a) interoperability between current users (of the source) and (b) future maintainability by others. Rules in standards that go to coding style, as opposed to simple layout style, are a bad thing.

    Take the 20 line rule. If you are writing a device driver with low-level operations (or similarly a mathematical/engineering/financial calculation), then 20 lines might actually be too generous.

    On the other hand, writing an XML parser that simply drops down through an if-else structure to make simple calls would not benefit from being split into multiple smaller methods.

    Another problem – I can write the XML parser in 5 lines and the device driver in 50 lines. The parser would be unreadable, and the device driver will give you RSI from your mousewheel.

    Who are your standards directed at? newbies (totally green or new to your way of thinking) – you’ve gotta have them. And the problem with them, is that they pick up very quickly on the standard but have no idea why it is there. Then what do they do? Sacrifice the original intent of the standard for the standard itself – writing totally indecipherable code in neat 20 line blocks. Slaps all round!

    Rigid rules are fine for whitespace and the like – and checkstyle is great for keeping that in line. But automated tools do not work as a replacement for skill and experience (and a little common sense).

    Nothing wrong with 20 lines as an aspiration – sucks as a rule. A method should be a single unit of (useful) functionality – not a fixed length ASCII string.

    “Fight club” is a film – not the real world.

  12. ozone Says:

    PinkyDead, the “20 lines” rule is a pragmatic — and tested — solution to a real world problem: most of us write poorly abstracted code. You can decide to ignore the problem and write “if-else” sequences or you can try to come up with solutions.

    The “20 lines” rule does not come from a top-down approach; it emerged from our experience. Three years ago, I started imposing it on the team, simply because it was a checkstyle option and I wanted to see the impact it would have. Along the way, we tried a lot of other options. Most only impacted the layout of the code, which is what we expected. Some had a negative effect on the quality of the code and we dropped them. Now, three years later, while chatting with a colleague, we looked back at the rules we used and one of them stood out by both its simplicity and its beneficial impact: the “20 lines” rule. I wrote this post to share this with whoever is willing to read it and maybe even try it.

  13. pinkydead Says:

    As a rule of thumb – 20 lines excellent. As an error condition in checkstyle – no way.

    You are, of course, right: most of us do write poorly abstracted code. That’s the point though. Well abstracted code is not necessarily less than 20 lines long, and badly abstracted code is not necessarily more than 20 lines long.

    Just to test your theory, I turned on the warning on my own production code. 15000 lines of code, only 20 methods over 20 lines long. Only 5 > 30.

    Let’s say my code is well abstracted – I didn’t write it that way because 20 lines is a good method length, I did it because that’s a skill I have developed over many years. The 20 lines is a ‘symptom’ of code, not a cause.

    I think I might leave that 20 lines filter on – but I won’t turn it into a rule.

    Don’t get me wrong – it’s a good point, well made. But it’s many the good point that has been picked up by an idiot and turned into a fool’s charter.


  14. [...] I previously wrote about this but just in case you missed it: you need a coding standard to make your code homogeneous because reading code is hard enough, you do not need the distraction of multiple coding styles. Once you have a coding standard, use tools to enforce it. In Java, checkstyle rocks and if you use eclipse, be sure to check out this plugin. There are similar tools in most languages, use them! At first configure them to be a complete and utter pain and then relax the rules you believe are damaging. [...]

  15. Ben Says:

    Sorry, I’ve got to say, I find the 20 lines rule rather boneheaded. I see two major problems, both of which have been commented on. First, you can get around the “rule” in ways that don’t improve the code at all: pack more statements on one line, or break up the code arbitrarily into pointless sub-functions. Second, some things you do are just inherently complex, and performance may still be necessary. (My experience is as a professional gaming and embedded programmer, so performance is always an issue.) Try doing lighting with different specular, diffuse, and ambient components (and RGB factors) in both light and the materials. Try to do a software rasterizer for semi-transparent textured triangles using only fixed point. Hell, try to have a main event loop with a lot of different event types.

    In short, consider that the world of programming is considerably larger than the corner you’ve seen so far.


  16. Ben,
    yes, you can get around the 20 lines rule easily by packing statements on a single line. But I do not agree that “pointless” sub-functions impact on performance. Most, if not all, compilers inline these functions. I do agree that in certain domains, 20 lines is too severe a constraint as discussed with Greg.
    By the way, a good few years back, I used to code demos (the good ol’ days!) so I am acquainted with the domain you describe. Could we engage openly without hiding ourselves behind attitude and derogatory comments?

  17. Greg Miller Says:

    Ozone said: ‘The problem with saying “there should be very few exceptions” is that most developers end up believing that the code they are writing is one of these exceptions.’

    While this may be true, not allowing exceptions is just plain wrong (pardon me for being so frank about it). I’ll agree that I’ve run into a lot more cases where I wished the “20-line rule” were enforced than cases where I needed to violate it. Still, the general idea that smaller functions produce better code is flawed. I have recommended, more than once, that some developers break their functions up into smaller pieces, and they would come back with functions like “DoIt()”, “DatabaseStuff()”, etc. After seeing this a few times I realized I can’t make someone a better programmer by giving them general rules of thumb.

    There’s just no substitute for a good code review by a good programmer. And when someone fails the code review, the response isn’t a standards document, it’s a sincere and specific recommendation on how to improve the code in question.

  18. Gerard V Says:

    Am I the only person who has a problem with the fact that it took this guy 50+ lines “to express [his] idea concisely” about how there is no exception to the rule of 20 lines of code “to express an idea concisely”? I wrote my own “counter-point” to this flawed theory on my own blog, if anybody is interested, as I didn’t want to go over 20 lines in this comment. It’s a bit harsh, but not as harsh as the demand for total conformity found in this article.

  19. Gerard V Says:

    Greg… You make a great point. It sounds like you work with a team of programmers… If that’s the case, then have you considered “cleaning up” the code after the app has been completed (or better yet, during the testing/optimization phase) rather than wasting time doing it during the coding phase of the project? If your programmers are very good at what they do, albeit a little sloppy (for a lack of a better word) in getting it done, and you also believe in this 20-Line limit nonsense, then this would be the perfect time to focus on it. Just a suggestion.


  20. Thanks Greg for expressing your disagreement without being disrespectful. It is evidently getting quite rare.

  21. twifkak Says:

    “Let’s say my code is well abstracted – I didn’t write it that way because 20 lines is a good method length, I did it because that’s a skill I have developed over many years. The 20 lines is a ‘symptom’ of code, not a cause.”

    True, but as the saying goes, “Fake it ’til you make it.”

  22. Paddy Says:

    At a previous company I instigated a 25 line max function limit based on reading that people can handle around five different ‘things’ and that function bodies are often grouped or separated into areas. Five ideas, in five-line groups, led to our 25 line limit, which we applied to great effect.

    Your 20 line rule is far too restrictive.

  23. Brooks Moses Says:

    There’s something I don’t understand about this 20-line limit, which I’d like to. In my codes — and I should note that I’m working largely on numerical simulation codes that are for my own use, at this point; thus, my corner of the programming world is often a weird and remote one — I tend to have a fair number of functions that fit the following pattern:

    * 10 lines of code doing thing A, resulting in having a half-dozen variables defined.
    * A one-line comment.
    * 10 lines of code doing thing B, which uses the half-dozen variables from thing A, and defines a half-dozen new variables.
    * A one-line comment.
    * 10 lines of code doing thing C, which uses the half-dozen variables from thing B and a couple from thing A, and defines a half-dozen more variables.
    * A one-line comment.
    * 10 lines of code doing thing D, which uses the half-dozen variables from thing C and a couple from A and B, and defines the final output that I care about.

    Now, it would obviously be trivial to divide this up into four separate functions. The one-line comments are practically equivalent to the function names, even — all it takes is translating “/* Find the first point on the boundary */” to “findFirstBoundaryPoint()”, or whatever.

    I can see a lot of costs of doing this. These functions will have about a dozen parameters each, which would have to get passed. And thus would have to get declared, each time, which is rather a lot of lines of code. I could conceptually put them all in a single structure, but that would be a one-use structure that I’d never use anywhere else. Any time I change the algorithm in a way that adds or removes a variable, I’ll need to change the variable declarations in each of the functions that use it, the overall function, the function calls, and the function definitions. If I decide that the last step of thing A really should be moved to the end of thing B, then I’ve got a terrible tangle to undo.

    On the other hand, I see very few advantages. Potential for code reuse isn’t one of them; I’m Not Going To Need It ™. Having some of the variables be local to a “thing” might catch a typo or two. It makes unit testing a bit easier, since I could write separate tests for the individual things rather than putting “debug” statements in the main code.

    I have to presume, from the fact that the “20 lines” opinion is so prevalent, that I’m missing something, and that there are some tremendous advantages that I’m simply not seeing. What are they?


  24. Brooks,
    I do not think you are missing something. The rule is not as efficient in the domain you work in as it is in others. Greg pointed out previously that numerically intensive algorithms frequently break the 20 lines rule and trying to make them fit might be counterproductive.
    In your particular case, if I chose to follow the rule, I would use a structure to hold the local variables, as you suggested, and inline the smaller functions, to limit any performance impact. This might help improve an eventual maintainer’s understanding of the algorithm. Having simpler unit tests is very valuable in that regard too. And of course, simpler tests means less error prone tests.
    Most of these arguments are probably void if the algorithm you implement has a widely known form that gets obfuscated by the 20 lines rule. This would clearly be an impediment to maintenance.
    I believe the rule’s soft spot is in Enterprise computing where ease of maintenance is paramount.
    Thank you very much for spending the time to write such a detailed and relevant comment.
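
The “structure to hold the local variables” option discussed in this exchange could look roughly like the Java sketch below. It is a toy under stated assumptions: the ZScore class, the Work holder and the four step names are all hypothetical, and a z-score computation stands in for a real simulation step. Each step reads what earlier steps left in the holder, so no step needs a dozen parameters.

```java
class ZScore {

    // One-use holder for the intermediate values shared between steps.
    static final class Work {
        final double[] samples;
        double mean;
        double variance;
        double stdDev;

        Work(double[] samples) {
            this.samples = samples;
        }
    }

    // Overall function: the former one-line comments become method names.
    static double zScore(double[] samples, double x) {
        Work w = new Work(samples);
        findMean(w);
        findVariance(w);
        findStdDev(w);
        return standardize(w, x);
    }

    // Thing A: arithmetic mean of the samples.
    static void findMean(Work w) {
        double sum = 0.0;
        for (double s : w.samples) {
            sum += s;
        }
        w.mean = sum / w.samples.length;
    }

    // Thing B: population variance, using the mean from A.
    static void findVariance(Work w) {
        double sumOfSquares = 0.0;
        for (double s : w.samples) {
            double d = s - w.mean;
            sumOfSquares += d * d;
        }
        w.variance = sumOfSquares / w.samples.length;
    }

    // Thing C: standard deviation, using the variance from B.
    static void findStdDev(Work w) {
        w.stdDev = Math.sqrt(w.variance);
    }

    // Thing D: final output, using values from A and C.
    static double standardize(Work w, double x) {
        return (x - w.mean) / w.stdDev;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(zScore(data, 9)); // prints 2.0
    }
}
```

The cost raised in the original question is still there: Work is a one-use type, and the order in which the steps must run is implicit. What it buys is that each step can be tested on its own by filling in a Work by hand.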


  25. [...] Development best practices: coding standards and the “20 lines” rule [...]

  26. Prashanth Says:

    I found another link with some good pointers. It reminded me of this post. http://www.perforce.com/perforce/papers/prettycode.html


  27. Very good link indeed! Thanks Prashanth.

  28. Losers Says:

    “The most dramatic effect is the reduction of the cyclomatic complexity of the code. Here comes the esoteric concept! Crudely, the cyclomatic complexity of a piece of code is a measure of the number of possible execution paths through it. Less execution paths means less execution paths to test. Bingo!”

    There’s a reason why we call you morons code monkeys. The avg programmers who create broken ass buggy software. Suppose I havea function that calls 20 functions and those functions call 20 functions and those functions call 20 functions. Add that up and tell me how many test cases that is retard.


  29. Great to see anonymous posters being so courageous about their opinion.

    Your comment is frankly stupid: 20 functions that call 20 functions that call 20 functions is simply 60 functions to test with a couple of mock objects. Compare that with the number of execution paths in your humongous method solution!

    Next time try to think before you comment.

    • David Murphy Says:

      The extra functions will contain the same number of execution paths in total. Also, you may create extra needless paths because you have taken functions that might have been logically cohesive with loose coupling and turned them into the opposite.

  30. Ben Says:

    Hi Oliver,

    One year later! You seem to have misinterpreted my post in a number of ways.

    First, splitting into sub-functions can be pointless because the splitting may be arbitrary and/or confusing, not because of performance losses.

    Second, my list of examples was just to show there were a number of cases, one might even say whole domains (as you’ve later recognized) where 20 lines is a poor rule.

    Finally, I didn’t think I was showing too much attitude or being derogatory. I attacked the idea, not you. The only ‘attack’ on you was my opinion that you were generalizing from one (small?) domain of programming, and you didn’t seem to recognize/admit that.

    I’m all for polite discourse, and I’m glad the 20 line ‘rule’ worked for you. I think it’s too arbitrary, language and domain dependent, and a very poor substitute for code reviews. Short functions should be a result of following other design guidelines (one purpose, few/no side effects, easy testability) rather than a goal in and of themselves.

    And as can be read in Code Complete, the few studies that have been done suggest that functions of between 50 and 200 lines were best, which is quite a way from 20. The studies are fairly old, but should be food for thought nevertheless.


  31. [...] Development best practices: coding standards and the “20 lines” rule [...]

  32. Dave Jarvis Says:

    Eventually code and comments will be split into two panes. Source code in one and corresponding comment on the other. Managers will love it because the logic can be wholly expressed in a human-readable language.

    Flip it ahead to 1:40 to see an example.

    • Nagora Says:

      “Eventually” was about 30 years ago when Forth did exactly this – you were (and still are in some current Forths) automatically allotted one screen of comment space for each screen of source code. These screens were called “shadows” and depending on the editor either were shown side-by-side with the corresponding code or there was a key to flick between them.

      Having space allotted like this “tempted” coders to actually use it for commenting.

      Additionally, it was difficult (although not impossible) to code single functions (or “words” in Forth lingo) that were more than 15 lines long and the recommended length was 2-4 lines.

      Forth was a major player back in the day and used by NASA for many probes. It still shows up in places now and has a reputation for creating solid reliable code. In modern terms, it is a domain language defining language – you start with Forth and design a language with it that suits your problem and then you solve the problem with that language. It’s rather fun.


  33. Dave,
    nice video but I can’t imagine working in such a poor UI: 1:40 to get to the code? No thank you!
    Separating the comments from the code seems rather clumsy: how do you know which comment corresponds to which line of code? In the video, the system highlights the comment for the current line of code. Would I have to move the cursor to every line to figure out the related comment?


  34. Arguing that 20 lines is better because it fits nicely in a window is a bit of antiquated thinking (in my view), born in the VT100 days. When I went to college in 1982, the rule in my Pascal classes was that a routine was not allowed to be longer than 24 lines. I found out later that this was so the graders could view the whole function on a single screen, so this fits your reasoning. However, when I am editing code, my coding window has at least 50 lines, if not more. This has been true since I got my first Amiga in 1986.

    I do strive for simplicity and brevity in my functions, but I seem to violate my own guideline at least once per major project, when the “fix” would be more work to maintain than the original code.


  35. [...] no do-everything APIs, no mammoth functions/methods (cf. this nice post); [...]

  36. David Murphy Says:

    I started programming in 1981 in a government place whose code went back to the early 1970s. The 20 line rule applied then (it was for Cobol; I can’t remember if the same rule applied to the assembler code). It was there because the code fit onto one page of standard computer printout (we didn’t have screens, we used punch cards and paper).

    It was a dumb rule then and a dumb one now, it delivered little benefit but made logically coherent code artificially split across modules. If you applied the ancient rules of code modularisation (Coherence and Cohesiveness) you would not need such artificial rules.

    I suppose I ought to be surprised to see you recreating ancient coding rules again, but given most programmers only (re) discovered the necessity for testing recently and still don’t have any rules to guide them about handling exceptions I suppose I should not be.

    • David Murphy Says:

      Pah, of course I meant Coupling and Cohesion (beats head against wall). I must be getting old…


    • Yes, 20 lines is as dumb and crude as humans. It is needed because programmers do not understand, or do not want to admit to, any human limits… Perhaps the pure reality is logical coupling and cohesion vs. artificially breaking code into modules based on the limits of the human eye and brain. Perhaps the exception is when the coupling and cohesion exceed human limits; then documentation is needed to help extend those limits. Documentation is also needed to understand compact code… and to help people learn to code…


Comments are closed.
