Archive for March, 2006

Little manual of cloning for Io programmers

Monday, March 20th, 2006

Today, we will continue the little stroll around the Io language landscape that we started in Blame it on Io! A slow-paced introduction to the Io language. We will delve deeper into the central principle of prototype-based languages: cloning.

Cloning crash course

For the mad scientists and other evil dictators who landed here after googling “manual of cloning”, here is a crash course in object creation, à-la Io:

Movie := Object clone

original := Movie clone
original title := "Island of Lost Souls"
original year := 1933
original scenario := "Mad scientist off-shores
    development of mutant race"

remake := original clone
remake title = "The Island of Dr Moreau"
remake year = 1977

remakeOfRemake := remake clone
remakeOfRemake year = 1996

Object creation in Io, like Polka, is a two step dance: 1. you clone, 2. you specialize.

Differential inheritance

As we have just seen, cloning an object is done by sending it the clone message. This allows for a very elegant implementation of the Singleton pattern:

GeorgeClooney := FamousActor clone
GeorgeClooney clone = GeorgeClooney

Bad news for my wife… but I digress.

To really understand cloning, it is necessary to look at how an object is implemented. I will spare you the details of the struct definition (see vm/IoObject_struct.h if you really want to know) by remaining at a conceptual level. Io objects are made of two ingredients: slots and prototypes. Slots are stored in a hash table and represent both the data and the methods of the object. Prototypes are stored in a list and are the ancestors of the object — more on why this is a list in the next section.

How does this all fit together? Well, when a message is sent to an object, its name is used to query the slots hash table of the object. If no slot is found, the operation is repeated for each prototype in the prototypes list. When a matching slot is found, it is activated. If no slot is found, the forward slot is activated. If no forward slot is found, then you get an error. This will clarify:

JeanClaudeVanDamme := FamousActor clone
JeanClaudeVanDamme act

will raise: Exception: Importer Object does not respond to 'act', whereas:

KeanuReeves := FamousActor clone
KeanuReeves forward := method(
    writeln("I know ", call message name))
KeanuReeves kungfu

will even pretend that Keanu Reeves can act.

Now, back to our first question, what happens when an object is cloned? Very little actually: a new object is created with an empty slots hash table and a prototypes list containing the target of the clone message. By default, a clone delegates everything to its original. Once it is created, a clone is usually specialized by adding slots in its slots hash table. The clone only stores the difference with its original. This is called differential inheritance.

Clones sharing memories with their original, not only makes bad hollywood movies (The Island and Replicant), it is not always desirable:

Movie := Object clone
Movie cast := List clone

would lead to all movies sharing the same cast! You could fix the problem by initializing the cast slot to a freshly cloned list on every Movie clone, but this would be error-prone and very inelegant. This is where the init slot comes into play. It is activated, if it is present, just after the creation of the clone. Let’s rewrite the previous piece of code:

Movie := Object clone
Movie init := method(self cast := List clone)

Note that self is required, otherwise the cast slot will be local to the method.

Future directions

This is the current state of affair in Io. But things might change, Steve Dekorte, the mad scientist behind Io, identifies 5 different cloning strategies with regards to slots:

  1. nil: the slot is set to nil in the clone — init := method(self foo = nil)
  2. shared: the slot is not created in the clone — the current default behaviour
  3. shallow: the slot is created with the original’s value — init := method(self foo = foo)
  4. deep: the slot is created with a clone of the original’s value — init := method(self foo = foo clone)
  5. very deep: the slot is created with a deep clone of the original’s value — init := method(self foo = foo deepClone)

It is probable that support for each strategy will be added to Io at some point. For now, you have to use the init slot to implement these strategies.

Dynamic inheritance with Protos

Let’s come back to the prototypes list. From what I have explained so far, it seems objects can have only one parent, the original. A list looks superfluous. Great news for my mad-scientists readers: the answer does not follow the laws of biology. Io makes it possible to add and remove parents to an object, at runtime, using appendProto, prependProto, removeProto and other related methods. These methods simply manipulate the list of prototypes associated with an Io object.

LukeSkywalker := Person clone
LukeSkywalker appendProto(Orphan)
LukeSkywalker fightsWith(DarthVader)
DarthVader tellsSecretTo(LukeSkywalker)
LukeSkywalker removeProto(Orphan)
LukeSkywalker appendProto(DarthVader)

And this concludes today’s stroll! I hope you found it informative and entertaining. I would like to thank everyone on the #io channel for your patience in dealing with my questions. Most, if not all, of the information contained in this post comes from chats on IRC and from the Io documentation. I’ll only claim the silly examples!

Blame it on Io! A slow-paced introduction to the Io language

Wednesday, March 15th, 2006

No posts in a week, I have been silent too long. Blame it on Io!

You see, Io is a contagious disease. It affects programmers’ brains. The incubation period is extremely short, due to the simplicity of its syntax. But the damages are extensive, particularly for Algol-derivative programmers: you will end up messed up! Huh… I forgot to mention the vectors include reading posts about Io. I just hope your infection will not be as bad as mine. (If I do not get google hits from E.R. fans with this paragraph, what will it take?)

Yo Io!

First things first, let’s get the token “Hello World!” implementation out of the way:

"Yo Io!" print

Aside from my poor taste in greeting phrases, you probably noticed how concise and simple this statement is. At first sight, to most programmers, it appears to be written backward. Not so if you think that the "Yo Io!" object — which happens to be a string — is being sent the message print. Everything in Io is an object, and I really mean everything: lists, files, strings, numbers and even messages!

Attack of the clones

Well that is not fully accurate, to be pedantic, I should have written: everything in Io is a prototype. Indeed, Io is a prototype-based language. C++/Java/C# programmers, here comes surprise #1: Io knows nothing about classes. In class-based languages, classes describes the structure and behaviour of objects in term of methods whereas instances hold the objects’ data. Prototype-based languages do not make this distinction. A prototype is both a class and an instance. To get a new instance, you clone an existing prototype. To get a new class you add or alter the behaviour of an existing prototype. Sounds complex? Let your inner mad scientist free:

Sheep := Object clone

In English, Sheep is defined as a clone of Object — the parent of all Io prototypes.

Sheep legCount := 4

In brain-damaged English, the legCount property of a Sheep is defined as four. Non-programmers would have you believe this is equivalent to: a sheep has four legs.

MutantSheep := Sheep clone

A MutantSheep is defined as a clone of a Sheep. No surprise here, we all knew that!

MutantSheep legCount = 7

But their legCount property is set to seven! Most inner mad scientists tend to go for six legs, but my inner mad scientist is also nonconformist. I explain the difference between = and := in the next section. Just pretend they are the same for the moment.

dolly := MutantSheep clone

Here be dolly the mutant sheep! The convention is to use an upper-case character at the beginning of the name of prototypes acting as classes and a lower-case character for prototypes acting as instances.

Lots of slots

Once again, I have to confess that I have been inaccurate: I wrote legCount was a property of a Sheep whereas, in Io speak, it is actually a slot of Sheep. A slot associates a value to a name within the context of a prototype. Sheep holds one slot named legCount with the value 4. Here comes surprise #2: slots can contain variables and methods.

Sheep baa := method("bah!" print)

The baa slot on Sheep is defined as a method which prints “bah!”. Simple, no?
You noticed earlier that Io has two different assignment operators: := and =. Both of them are actually methods on Object. := is actually parsed as setSlot and = is parsed as updateSlot. From the method names you probably inferred the difference: := creates a new slot on a prototype from a name and a value whereas = updates an existing slot with a new value.
Io interprets the previous code sample like this:

Sheep setSlot("baa", method("bah!" print))

There is madness to Io’s methods

In case you were wondering, method is a method on Object that creates anonymous methods. The self-referential aspect of what I just wrote surely enlightened you, not. Rewind. Slow motion. The method method takes a list of parameters. All but the last one are arguments to the created method. The last argument is the body of the method. In this example, I define the growMoreLegs slot on MutantSheep to be a method with one parameter:

MutantSheep growMoreLegs := method(n,
    legCount = legCount + n)

Apart from the slightly surprising syntax, I hope this all sounds pretty familiar so far. Things are about to change big time! Surprise #3: a method can control when and if to evaluate an argument. This lets us define flow control constructs (if, unless, while, etc.) as methods. Most of them already exist on Object of course. One construct is missing though and we will implement it here. But first, we will botch the job:

Object unless := method(cond, then, else,
    if(cond, else, then))

The problem: cond, then and else are evaluated when unless is called. We do not want that. What we want is: unless cond is true, evaluate then, otherwise evaluate else. In Io this is done like this:

Object unless := method(
    if(call eval argAt(0),
        call eval argAt(2),
        call eval argAt(1)))

Update: see jer’s comment for a better implementation
What looks like a nice trick, is actually extremely powerful. Lisp programmers laugh in the background, they’ve used it successfully since 1958! I will not explore this vast topic any further in this post — I must save some ammunition for future posts!
By the way, I hope you appreciated the fact that Io is a dynamic language: we just added a method to the Object prototype!

Lobby for Io!

Before concluding, I would like to point out that Sheep and MutantSheep are defined like slots. But what is the underlying prototype? In Io, this is the Lobby. So what? Well, Lobby ultimately derives from the Object prototype, like all other prototypes. But where do you think we found the Object prototype? Bingo! It is a slot in the Protos slot of Lobby. I suggest you read this again. You are not dreaming: Object is both a slot of a slot in the Lobby and an ancestor of the same Lobby. How is that for another self-referential bit of self-reference?

Time to conclude, I hope this introduction will entice you to dwell deeper into this really amazing language. Believe me, we have only scratched the surface!
For a more complete, formal and accurate presentation of Io, check out The Io Programming Language.
Visit the official website or join the posse on irc.freenode.net#io.

Development best practices: coding standards and the “20 lines” rule

Wednesday, March 8th, 2006

Defining a coding standard

A coding standard is a set of conventions regulating how code must be written. These conventions usually cover formatting, naming and common idioms. Choosing them can be a painful process as it frequently leads to endless and passionate discussions between developers (how many hours have been lost arguing the positioning of curly-braces in Algol derived languages). Yes, us developers have an acute sense of aesthetics when it comes to our code — probably only rivalled in intensity by our legendary lack of aesthetics in the clothing department. In my experience, the best way to select a set of conventions is to have one experienced programmer act as a dictator. After all, no coding standards has ever pleased everyone.

Enforcing the standard

Coding standards are like speed limits: they are A Good Thing™ but they are useless unless they are respected. There are several ways to enforce the rules. Code reviews are probably the least efficient (don’t get me wrong: having code reviews is a very valuable practice, but not to enforce a coding standard). Using a code formatting tool when code is checked in the source control management system is more efficient. However, these tools rarely cover naming conventions and common idioms. Most IDEs can be configured to warn when the conventions are not followed and format the code on the fly. But the most efficient way is to use a dedicated tool and integrate it in the build system — particularly when using continuous integration. The best example I have come across is Checkstyle. Not only does it integrate easily in a build system but it can also be used as a plug-in in most Java IDEs. One stone, two birds and no more escaping the coding standard!

Benefits

At the very least, adhering to a coding standard allows a developer to read a piece of code written by another developer while focusing only on the content, because she is familiar with the form. If you think this is mildly important during the development phase of a project, think about the maintenance phase. Undoubtedly, enhanced readability leads to better maintainability. What is less obvious is that a coding standard can also improve the design of a piece of software. In my experience, no rule has a greater impact on design than what I call the “20 lines” rule.

The “20 lines” rule

It goes like this: No method body shall be longer than 20 lines. Period. Yes, that’s all: no big formula and no esoteric concept, just a simple little rule that absolutely anyone can understand. Its power lies in its simplicity.

You see, 20 lines of code is more than enough to express an idea concisely and it is also about the right amount of information the eye and the brain can scan and comprehend without having to do too much double-takes — not that I have data to back this up, but 20 lines happens to snugly fit in most screens/windows used while coding, thus reading the code does not require interrupting the train of thought by scrolling. If the only impact of this rule was enhanced readability, you would accuse me of false advertisement.

Most benefits from this rule actually cascades from the side effects of breaking down a large method in a set of smaller methods. The most dramatic effect is the reduction of the cyclomatic complexity of the code. Here comes the esoteric concept! Crudely, the cyclomatic complexity of a piece of code is a measure of the number of possible execution paths through it. Less execution paths means less execution paths to test. Bingo! We have just simplified our unit tests. The emergence of smaller methods improves abstraction and reusability. Indeed, very frequently these methods can be invoked in other parts of the code base, thus reducing redundant code.

I am certainly forgetting other effects but I hope you get my point: the “20 lines” rule is a really low hanging fruit. I frequently realize how far reaching its usage as had on the code I have written. Finally, the “20 lines” rule could have been Fight Club’s ninth rule: it does not accept exceptions, as exceptions to this rule are precisely when this rule should be applied.

Give it a go, you will be surprised by how quick you’ll start reaping the rewards!

RubyBeans revisited

Monday, March 6th, 2006

I have refactored the Rubybeans, a short example of Ruby metaprogramming code to make better use of Ruby’s metaprogramming API. I would recommend reading the original post to get the background information (in short: developing JavaBeans is a repetitive task that would greatly benefit from metaprogramming, RubyBeans are an illustration of how this would be done in Ruby). Since this post is also intended for developers not familiar with Ruby, I have tried to be quite verbose while explaining the code.

Buckle your seatbelts! Here we go:

class RubyBean
  def initialize
    @listeners = []
  end

I have just declared the RubyBean class. The initialize method is called everytime a RubyBean is created. It initializes the listeners instance variable to be an empty array. In Ruby, instance variables are preceded with @. Note that, instead of using [], I could have used Array.new. The square bracket notation is more commonly used in Ruby. In case it isn’t obvious, the listeners array will contain the property change listeners registered on this RubyBean.

Now, I add two methods to manage the listeners: one to register a listener on this bean, one to unregister a previously registered bean.

  def register_listener(l)
    @listeners.push(l) unless @listeners.include?(l)
  end

  def unregister_listener(l)
    @listeners.delete(l)
  end

In register_listener method takes one parameter l, the listener to be registered. Its code is almost plain english: push the listener at this end of the list of registered listeners, unless it is already there. The unregister_listener method is even simpler: remove the given listener from the list of registered listeners.

The code so far hasn’t used any metaprogramming features. But things are about to change!

In what follows I define a property class method. It is intended to be used by RubyBean subclasses to define their properties. For every argument it receives, the property class method will add 2 methods to the current RubyBean subclass: a getter and a setter.

  def RubyBean.property(*properties)
    properties.each do |prop|

In Ruby, a class method is defined by using the prefixing the method name by the class name: def RubyBean.property (this is equivalent to Java’s static modifier when it is used before a method). The *properties notation lets me capture a variable number of arguments as an array. The first line of the body loops through each element of this array. The name between the pipe characters is the name of the variable used to hold the current element in the block. For the time being, you can consider a block to be akin to an anonymous inline function. Blocks in Ruby come in two flavours do |param, other_param|... end and { |param, other_param|... } (note that if no parameter is required by the block, there is no need to use the pipe characters).

      define_method(prop) {
        instance_variable_get("@#{prop}")
      }

We’ve just written our last ever getter! The accessors in Ruby do not follow the JavaBeans conventions: the getter for a property is simply the property name (getName() is name) and the related setter is the property name followed by equal (setName(...) is name=). The previous code defines a method named using the content of the prop variable. The body of this method is defined by the block between the curly braces (this block has no parameters, so no need for pipe characters this time). The block will return the value of the instance variable named prop. In a double quoted string, #{…} is replaced by the evaluation of its content, here prop. So if the value of proc is name, "@#{prop}" will be "@name".

The following code defines how we declare setter methods. It also uses the #{...} notation, this time to concatenate the property name to equal, to create method name of the setter.

      define_method("#{prop}=") do |value|
        old_value = instance_variable_get("@#{prop}")
        return if (value == old_value)

The block for the setter accepts one argument, the value, as indicated between the pipe characters. As per Java conventions, we first make sure that the new value is different from the current value. If the values are the same, no need to notify the listeners, so we simply return.

Otherwise, we loop on all the listeners and notify them of which property is changing from what value to what value:

        @listeners.each { |listener|
          listener.property_changed(prop,
                                   old_value,
                                   value)
        }
        instance_variable_set("@#{prop}", value)
      end

Once all listeners have been notified, we set that property to its new value.

And finally, a bit of tidying up:

    end # loop on properties
  end # end of property method
end # end of RubyBean class

I hope this has enlightened you! Let me know!

Binary Search Tree sauce Ruby (part 1)

Thursday, March 2nd, 2006

This post is the first of two focusing on the implementation of a binary search tree in Ruby. This first part will deal with adding elements to a binary search tree. The second part will cover binary tree traversals.

Let’s jump straight in! After writing the unit tests, I have a first go at the code (I chose to go for a linked implementation rather than an array based one). I end up with this:

class BinaryTree
  def add(value)
    if @root.nil?
      @root = BinaryTreeNode.new(value)
    else
      @root.add(value)
    end
  end
end

class BinaryTreeNode
  def initialize(value)
    @value = value
  end

  def add(value)
    if value < @value
      if @left.nil?
        @left = BinaryTreeNode.new(value)
      else
        @left.add(value)
      end
    else
      if @right.nil?
        @right = BinaryTreeNode.new(value)
      else
        @right.add(value)
      end
    end
  end
end

Green light, the unit tests pass. One thing is annoying me though: code duplication! In three places, once in BinaryTree and twice in BinaryTreeNode, I have:

if field.nil?
  field = BinaryTreeNode.new(value)
else
  field.add(value)
end

Wouldn’t it be great to have a method that writes this piece of code for a particular field? Metaprogramming, here we come! I will put this method in a module so that it can be included in both classes.

module BinaryTreeHelper
  private
  def add_or_create_node(field, value)
    if instance_variable_get(field).nil?
      instance_variable_set(field,
                            BinaryTreeNode.new(value))
    else
      instance_variable_get(field).add(value)
    end
  end
end

Now I can refactor my classes:

class BinaryTree
  include BinaryTreeHelper

  def add(value)
    add_or_create_node(:@root, value)
  end
end

class BinaryTreeNode
  include BinaryTreeHelper

  def initialize(value)
    @value = value
  end

  def add(value)
    add_or_create_node(value < @value ? :@left : :@right,
                       value)
  end
end

Pretty nice, no?