Wednesday, January 21, 2009

Blogroll, January 2009

Programming languages are a voyeuristic pursuit for me, as my daily work consists entirely of C and assembly. Nonetheless, it is quite clear that the future belongs to high-level languages. Garbage collection avoids a huge class of bugs which have vexed developers for decades. Exceptions likewise make the handling of errors simpler and more robust, and having associative arrays as a built-in capability makes these languages the obvious choice for many needs.

Criticism is often leveled at high-level languages over their performance, but languages based on a virtual machine will eventually achieve higher performance than C++. Profile-driven optimization is a well-known technique for improving software performance by optimizing the code paths which are actually used. Achieving this with C/C++ compilers is possible by running a first-pass binary and feeding its profile data back into a second compilation pass. This is such a giant pain in the ass that it is practically never done. Virtual machines, by their very nature, can collect profile data constantly while running the interpreter. The Just-In-Time compilation in a VM can take advantage of this information, resulting in straight-line, tightly scheduled code with few branches on the hot path. It is a beautiful thing... or rather, it will be once it all works.
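
To make the idea concrete, here is a toy sketch in Python of collecting a profile at runtime and switching to a specialized path once one case proves hot. It is not how any real JIT is implemented; the names ProfilingAdd, generic_add, int_add and HOT_THRESHOLD are all made up for illustration, and the "fast" path here is not actually faster. What it does show is the shape of the technique: count what is really happening, specialize for it, and keep a guard so the general case still works.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical: int/int calls seen before we specialize


def generic_add(a, b):
    """Slow path: handles every supported combination of argument types."""
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def int_add(a, b):
    """Specialized path: only valid when both arguments are ints."""
    return a + b


class ProfilingAdd:
    """Toy stand-in for a VM operation that profiles itself while running."""

    def __init__(self):
        self.profile = Counter()  # observed (type, type) pairs
        self.specialized = False
        self.fast_calls = 0

    def __call__(self, a, b):
        # Guard: the specialized code may run only if its assumption holds.
        if self.specialized and type(a) is int and type(b) is int:
            self.fast_calls += 1
            return int_add(a, b)
        # Otherwise take the generic path and keep collecting profile data.
        self.profile[(type(a), type(b))] += 1
        if self.profile[(int, int)] >= HOT_THRESHOLD:
            self.specialized = True
        return generic_add(a, b)
```

Calling an instance a handful of times with integers flips it into the specialized path, while a later call with a string fails the guard and falls back to generic_add.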

So without further ado, let's talk about blogs that talk about programming languages.


The Blogroll: January 2009
 
1) armstrong on software, by Joe Armstrong

Joe writes mostly about Erlang, a language which has been around since the mid 1980s, though part of that time was hidden within the bowels of the Ericsson corporation. Erlang is very different from other programming languages. For example, there really are no variables in the normal sense: you can give a name to a value, like 'x', but you cannot change the value of x later. The language is single assignment only. Similarly there are no loops (you can't have a loop if you cannot have a loop variable); iteration is expressed with recursion and list comprehensions instead. There is really no mutable state in an Erlang program: once created, a data element is guaranteed not to change until it is garbage collected.
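
For readers who have not seen this style, here is a rough approximation in Python rather than Erlang. Python does not enforce single assignment, so this is only a sketch of the discipline: the function names total and double_all are my own, tuples stand in for immutable data, and recursion plus a comprehension take the place of a loop with a counter.

```python
from typing import Tuple


def total(values: Tuple[int, ...], acc: int = 0) -> int:
    """Sum a tuple by recursion; no name is ever rebound to a new value."""
    if not values:
        return acc
    # Each recursive call gets fresh bindings instead of updating a counter.
    return total(values[1:], acc + values[0])


def double_all(values: Tuple[int, ...]) -> Tuple[int, ...]:
    """Build a new tuple with a comprehension; the input is left untouched."""
    return tuple([2 * v for v in values])


def try_to_mutate(values: Tuple[int, ...]) -> str:
    """Show that a tuple, once created, rejects in-place modification."""
    try:
        values[0] = 99  # type: ignore[index]
    except TypeError:
        return "immutable"
    return "mutated"
```

Python will overflow its stack on very long inputs here, since it does not optimize tail calls; the sketch is about the shape of the code, not about how it performs.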

Why does Erlang make these choices? Erlang's primary design features all revolve around making message passing very cheap and efficient. The complete lack of mutable state is one example: there is no need to copy the current values of variables into a message, as there is no chance the values will change. Data can, in principle, be freely passed by reference. The messaging infrastructure drives Erlang's reliability features, where Erlang nodes monitor other Erlang nodes and take action if they stop responding. The messaging capabilities also allow excellent scalability across multiple CPUs, which has driven Erlang's recent resurgence (with high-visibility deployments in Facebook chat and Amazon SimpleDB).
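
The link between immutability and cheap messaging can be sketched with Python threads and queues. This is an analogy only: Python threads are not Erlang processes, and the names Reading, worker and run_demo are invented for the example. The point is that because the message is an immutable tuple, the sender can hand over a reference without worrying that either side will change it afterwards.

```python
import queue
import threading
from typing import NamedTuple, Tuple


class Reading(NamedTuple):
    """An immutable message: neither sender nor receiver can alter it."""
    sensor: str
    values: Tuple[int, ...]


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Receive messages until a None sentinel arrives; reply with each sum."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        outbox.put((msg.sensor, sum(msg.values)))


def run_demo():
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox, outbox))
    thread.start()
    inbox.put(Reading("temp", (20, 21, 23)))  # a reference is enqueued, no copy
    inbox.put(None)  # sentinel: ask the worker to stop
    thread.join()
    return outbox.get()
```

If Reading held a mutable list instead, the sender would have to copy it before sending, or else risk the receiver seeing later changes.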


 
2) Headius, by Charles Nutter

Ruby is an interesting language, driven to popularity by the excellent Ruby on Rails web application framework. Ruby's primary implementation is MRI, which only recently implemented a bytecoded virtual machine. Earlier MRI versions were entirely interpreted, and this led to the development of a number of alternate implementations on various VMs. JRuby is the most interesting to me personally: it implements the Ruby language atop the Java Virtual Machine, a very mature and well-performing technology. Charles Nutter is driving the JRuby development effort, and joined the staff at Sun Microsystems to work on it full time.


 
3) Code Commit, by Daniel Spiewak

Erlang and Ruby are both dynamically typed languages: functions accept objects as arguments, and dynamically determine their type at runtime. This is certainly powerful, in that a single routine can handle a wide variety of inputs, but it also means that type errors will only be detected at runtime. Unit testing becomes essential in this environment... and I have trouble accepting the premise that I'm supposed to love doing manually that which the toolchain used to be able to do.
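
A small Python illustration of the trade-off (Python being another dynamically typed language; average and check_average are names I made up for the example): the same function happily accepts any sequence, and a wrong element type is only discovered when the code actually runs, which is why the check ends up living in a test.

```python
def average(values):
    """Return the arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def check_average():
    """A hand-written test doing the job a static type checker would do."""
    assert average([1, 2, 3]) == 2
    try:
        # Nothing rejects this call ahead of time; it fails only when executed.
        average(["1", "2", "3"])
    except TypeError:
        return "type error detected at runtime"
    return "no error"
```

Without check_average, the bad call would sit unnoticed until some code path happened to hit it.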

Daniel Spiewak writes mainly about Scala. C is a weakly typed language, where it is trivial to cast one type to another. Scala is an example of a statically and more strongly typed language, so the toolchain is able to make more guarantees about what the valid inputs are before the program ever runs. Scala is also interesting in that it is another language implemented atop the Java VM.