Thoughts on Systems

Emil Sit

Let’s Improve Our Code

New Year’s is a good time to set intentions for the coming year. Many people come off the holidays with the intention to exercise more, but if you’re reading this blog, you’re probably a programmer (if you’re not, consider signing up for Code Year…), so let’s set an intention about our programming. But first, a musical interlude.

Earl Hines was a jazz pianist; in this 9-minute video, he describes how his early playing evolved.

As you watch it, notice how he not only describes and demonstrates how his style evolved, he also describes why. For example, he talks about how his melodic line was drowned out in the larger bands so he picks up playing in octaves (doubling up the notes).

In his TED talk, David Byrne generalizes the idea of environment influencing music by talking about how music has always evolved to fit the architecture in which it was performed: from how the ethereal sounds of early church music were driven by the open acoustics of churches to how the smaller rooms of the 18th and 19th centuries allowed for the more complex rhythms and patterns of classical music to be heard. (Watch it here.)

Can we as programmers reflect similarly about our programming styles? What influences the way our programs look? And more importantly, perhaps, why should we care?

For music, Byrne argues that the evolution of styles was driven by the needs of the audience and the acoustics of the performance hall. Understanding these consciously allows contemporary musicians to make more informed choices about what and how they perform.

As programmers, our programs must communicate: with the compiler, of course, so that it will render our code executable, but also with the human readers of our code, be that our future selves or our colleagues. So to write better programs—programs that communicate their intent more concisely and clearly, as opposed to those that execute more efficiently or that are more clever—we should consider what affects the structure and readability of the programs we write.

The frameworks and mechanisms available to us most obviously affect the structure of code. Write a program in a system based on callbacks, such as the async XML HTTP request that underlies AJAX, and you will find yourself with code that chains callbacks together, preserves state in various heap objects, and requires that callbacks be called from the right contexts to work properly. Write code for a threaded system and your code will have all manner of locks and constructs to control memory write visibility. Regular expressions can be called from Perl with the overhead of only m//, so it is easier to write text munging code in Perl than almost any other language.

Our methodologies, tools, and processes—how we program—also determine how our code looks. Test-driven development will tend to produce stronger and more usable abstractions. Stream of consciousness programming results in a mess. Using an editor that supports refactoring patterns will make it more likely that you will refactor. Code review or pair programming will similarly result in code improvements, simply because you had to communicate while writing the code. (Even just commenting your code helps in this regard.) The end result of these practices is code that is more understandable.

Our audience (that is, our teammates) also affects our code. This is the role of engineering culture. What will your teammates accept versus some ideal? Getting code committed to the Linux kernel requires detailed commit messages, a well-structured patch series, and surviving code review on the kernel mailing list. Getting code committed to your personal project requires nothing beyond what you ask of yourself.

We have control over these factors. We can vary our tools, our practices, our choice of frameworks, and influence our team culture. If we are framework or API developers, we can consciously evaluate what code we induce our users to write and improve on what we provide to simplify their lives, and facilitate their communication and self-expression.

This year, let’s set an intention to examine our code and improve how it reads. Let’s experiment and play with the factors under our control to see which choices work better for our teams. Ask your teammates whether one way or another works better for them. Spend some time analyzing your own code and consider how it got that way.

I’ll try to share some of what I learn from my team at Hadapt and I’m curious to hear what you learn from yours.

Git Is More Usable Than Mercurial

Once upon a time, I used Mercurial for development. When I moved to VMware, people there seemed to favor Git and so I spent the past few years learning Git and helping to evangelize its use within VMware. I have written about why I chose Mercurial, as well as my initial reactions upon starting to use Git. Hadapt happens to be using Mercurial today and so I have been re-visiting Git and Mercurial.

What I wrote about Git and Mercurial in 2008 is still true: Git and Mercurial are similar in many respects (for example, you can represent the same commit graph structure in both) and they are both certainly better than Subversion and CVS. However, there are a lot of differences to appreciate in terms of user experience that I am now in a better position to evaluate.

In using Mercurial, I find myself oddly hobbled in my ability to do things. At first, I thought that this might be because some things are simply done differently in Mercurial, but at this point I think that Git’s design and attention to detail make it more usable than Mercurial.

There are three “philosophical” distinctions that are in Git’s favor:

  1. Git has one branching model. Mercurial has several that have evolved over time; Steve Losh has a comprehensive essay describing ways to branch in Mercurial. The effect of this is that different Mercurial users branch in different ways and the different styles don’t really mix well in one repo. Git users, once they learn how branching works, are unlikely to be confused by branches.

  2. Git has names (refs) that don’t change unexpectedly. Every Git commit you care about has a name that you can choose. Some Mercurial commits that you might care about do not have a name. For example, the default branch in Mercurial can have multiple heads, so it interprets -r default as the tip-most commit. Unfortunately, that commit will vary depending on who has committed what to which head (and when you see it).

    Further, Git exposes relative naming by allowing you to refer to the branches in remote repositories by name, without affecting your own names.

    Putting this together, consider what happens after you pull in Mercurial. Your last commit used to be called default but after the pull, default is something from the upstream. Your commit is a separate head that now has no name. In Git, your master doesn’t move after a fetch and the remote’s branch is called origin/master.

    Git even tracks changes to which commit each name refers to in a reflog. You can easily refer to things that the name used to refer to. In Mercurial, branch names don’t have reliable meanings, and it doesn’t track them.

  3. Git commands operate in a local context by default. Mercurial commands often operate on a repository context. For example, git grep operates on the current sub-directory of your work tree, hg grep operates on your history. The Git analog of hg grep is using the log pick-axe; the Mercurial analog of git grep is to use ack, or if you must, something like hg grep -r reverse(::.) pattern . (Seriously?)

    Another example is the log command. Git’s log command shows you the history of the commit you are on right now. Mercurial’s log command shows you something about the whole repository unless you restrict with some combination of -b and -f. Combined with Mercurial’s way of resolving branch names to commits, it becomes very difficult to use hg log to compare two heads or explore what has changed in another head of the same branch.

    More often than not, I care about things in the current tree more than how things are in some random other branch that I am not working on, and Mercurial makes it hard to do that.
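The naming behavior in item 2 and the two kinds of search in item 3 can be tried in a scratch directory. What follows is a minimal sketch rather than a definitive recipe: the repository names (upstream, mine), the file contents, and the throwaway demo identity are all made up for illustration, and it assumes a reasonably recent Git is installed.

```shell
#!/bin/sh
# Sketch: what Git's names do across a fetch, plus git grep vs. the pick-axe.
# Repository names, file contents and the demo identity are illustrative.
set -e
cd "$(mktemp -d)"

# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# An "upstream" repository with one commit on master.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master   # name the unborn branch explicitly
echo one > file.txt
git add file.txt
git commit -q -m "first"
cd ..

# Clone it and make two local commits on master.
git clone -q upstream mine
cd mine
echo "local 1" >> file.txt
git commit -q -a -m "local change 1"
first_local=$(git rev-parse master)
echo "local 2" >> file.txt
git commit -q -a -m "local change 2"
before_fetch=$(git rev-parse master)
cd ..

# Meanwhile, upstream gains a commit of its own.
cd upstream
echo two > other.txt
git add other.txt
git commit -q -m "upstream change"
upstream_tip=$(git rev-parse master)
cd ..

cd mine
git fetch -q origin

# master still names my commit; the upstream work has its own name.
after_fetch=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)

# The reflog remembers what master pointed to before the latest commit.
previous=$(git rev-parse 'master@{1}')

# git grep searches the tracked files in the current work tree...
grep_hits=$(git grep -l local)
# ...while the pick-axe (log -S) searches history for changes to a string.
pickaxe_hits=$(git log -Stwo --format=%s origin/master)

echo "master before/after fetch: $before_fetch / $after_fetch"
echo "origin/master:             $remote_tip"
echo "master@{1}:                $previous"
echo "git grep -l local:         $grep_hits"
echo "git log -Stwo:             $pickaxe_hits"
```

The point to look for in the output is that master reports the same commit before and after the fetch, while origin/master names the upstream commit.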

There are other usability issues that I’ve found that are more detail-oriented than philosophical. I’ll note a few here.

hg log doesn’t display the full text of the commit message unless you hg log --debug. This is an unfortunate disincentive to writing good commit messages.

hg log -p doesn’t pay as much attention to merge commits as Git does; the help for hg log reads:

log -p/--patch may generate unexpected diff output for merge changesets, as it will only compare the merge changeset against its first parent. Also, only files different from BOTH parents will appear in files:.

whereas git log has a variety of options to control how the merge diff is displayed, including showing diffs to both parents, removing “uninteresting” changes that did not conflict, or showing the full merge against either just the first or all parents of the merge commit.
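To make those options concrete, here is a small sketch that builds a conflict-free merge and counts how many per-file diffs each presentation produces. The branch names, file names, and demo identity are assumptions chosen for the example; -m, --first-parent, and --cc are the standard git log options.

```shell
#!/bin/sh
# Sketch: how git log can present the diff of a merge commit.
# Branch names, file names and the demo identity are illustrative.
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
echo a > a.txt
echo b > b.txt
git add a.txt b.txt
git commit -q -m "base"

# A topic branch changes b.txt while master changes a.txt.
git checkout -q -b topic
echo "b on topic" >> b.txt
git commit -q -a -m "topic change"
git checkout -q master
echo "a on master" >> a.txt
git commit -q -a -m "master change"

# The two sides touch different files, so this merge is conflict-free.
git merge -q -m "merge topic" topic

# Count the per-file diff headers on stdin (grep -c exits 1 on zero matches).
count_diffs() { grep -c '^diff ' || true; }

# -m: a full diff against every parent of the merge.
all_parents=$(git log -1 -p -m --format=%s | count_diffs)
# -m --first-parent: only the diff against the first parent.
first_parent=$(git log -1 -p -m --first-parent --format=%s | count_diffs)
# --cc: combined diff that drops changes taken unmodified from one side.
combined=$(git log -1 -p --cc --format=%s | count_diffs)

echo "diffs with -m:                $all_parents"
echo "diffs with -m --first-parent: $first_parent"
echo "diffs with --cc:              $combined"
```

Because neither side of this merge needed manual resolution, the --cc view should come out empty, which is the “uninteresting changes removed” behavior described above.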

Both Mercurial and Git have lots of configurable options; Git has a thin veneer over editing a config file in the form of the git config sub-command. Mercurial involves editing a file even if just setting up your initial username or enabling extensions. I often wound up editing Git config files directly, but having the commands was nice for sharing instructions with others.
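As a rough illustration of that veneer, the sketch below sets a few values with git config in a scratch repository and then reads them back both through the command and from the file it edited. The repository name, user details, and alias are invented for the example.

```shell
#!/bin/sh
# Sketch: git config as a thin layer over the repository's config file.
# The repository name, user details and alias are illustrative.
set -e
cd "$(mktemp -d)"
git init -q scratch
cd scratch

# Write settings without opening an editor.
git config user.name "Demo User"
git config user.email demo@example.com
git config alias.lg "log --oneline --graph"

# Read a setting back through the same command...
configured_name=$(git config user.name)
# ...or look at the file that the command edited.
alias_line=$(grep 'lg = ' .git/config)

echo "user.name is:   $configured_name"
echo "in .git/config: $alias_line"
```

These are the kind of one-liners that are easy to paste into instructions for a teammate, which is harder to do with “open this file and add a section”.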

Git support for working with patches natively is better. Mercurial supports e-mailing and applying patches, but oddly the extension for sending out patches is built in (patchbomb) but the extension for importing from an mbox (mbox) is not. There’s no direct analog of git apply; instead you have to use a patch queue. Patch queues are okay, but branches and well-integrated rebase/e-mail/apply support are much nicer than patch queues: you don’t need to manually find some .hg/patches/series file and edit it to re-order stuff.
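For readers who have not used Git’s patch workflow, here is a minimal sketch of the round trip: export a commit with git format-patch, check it with git apply, and import it with git am. The repository names (src, dst), the file contents, and the demo identity are assumptions made for the example.

```shell
#!/bin/sh
# Sketch: moving one commit between repositories as a patch file.
# Repository names, file contents and the demo identity are illustrative.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q src
cd src
git symbolic-ref HEAD refs/heads/master
echo hello > greeting.txt
git add greeting.txt
git commit -q -m "base"
cd ..
git clone -q src dst

# Make a commit in src and export it in mbox format.
cd src
echo "hello again" >> greeting.txt
git commit -q -a -m "extend greeting"
git format-patch -1 --stdout > "$work/change.patch"
cd ..

cd dst
# git apply --check only reports whether the patch would apply cleanly.
git apply --check "$work/change.patch"
# git am applies it as a commit, keeping the author and message.
git am -q "$work/change.patch"
applied_subject=$(git log -1 --format=%s)
applied_last_line=$(tail -n 1 greeting.txt)

echo "HEAD in dst:            $applied_subject"
echo "last line of the file:  $applied_last_line"
```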

I could write more and indeed many people have written about Git and Mercurial—you can explore my bookmarks about git for some of the better ones. Let me close here with three interesting features in Mercurial 2.0:

  • the new largefiles extension allows users to not transfer large files down until they are needed;
  • subrepos can be Git or Subversion in addition to Mercurial;
  • revsets allow you to search your history in very flexible ways.

Overall, I feel that Git is significantly more usable for day-to-day development than Mercurial. I’d be curious to hear if you think the opposite is true.

A New Adventure

Friday, 4 November, was my last day at VMware.

I started at VMware in 2008, working on a project that has now become VMware’s Horizon Mobile. Last year, I switched to working on the latest release of VMware’s vCloud Director.

VMware has a lot going for it as a place to work. For example, it has:

It wasn’t an easy choice to leave.

Last month, I became aware that a startup in the big data space was moving to Boston. I’d been wondering about life outside VMware and this opportunity seemed just about perfect. So I’m beginning a new adventure at Hadapt. As an early employee, I imagine I’ll be doing a little bit of everything. I hope to combine the skills and knowledge I built up from my graduate work and the practical experience of delivering enterprise software at VMware to help Hadapt build a powerful, scalable, data analytics platform and make Hadapt a successful company.

I’m excited to get started and I hope to share here with you some of my experiences as I go.

Rules for Development Happiness

Inspired by Alex Payne’s Rules for Computing Happiness, some rules for having happy developers and being happy as a developer.

  • Use version control. (See The Joel Test.) In particular, use a distributed version control system (like Mercurial or Git). This ensures you can commit offline and also conduct code archaeology offline.
  • Have a correct and fast incremental build (e.g., non-recursive Make or Gradle) to avoid this.
  • Have a system for testing your changes in a safe environment prior to code submission.
  • Avoid dependencies on system tools. Different developers tend to have different systems and hence different versions of tools.
  • Be able to work offline. Offline may mean when you’re on a plane, but it may also happen when the office network goes down. Both happen. Notably, the latter happens even when you work on a desktop with a wired connection. (It’s been pointed out to me that the network going down can be a good team-building experience.)
    • Be able to build offline. That means having all build dependencies cached locally.
    • Have all e-mail cached locally. Don’t be unable to find those key instructions someone mailed you just because GMail is restoring your mail from tape. Helpful tools here are isync or offlineimap. Index your mail with mu. (Or configure Thunderbird/Apple Mail/etc to keep everything offline.)
    • Be able to send mail offline; e.g., have it queue locally for delivery when the network comes back. But, make sure you keep a copy locally in case the hotel’s WiFi is transparently re-directing out-bound SMTP connections to /dev/null. (This really happened to me.)
    • Have other documentation cached locally. (Use something like gollum for your wiki.)
    • If you work somewhere with a shared-storage home directory, make sure you can log in when the network is down!

Store Hudson Configuration in Git

For any kind of server, it’s a good idea to keep its configuration in some sort of version control system. Hudson is a pluggable continuous integration system. Recently, I was trying to set one up and was wondering about the best way to store Hudson’s configuration in version control (StackOverflow summary). The most complete answer is a post on the Hudson blog about how to keep Hudson’s configuration in Subversion; there are also plugins like a nascent SCM Sync configuration plugin. But, the former is very Subversion specific and the latter does not seem particularly mature. So, to understand how to do it in your workflow, there are two things to consider.

First, which files are relevant? Hudson puts configuration, run-time state, source code and build output all in the same sub-directory (called HUDSON_HOME). Second, relatedly, since normally you edit Hudson’s configuration through the GUI, when should you commit changes? Should it be automated (e.g., nightly at midnight) or manual (e.g., ssh into the server and manually commit)? I’ll answer those questions with an implementation in Git but you can translate the information easily to your preferred VCS.

Identify relevant files by using the following .gitignore file:

This ignores the uninteresting files and will allow git status to show you interesting new files. Note that I prefer to actually commit the binaries of plugins since I don’t want to rely on outside sources (namely, the mirror network) having the particular version of the plugin that I was using for the given configuration files. To use this if you are installing a new Hudson server, you can just

cd $HUDSON_HOME/.. # Default is /var/lib
rm -r hudson
git clone git://gist.github.com/780105.git hudson
# Don't forget to chown hudson hudson as appropriate for your environment

before starting Hudson for the first time. Then once it has started, run git commit to track the default config that Hudson creates.

The second question is when. The Hudson blog’s recommendation is to create a Hudson job that runs nightly at midnight to check for differences and automatically commit them. I prefer manually committing the changes on the server and then pushing them. This allows me to identify specific functional changes (using git add -p) and commit them individually. If you want to do it automatically, simply write a script or add a job that will

git commit -a -m "Automated commit of Hudson configuration"
git push

once you set up an appropriate origin.
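If you go the automated route, a slightly more defensive version might look like the sketch below. It is an assumption-laden illustration, not a tested deployment: the function name commit_hudson_config is hypothetical, and unlike the two-line version it also stages untracked files with git add -A (still subject to .gitignore), skips the commit when nothing has changed, and pushes only if an origin remote exists.

```shell
#!/bin/sh
# Sketch: commit Hudson configuration only when something has changed.
# The function name is hypothetical; pass the HUDSON_HOME path as $1.
# The body runs in a subshell so the caller's working directory is untouched.
commit_hudson_config() (
    cd "$1" || exit 1
    # Stage new, modified and deleted files; .gitignore still applies.
    git add -A || exit 1
    # Exit status 0 here means nothing is staged, so there is nothing to do.
    if git diff --cached --quiet; then
        echo "no changes"
        exit 0
    fi
    git commit -q -m "Automated commit of Hudson configuration" || exit 1
    # Push only if an origin remote has been set up.
    if git config --get remote.origin.url >/dev/null; then
        git push -q origin HEAD || exit 1
    fi
    echo "committed"
)
```

A nightly cron entry or Hudson job could then call commit_hudson_config /var/lib/hudson, adjusting the path to wherever your HUDSON_HOME lives.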

Once you have this set up, you can even use something like Chef to automatically pull down updated configuration that you manage and test elsewhere and restart the Hudson server when necessary. Then you can re-create your Hudson server in case of failure at any time!

Programming Without Fear

This past weekend, I attended Gil Broza’s seminar on Programming Without Fear, organized by the Greater Boston Chapter of the ACM’s Journeyman Programmer initiative. The seminar was as advertised, and covered:

For anyone with more than a passing interest in agile, the material Gil presented (covered in the links above) will not be new.

The benefits of the seminar came from two things. First, Gil presented the information in a somewhat “formal” framework: a taxonomy of code smells, a set of refactoring patterns, a pair of mnemonics (PRICELESS unit tests and TRUST your refactoring process) to help remember basic techniques. This gives someone new to the material an organized set of knowledge to internalize. Second, Gil has prepared a series of exercises, interspersed with the lecture-y sections, that seminar participants work through in pairs, designed to reinforce the theoretical frameworks with practical experience. Even as someone moderately experienced with these concepts, the exercises are useful in that they focus on the fundamentals and force you to actively strengthen those fundamentals. (The weakest section, I thought, was the one on mocking which received insufficient exposition and dropped the class directly into jMock, which was a bit opaque.)

Gil is not the most exciting or funny teacher but he kept the attendees engaged by teaching with a Socratic flavor—he presented examples and solicited audience evaluations, allowing the audience to interact to reach conclusions. The practical exercises were followed by group de-briefs. This encouraged the audience to stay engaged and better absorb the material.

My main worry about the techniques is the overall reliance on Eclipse (or another IDE) as a developer’s assistant: while the tooling is certainly convenient, it makes me worry about Java and about whether the use of tools and wizards weakens developers who may never learn how to do things themselves.

What I really enjoyed was the experience of actually developing and refactoring with the protection of a unit test suite and learning techniques to perform refactoring without more than a moment or two of compiler errors. This was in sharp contrast to my normal refactoring experience of making a top-level change and then following all the compiler warnings until the work is done. Now if only every codebase I worked on came with such a set of tests…

How to Install ThinkUp on NearlyFreeSpeech

Gina Trapani and ExpertLabs have put out ThinkUp, a cool tool for tracking replies to your posts on Twitter. As of September 2010, ThinkUp has a nifty drop-in web-based installer, much like WordPress. Simply grab ThinkUp 0.007 or later, unzip it somewhere that your PHP/MySQL-enabled web server can get at, and it’ll prompt you through the installation.

In response to Gina’s post asking for help testing/hacking this long-weekend, I ran through this in about 10 minutes on NearlyFreeSpeech. Here are some quick tips where NFSN’s set-up is a bit different than what is expected by the default installer:

  • After you unzip the ThinkUp dist, run chgrp -R web _lib/view/compiled_view and chmod -R g+w _lib/view/compiled_view so that the templating engine can cache its views.
  • Make sure you have a MySQL process enabled in your NearlyFreeSpeech control panel. A basic MySQL process costs $0.02/day but you can share the process with your WordPress database. Spin up phpMyAdmin in the right-hand sidebar and create a user called thinkup and make sure to check off the option to create a database with the same name and grant all rights to that user. Generate a random password, and copy that password.
  • In the ThinkUp database configuration section, enter thinkup as the user name and database name and paste your generated password. Open the advanced section and change the database host from localhost to your database host name. It’ll be something like username.db; mine, for example, is sit.db.
  • ThinkUp will fail to write the configuration file due to perms but helpfully offers the ability to copy and paste a file. Select the text in the config text box and go back to the terminal where you unzipped the dist. In the thinkup directory, cat > config.inc.php, paste and then Ctrl-D to save the file.

Check your e-mail for the activation link and configure your account. You’ll need to register your installation as a Twitter application and paste in the consumer key and consumer secret. The config page will send you to the Twitter registration page and tell you the callback URL to provide. Leave ThinkUp as a read-only application and leave the ‘Use Twitter for login’ unchecked.

That should do it!

For more details and more up-to-date instructions, check out the ThinkUp wiki.

Understanding SpringSource and the Spring Framework

In light of recent announcements like vmForce or the SpringSource/Google App Engine integration, you may be wondering what the Spring Framework is, precisely. What does the SpringSource company provide?

According to their homepage, SpringSource is in the business of “eliminating enterprise Java complexity” and is a leader in Java application infrastructure and management. That’s not very concrete, and so I don’t feel it is particularly helpful, especially if you are not a J2EE/JEE (Java Enterprise Edition) developer. In this post, I’ll talk about SpringSource in general and focus on the Spring Framework. Note that while I work for VMware (which owns SpringSource) and use the Spring Framework (commonly referred to as Spring) at work, I am not part of our SpringSource division nor do I have any particularly special access to the innards of SpringSource. I did get to take the Core Spring training for free, but it is only after 5 months of programming with Spring that I’ve started to understand the SpringSource philosophy.

SpringSource products let you write code that is as focused as possible on the needs of your application, and as little as possible on the boilerplate or hassle of dealing with different underlying environments (e.g., dev, test, production may have different database backends) or infrastructures (e.g., GAE, vCloud). This is the core value that underlies SpringSource, but it is only explored indirectly via its various instantiations in the SpringSource literature and product line.

The Spring Framework (aka Spring) provides glue. Spring provides glue in a relatively uniform manner, so that once you understand the basic approach(es), you can apply it to interfacing with different components. From the documentation, Spring seems to do everything, but at the same time, when you try to use it, you may feel that it seems to do almost nothing. It may be useful to compare the Spring Framework to the Debian Linux distribution: Debian provides a nice out-of-the-box experience with a uniform mechanism for managing software, and in particular, alternative software packages that can provide a common service. But to get at the power of the underlying packages, you must learn how to configure and use them. Likewise, Spring does not actually provide many services on its own. It does not free you from having to learn how to write a unit test, access a database, manage a messaging system, or implement security. Instead, it makes it possible for you to write code to do these things in a somewhat generic manner, so that your code is tied as little as possible to any one underlying package.

Understanding these two key points will help you make sense of the variety of things written about Spring.

The core glue provided by Spring is its dependency injection support, also known as the “inversion of control” or IoC container. This means that, instead of class Foo explicitly instantiating an object implementing interface Bar, Foo will have a constructor argument or setter that accepts a Bar. The inversion of control container lets you specify the right kind of Bar for Foo in a given environment and handles constructing that Bar, and injecting it by calling the setter. This makes code less fragile because it no longer needs special-casing for testing (e.g., a stub Bar) or anything else. The mapping of a particular Bar to Foo becomes part of a configuration file that also captures all of Foo’s other dependencies.

Spring also provides more special-purpose glue; the Spring Framework page writes:

Spring provides the ultimate programming model for modern enterprise Java applications by insulating business objects from the complexities of platform services for application component management, Web services, transactions, security, remoting, messaging, data access, aspect-oriented programming and more.

For example, your application can use the Spring Framework with straight JDBC, or with generic object-relational mappers (ORMs) like JPA through to highly specific ones like iBatis or Hibernate. You can configure your application to then talk to a variety of database back-ends, with minimal changes in the actual configuration files, and write minimal code related to setting up database connections and processing error cases. Spring provides wrappers and translators to help unify service-specific method names, such as the one that causes an ORM system to generate database tables, and exceptions into more generic expressions of those concepts. This means you might be able to switch between JPA providers, for example, without changing too much configuration. However, you still have to configure your JPA provider correctly.

In line with allowing you flexibility from the infrastructure, Spring also provides flexibility of mechanism, so that the code and configuration that you write to integrate with Spring’s services are at a level that you are comfortable with. You can configure the IoC with XML or with Java; you can use annotations or you can use explicit configuration. To specify which of your business methods should be in a database transaction, you can annotate with @Transactional in your source code, or you can use an aspect-oriented programming filter to tag the relevant methods in an external configuration file.

All of this glue and the flexibility of mechanism contribute to making Spring hard to understand; however, their presence emphasizes the idea that Spring wants to get out of your way so that you can focus on application development. Other SpringSource tools such as Roo and Insight work similarly: they simplify development and debugging (respectively) without requiring that you make extensive changes to your source, and respecting current best-practices.

I’ve left out various components of Spring to focus on the core philosophy of SpringSource, but this background should help you make sense of resources like the Wikipedia article on Spring, the Spring Framework documentation, and books like Spring in Action. If you’re running into problems, however, the best place to get concrete questions about Spring answered is Stack Overflow.

Examining Your Personal Programming Style

When I was growing up, we would listen to classical music stations in the car and try to figure out the composer and sometimes even the performer. Both musicians and composers often have their own distinctive style: you can hear the mathematical precision of Gould, or the clarity of Horowitz, whether they are interpreting Bach or Mozart. My last post started me thinking about a musician or composer’s style and drawing a parallel in the context of computer programming.

When thinking about music, one’s style is a matter of personal expression, but if you say “coding style” to a programmer (or really, to Google), you’ll find rules about whitespace, variable naming, plus some proverbs about how to write maintainable code (e.g., “avoid global variables”). Overall, I don’t think these are particularly relevant to the art of programming.

For example, formatting and naming conventions are important in a codebase only in that a properly followed convention becomes invisible: just like your nose becomes acclimated to a smell, your brain quickly learns to recognize a formatting convention and ignore it. Having a convention (any convention!) allows you to focus on what the source code is really doing. Following a convention is good for everyone reading your code, even you. Automate your coding conventions and forget about it. (In the extreme, check out what the Go Language formatter can do.)

Similarly, coding style proverbs, like “write tests before code” or “keep code in a function at one level of abstraction”, are like any other proverb: these statements capture an element of experience from programmers past, but are often blindly followed by people new to the practice. It takes significant time before a programmer can truly internalize the reasons for and nuance behind any proverb. (Incidentally, if you are interested in studying proverbs, I highly recommend you examine the game of Go.)

What I am interested in exploring is personal expression and style in programming, outside of language/library/tools, proverbs or code formatting. Having a personal style is not a concept that we as computer programmers are generally exposed to. School focuses almost exclusively on the technical, ignoring both the practice (i.e., the stuff of proverbs) and the art (the subject of this post). Indeed, I am only beginning to be able to express what my personal style might be.

So, how do you express yourself in code? To begin exploring our artistic programming style, let’s continue to draw an analogy from the arts—instead of music, let’s look at the process of establishing a personal photographic style. The author talks about the importance of the choices—the choice of equipment (camera), subject matter, the approach to making a picture. You come upon a way of doing things that you believe is right, that supports your personal values. When looking at myself:

  • Equipment: Linux/Vim/Mutt/Xmonad/Git.
  • Subject matter: I care strongly about the process with which you build programs, and so I wind up working a lot on tools and scripting, but I’m also interested in distributed systems problems.
  • Approach: I like (re)using what is present; I strive for consistency, simplicity, elegance. I am somewhat inclined towards using functional constructs in imperative languages (though I never did like OCaml’s syntax). I always look at a diff of my code to ensure it is minimal before committing it, and I like to write verbose log messages.
  • Examples: Some things I’ve worked on that you can look at include of course Chord, and some contributions to Gina Trapani’s todo.txt tool.

I’m not entirely happy with this “approach” because it feels like mostly a list of platitudes. But some of my difficulty, I think, is in not having thought about this specifically as I look at other people’s code, and in not even having the words that I can use to compare and contrast my approach with that of others. So, I’d like this post to be a start for each of us to explore our own style.

Spend a few minutes thinking about what you value in your own code, and how you define yourself as an artistic programmer and write about it in a comment, perhaps using the template I’ve set for myself above. I hope we’ll each learn something!

Music Education Versus Computer Science Education

My mother recently forwarded me this interview of the pianist Glenn Gould:

I encourage you to watch this (and all 6 parts), even if you know nothing about classical music.

What stood out to me in viewing this series of videos was the fluidity with which Gould is able to discuss a piece of music, in its historical context, and to simply jump in to play a phrase from a different piece to call out a point for discussion. This demonstrates an incredible mastery of the subject matter, unifying history and context, theory, and practical implementation.

From my experience at the Manhattan School of Music, musical training seeks precisely to bring this unification into its students. As you study, you learn to play a variety of pieces or styles, from different time periods. You are taught some history, to be able to understand the evolution of styles; you are taught the underlying theory, to be able to discuss this evolution using precise terms; and then you practice the mechanics needed to actually play the pieces.

How does this compare to computer programming? Computer “science”, as you may know, has a fair amount of artistry to it.

We are simply not trained to have these kinds of discussions. How many people do you know who, during a code or design discussion, might say, “Oh, this is very similar to System X, in contrast to how System Y did things,” and then pull up the source code (or architecture diagram) for System X and Y and compare them with the relevant pieces under discussion?

At MIT, the classes are (were?) organized around ideas and then around technical implementation (leading, hopefully, to understanding). Little emphasis is placed on how to be a programmer, such as prototyping, testing, revision control, or code-review; that is, these things rarely factor significantly into your grade. Even less emphasis is given to the ability to discuss ideas in context, even at the graduate level. Undergraduate classes at most schools seem to focus on learning best practices for a particular programming language. As graduate students, only the extremely motivated would explore beyond the papers presented in the course syllabus, or the immediate related work for a given project; tracking down the source code of other systems is almost never done (perhaps simply because it is not often available).

In the professional world, no one teaches you how to do code or design reviews at this level either. Just like in school, professional programmers are constantly subject to deadlines which override just about any extra-curricular work. Reviews are often focused on mechanics or on vague, unsubstantiated worries. Again, extreme personal motivation is required to move beyond this.

How can we improve this situation? Is there room at the undergraduate level for more capstone projects that unify the theory, the history, and the mechanics with the craft of programming? What about at the graduate level? How about in a professional environment? What has been your experience?

I’d like to find programmers who work this way, who are excited and passionate about the craft of programming. Are you one? Get in touch.