Tuesday, May 29, 2007

The Shape Of Your Mind

In several cultures throughout history, such as the Karen Paduang of Southeast Asia, the ancient Han of China and the ancient native tribes of the Paracas region of Peru, various body modification practices were quite common. In fact, among the members of these cultures unusual body shapes were (and in some cases still are) considered very beautiful. Parents went to great lengths to achieve these shapes, using mechanical devices to mold an infant's growth into a particular shape. For example, among the Karen Paduang this procedure resulted in women with very, very long necks. In the case of the Han it was women with very tiny, actually unusable feet. For the natives of the Paracas region of Peru it was large cone-shaped skulls. Although grotesque by Western standards, the individual subjected to these procedures was considered significantly more beautiful than a person shaped along more natural lines. However, if you removed these people from their culture and time and cast them into many modern societies, they would be considered very, very strange at the least and grotesque at worst.

Fascinating, you say, but what does this have to do with computer science and, more specifically, languages? In many cases, the communities that form around languages are very similar to the insular communities that created these practices. To take this analogy one step further, in many ways programming languages act quite a lot like the devices used to shape the skulls of infants in Paracas, the feet of Han women or the necks of Karen Paduang women. In the case of these languages, instead of shaping the skull they tend to shape the way we think about problems, the way we form ideas and the way those ideas are applied to a particular problem. For example, if you have only ever written code in early versions of Fortran (Fortran 77 and before) then you probably don't even know recursion exists. Likewise, if you have only ever coded in Haskell you probably know very little about imperative-style loops.

If I might take a bit of my own history as an example, very early in my forays into coding I came across a problem requiring a search through a directory structure. I quickly came up with an imperative, loop-based solution that did the job. Unfortunately, the code was ugly and (if I may borrow a term from the refactoring community) it didn't quite smell right. I didn't know what the right solution was, but I knew I didn't have it. At that time, the public internet was a newfangled thing, but I had already found it to be useful for gathering information. So I searched for solutions that others had found to similar problems. Eventually, I came across a piece of code that used recursion to walk the tree instead of imperative looping. It took me a little while to get my mind around this new concept, having never been exposed to non-imperative code. However, once I did, I realized that the recursive solution was a much cleaner and more natural solution for this problem. Recursion isn't always a more natural solution than imperative iteration, but in this case it was. That made me think about this type of problem in a completely different light. It added another tool to my toolbox. If I had been exposed to functional languages before this point, it would have saved me a great deal of time and trouble.
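
To make the contrast concrete, here is a minimal sketch in Python of the two styles. It is not the code from back then (that is long gone); the function names are made up for illustration, and it assumes the directory tree contains no symlink cycles.

```python
import os


def find_iterative(root, name):
    """Loop-based search: keep an explicit stack of directories to visit."""
    matches = []
    pending = [root]
    while pending:
        current = pending.pop()
        for entry in sorted(os.listdir(current)):
            path = os.path.join(current, entry)
            if os.path.isdir(path):
                pending.append(path)   # remember it for a later pass
            elif entry == name:
                matches.append(path)
    return matches


def find_recursive(root, name):
    """Recursive search: a directory is searched by searching its children."""
    matches = []
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            matches.extend(find_recursive(path, name))
        elif entry == name:
            matches.append(path)
    return matches
```

Both functions find the same files (possibly in a different order). The recursive version has no bookkeeping for "directories still to visit" because the call stack does that work, which is what makes it read like the definition of the problem.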

This is a minor example, but it does help illustrate a point: what a language allows and does not allow affects how you think about problems. This is an extremely important realization because it means that, with certain caveats, the more programming languages you know, the more insight you may have into the solution to a particular coding problem. So, if you accept that languages tend to mold your way of thinking about problems, then you can easily see how languages can be compared to the body modification devices we spoke of earlier. If programming languages can be compared to these devices then we, the users of programming languages, can be compared to the subjects that undergo modification.

Granted, the comparison is not exact. As engineers, we don't start out with a nice round head. We have to work to achieve it using the same tools that provided the initial distortion. To elaborate, we all start out learning a single language, and that language affects the shape of our 'mind', so to speak. This initial language pushes our mind out in one direction, perhaps upward. So after we learn a single language most of us are walking around with a big cone-shaped mind.

Unfortunately, many of us never go on to learn any other languages. We keep our big cone-shaped mind for the entirety of our career. That may not be a bad thing. If you are solidly embedded in a particular 'language culture' then cone-shaped minds are probably considered quite beautiful. In fact, you may be considered some type of elder because of the cone-iness of your mind. We tend to call these people Gurus, and they are deserving of some respect.

These cone-shaped minds are probably not considered all that beautiful outside of their specific 'language culture'. C gurus aren't going to be very useful in the Scheme community and Scheme gurus aren't going to be very useful to the C community. That's bad because both languages have ideas and features that are generally useful to understand. Those cone minds that keep to a single language throughout their entire career never realize their full potential. That's unfortunate because a programmer with a well-shaped mind is generally more efficient and better able to find the most elegant solution to a problem. In fact, he ceases to be a programmer and becomes an Engineer. If he is diligent and studies hard he may even become a Good Engineer.

Now, about this time you are probably thinking to yourself, 'I don't really like the idea of walking around with a big cone-shaped mind.' If that's the case, great! Fortunately, unlike the natives of Paracas, you can do something about the way your algorithmic 'mind' is shaped. How do you go about reshaping your mind? Well, it's not a simple process. You basically need to force your mind into a new shape using the devices that warped your mind in the first place. You must learn more, and distinctly different, languages. Each language forces your mind to grow in a different direction. Learn enough languages and your mind will have a nice round shape.

So how many languages do you have to learn and what languages are the best? There is no fixed number. I usually suggest five as a minimum number and recommend ten or fifteen. That may sound like a lot, but after the first couple of languages picking up new ones starts becoming much easier. In that regard it's a little like picking up a new spoken language. For example, if you know Spanish then Portuguese isn't all that hard. If you know Spanish and Portuguese, then Italian is pretty simple. If you know Spanish, Portuguese and Italian, then picking up French is a snap. This goes on and on. In the case of programming languages there are a small number of additional rules you need to apply to get the most out of this process, but the overall process is the same. The additional rules are listed below.

1. Each language must come from a different family of languages. See the 'History of Programming Languages' information.

2. All three major paradigms (Procedural, Object Oriented, and Functional) must be covered.

3. At least two minor paradigms (Concurrent and Logic/Declarative) must be covered.

4. Both static typing and dynamic typing need to be represented.

If you have never heard of functional programming and don't have a clue what procedural means, don't worry. I will help you out a bit by providing a list of languages broken down by family and paradigm at the end of this little missive. As you learn new languages you will soon have a good idea of the different families of languages and the paradigms they represent. Before you know it, you will know quite a few different languages, you'll be able to think about problems from many different angles and your mind will have a nice round shape. You will be able to think clearly about a programming problem in any number of ways instead of the small number of ways your previous cone-shaped mind allowed.

Dave Thomas of Pragmatic Programmers (not Wendy's) fame came up with the idea of learning a language each year. They started a group to accomplish this back in 2002. Unfortunately it seems to have started and stopped all in that same year.


The Languages


A language breakdown by family is available in the 'History of Programming Languages' information.

As for the breakdown by type, I am not going to try to do this for every language available. Instead, I am just going to give you a list of ten or fifteen programming languages broken down according to the rules I provided previously. This should give you enough of a group to pick five that interest you.

Descriptions are arranged as follows ([paradigms], Typing, Family). If family doesn't exist assume the language is in its own family.

Erlang ([Functional, Concurrent, Distributed], Dynamic Typing)
Forth or Postscript (Dynamic Typing)
Mercury ([Logic, Declarative], Static Typing)
Prolog ([Logic, Declarative], Dynamic Typing)
Mozart-Oz ([Functional, Procedural, Object Oriented, Logic, Distributed, Concurrent], Dynamic Typing)
Lisp ([Functional, Procedural, Object Oriented, Logic], Dynamic Typing, Lisp)
Scheme ([Functional, Object Oriented, Logic], Dynamic Typing, Lisp)
Ada ([Procedural, Object Oriented, Concurrent], Static Typing, Pascal)
Python ([Procedural, Object Oriented, Functional], Dynamic Typing)
Haskell ([Functional, Lazy], Static Typing)
Lua ([Procedural, Object Oriented], Dynamic Typing)
Ruby ([Object Oriented], Dynamic Typing, Smalltalk)
Smalltalk ([Object Oriented], Dynamic Typing, Smalltalk)
SML ([Functional], Static Typing, SML)
Ocaml ([Functional], Static Typing, SML)
Clean ([Functional], Static Typing)
D ([Procedural, Object Oriented], Static Typing, Algol)
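
If you want to check a candidate selection against the four rules given earlier, the check is mechanical. The sketch below is one hypothetical way to do it in Python; the type and function names are invented for illustration. It reads rule 3 as requiring both Concurrent and Logic/Declarative coverage, which is my interpretation of the rule rather than something spelled out above. The sample data is copied from entries in this list.

```python
from typing import NamedTuple, Optional


class Language(NamedTuple):
    name: str
    paradigms: frozenset
    typing: str              # "static" or "dynamic"
    family: Optional[str]    # None means the language is in its own family


MAJOR = {"Procedural", "Object Oriented", "Functional"}


def check_selection(languages):
    """Return a list of rule violations; an empty list means all rules hold."""
    problems = []

    # Rule 1: every language comes from a different family.
    families = [lang.family or lang.name for lang in languages]
    if len(set(families)) != len(families):
        problems.append("rule 1: two languages share a family")

    covered = set().union(*(lang.paradigms for lang in languages))

    # Rule 2: all three major paradigms are covered.
    if not MAJOR <= covered:
        problems.append("rule 2: missing " + ", ".join(sorted(MAJOR - covered)))

    # Rule 3 (as interpreted here): Concurrent and Logic/Declarative are covered.
    if "Concurrent" not in covered or not ({"Logic", "Declarative"} & covered):
        problems.append("rule 3: minor paradigms not covered")

    # Rule 4: both static and dynamic typing are represented.
    if not {"static", "dynamic"} <= {lang.typing for lang in languages}:
        problems.append("rule 4: need both static and dynamic typing")

    return problems


selection = [
    Language("Erlang", frozenset({"Functional", "Concurrent", "Distributed"}), "dynamic", None),
    Language("Prolog", frozenset({"Logic", "Declarative"}), "dynamic", None),
    Language("Haskell", frozenset({"Functional", "Lazy"}), "static", None),
    Language("Smalltalk", frozenset({"Object Oriented"}), "dynamic", "Smalltalk"),
    Language("D", frozenset({"Procedural", "Object Oriented"}), "static", "Algol"),
]
```

Calling check_selection(selection) on these five returns an empty list. Swapping Smalltalk in alongside Ruby, on the other hand, would trip rule 1 because both are listed in the Smalltalk family.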

I didn't include the common languages (C, C++, Perl, Java) because there is a good chance you already know them. They also don't count for the purposes of this exercise (C, C++ and Java are all part of the Algol family and would only count once anyway). Feel free to choose other languages that you may be aware of and find interesting. This list is only a 'Getting Started' list.

I strongly suggest that you learn a Lisp dialect and a Forth. These two languages are very good at shaping your mind, and they tend to force the shape of your mind in opposite directions. It's a somewhat painful process but well worth the quick results. At the very least make sure that one of these languages is included on your list.

Footnotes

Thanks to a comment by Vince I have found that I am not the only one thinking along these lines, not that I actually thought I was. In the linguistics community there seems to be a hypothesis called the Sapir–Whorf hypothesis that describes something similar. Also, Kenneth Iverson gave his Turing Award lecture on the same topic. It was called "Notation as a Tool of Thought". Unfortunately, I can't seem to find a good link for it right now.

Friday, May 18, 2007

Off Line Development

After numerous requests I have finally implemented off-line development. You no longer have to be connected to build your projects. In the past the dependency analysis task ran on every build, and if it thought there was a chance that the dependencies had changed it connected to the repository to check them. This no longer happens. There is now a check_depends task that first makes sure that dependencies have been resolved at some time in the past. It then checks whether the dependencies need to be updated. If they do, it asks the user whether to update them (by connecting to the server). If the answer is affirmative the update occurs; if not, the build continues with the existing dependencies. The user may resolve dependencies at any time by running the depends task directly. This approach gives users much, much more control over when and how dependencies are resolved. It also allows the user to control when and how the build system connects to the repository. I hope I have done this in such a way as to not add any additional burden to the user.
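
For those who prefer to read the flow as code, here is a rough Python sketch of the decision logic described above. It is not the actual task implementation: every name and parameter besides check_depends is hypothetical. It also assumes that a project whose dependencies have never been resolved is an error, a case the description above does not cover.

```python
class DependenciesNeverResolved(Exception):
    """Assumed behaviour: dependencies have never been resolved for this project."""


def check_depends(resolved_before, needs_update, ask_user, update):
    """Decide what to do about dependencies before a build.

    resolved_before -- True if dependencies were resolved at some point in the past
    needs_update    -- True if the local dependency information looks stale
    ask_user        -- callable taking a question and returning True or False
    update          -- callable that connects to the repository and updates
    """
    if not resolved_before:
        # Assumption for this sketch: stop and point the user at the depends task.
        raise DependenciesNeverResolved("run the depends task first")
    if not needs_update:
        return "up-to-date"          # nothing to do, no network access
    if ask_user("Dependencies may have changed. Connect and update?"):
        update()                     # the only place the network is touched
        return "updated"
    return "kept-existing"           # user declined; build with what we have
```

The point of the shape is that the repository is only contacted inside update, and only after the user has said yes.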

Wednesday, May 2, 2007

Distributed Bug Tracking

I came across a project called DisTract last week. DisTract is basically a distributed bug tracking system. Specifically, it's a file-based bug tracking system whose directory structure sits inside a repository managed by a distributed version control system. It uses this distributed version control system to manage distribution. This system was preceded by another, somewhat bit-rotted, system called Bugs Everywhere.

I really like the idea of distributed bug tracking. It fits in really well with distributed version control and distributed development overall. You can create new branches for your bug system along with its source, then merge those branches back into mainline, all the while keeping history. That's a very, very powerful thing. However, this approach has a fundamental, maybe intractable, problem when it comes to merging. In a version control system merging is relatively simple. File identifiers (basically the path in the workspace) are fixed, so there is no ambiguity if the same file is changed on two different boxes. It's just a matter of merging the contents of that file. Distributed bug tracking still has the problem of merging changed bugs. However, it has an additional problem: there is no way to relate one bug created by one person to a different bug created by another person (that is, to recognize that both describe the same problem). This is a problem that every bug tracking system has to some extent. In a distributed bug tracking system it's even worse, because the bugs created by another person won't be visible until you receive a push from, or do a pull from, that other person's repository. There may be a way to solve the problem using some of the modern document similarity algorithms. However, considering the small amount of text usually supplied with a bug report, this is probably unworkable. I don't have a solution to this yet, but I may play around with some ideas and use some information from some of the larger public repositories to run some tests.
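
To show why short reports are a problem, here is a small Python sketch using Jaccard similarity over word sets. I picked this measure only because it is simple to show; it is not something DisTract or Bugs Everywhere is known to use, and the bug titles are made up.

```python
import re


def tokens(text):
    """Lower-case the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by total distinct words; 0.0 if both texts are empty."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


report_a = "Crash when saving file"
report_b = "Application crashes on save"               # same bug, different words
report_c = "Crash when saving file to network drive"   # possibly a different bug

print(jaccard(report_a, report_b))   # 0.0: no words in common at all
print(jaccard(report_a, report_c))   # 4 shared words out of 7 distinct
```

Two people describing the same crash share no words, while a report about a more specific situation scores well above them. Stemming or a smarter model would help somewhat, but with only a handful of words per report there is very little signal to work with.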