I know that as soon as I mentioned the F word (Forth that is ;) a general groan went up from those of you that have some experience with it. Don't worry, groaners, I feel your pain. However, take a couple of aspirin, sit back and listen to me for a bit and you might gain a different perspective on the language. For those of you not familiar with the language, the reason for the groaning will become obvious as our story progresses.
I had given some thought to going about and gathering up all of the 'quotes' that people use to promulgate Lisp and then justifying them for Forth. After thinking about it, I decided to take a slightly different tack. I went this other route mostly because Bill Clementson has already done most of the gathering work for me and also because I think that you're smart enough to draw correlations without me beating you over the head with them. So take a couple of minutes, pop over to Bill's site and read the quotes. Don't worry, I'll wait.
What Is Forth
Back so soon? Good. Forth is a language that takes a wildly different tack than any other language out there. It was initially developed by Charles Moore at the US National Radio Astronomy Observatory in the early 1970s. Back in the early sixties, while working for the Smithsonian Astrophysical Observatory, Mr. Moore found a need for a little bit more dynamism than he had had in the past. To that end he put together a little interpreter intended to control a punch reader. By using this interpreter he could compose different equations for tracking satellites without recompiling the entire system. This was no small feat for the time. Mr. Moore, like any good hacker, took his interpreter with him when he left that job. He carried it around for the next five or ten years, constantly tweaking it. By 1968 it was finished enough to build up a little game called SpaceWar as well as a pretty nifty Chess system. This version of the language was the first to be called 'Forth'.
This early evolution and constant tweaking produced a fairly interesting language. A Forth program is simply a series of tokens, called words, and two stacks. The stacks are the main data stack and the return stack. For right now we will ignore the return stack and talk about the data stack. To give you comparative examples, let's look at an operation in Lisp and Forth. The Lisp version is perfectly recognizable to just about anyone. It just adds two numbers together.
> (+ 1 2)
3
In Forth, it goes as follows.
> 1 2 +
3
A more complex example
> (/ (+ 27 18) (* 3 3))
5
In Forth, it goes as follows.
> 27 18 + 3 3 * /
5
If you have been exposed to some of the old HP calculators or even Postscript you may be able to tell what's going on here. Each number, as it appears, gets pushed onto the explicit data stack. The '+' word (and a word it is) takes exactly two numbers off the stack and replaces them with the value resulting from their addition. So, in stack terms, what we get is this.
> 1 2 + ( results in 3 on the stack)
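To make the stack discipline concrete, here is a minimal sketch of how a token stream like the ones above might be evaluated. It is written in Python rather than Forth, purely to show the mechanism from the outside. The name eval_rpn and the operator table are made up for this illustration; a real Forth looks words up in its dictionary, and this toy only understands integers and four arithmetic words.

```python
def eval_rpn(source):
    """Evaluate a whitespace-separated token stream using an explicit stack."""
    stack = []  # the data stack
    # Hypothetical operator table standing in for Forth's dictionary.
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a // b,  # integer division (fine for the positive values used here)
    }
    for token in source.split():
        if token in ops:
            b = stack.pop()  # top of stack is the right-hand operand
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            stack.append(int(token))  # numbers are simply pushed
    return stack


print(eval_rpn("1 2 +"))            # [3]
print(eval_rpn("27 18 + 3 3 * /"))  # [5]
```

Notice that there is no parsing step to speak of: the order of the tokens is the order of evaluation, which is why the Forth version needs no parentheses.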
This explicit stack is the way all data is handled in Forth. Everything. You may remember that I said everything is a token. That's absolutely correct. Forth reads each token and looks it up in a special dictionary that is used to store references to words. We create entries in that dictionary as follows.
: square
DUP * ;
The colon is a defining word that reads ahead one word in the token stream and uses that word to create a new entry in the system dictionary.
It then compiles the tokens that follow into references to words until it reaches a semi-colon ';', an immediate word (read macro) that closes the definition. It is as simple as that. 'Hold on, hold on!' you say. 'You said everything was tokens in a Forth system; aren't these special?' Nope, I didn't lead you astray. The colon and semi-colon aren't special at all and can be overwritten and changed at any point you like. They are examples of Forth's powerful macro facility, in which you can create syntax for a language that essentially has none. This is a powerful concept that Lisp shares. As Paul Graham says, you can use this to build your language up toward your program instead of the other way around. Of course, there is a downside. Any time you give the programmer extreme flexibility someone is going to abuse it. For example, take a look at this code from a Forth library in the wild.
: rework-% ( add - ) { url } base @ >r hex
0 url $@len 0 ?DO
url $@ drop I + c@ dup '% = IF
drop 0. url $@ I 1+ /string
2 min dup >r >number r> swap - >r 2drop
ELSE 0 >r THEN over url $@ drop + c! 1+
r> 1+ +LOOP url $!len
r> base ! ;
Granted, I have taken away the context, but you begin to understand the impetus behind those groans you heard a short while ago. Forth is one of the most flexible languages available, but the very flexibility that makes it interesting also makes it dangerous. It takes diligence on the part of the programmer to write really clean and maintainable code. However, if the coder does take that care he can write eminently readable and maintainable code. Take a look at the following examples that appeared in Starting Forth by Leo Brodie. You can get a good feel for what's going on even without knowing the context.
: WASHER WASH SPIN RINSE SPIN ;
: RINSE FAUCETS OPEN TILL-FULL FAUCETS CLOSE ;
So doing Forth well or not is entirely in the hands of the coder, which is why the Forth experience varies so much. That's why Forth is a Language for Smart People.
Forth Basics
I have the very good luck of owning a copy of the book Thinking Forth. This is one of those books that you should add to your library even if you never intend to write a single line of Forth code. It has a huge amount of great insight into coding and system design in general, though you probably don't want to use it as your first introduction to Forth. In any case, I am going to be borrowing some topics and examples from that book to see us on our way.
: breakfast
hurried? if cereal else eggs then clean ;
The above example illustrates the fundamental building block of every Forth system, the word. This specific example is what's called a colon definition. It's how you define a word in Forth. Basically the colon ':' is a defining word that takes the very next word from the input stream and creates an entry in the dictionary based on it. It then switches the system into compile mode, which lasts until the semi-colon ';' tells it to stop. For each word it encounters along the way it looks up the position in the dictionary and puts that location in the code stream. Of course, in this case we have the immediate word 'if' (think macros operating on the word stream), which takes control for a little while before handing it back. So what specifically might be going on here? In this case, hurried? probably looks at some parameter in the system to decide if haste is in order and puts a boolean on the stack. 'If' is an immediate word that compiles to a jump based on the 'else' and 'then' words. You can probably figure out what it does from here.
This admittedly simplistic example does illustrate a few of the special characteristics of Forth: most notably the stack, words, and the use of immediate words.
Factoring
Long before anyone had ever heard of Agile Methodologies or Extreme Programming, the idea of factoring was already hard at work in the Forth community. The idea of constantly looking at code, breaking up functions and generally simplifying the system is a foundational concept in Forth. In fact it's taken to something of an extreme. Words are a huge part of Forth systems and the factoring process. For example, consider the following word, which finds the sum of squares of two integers:
: sum-of-squares dup * swap dup * + ;
The stack inputs to the word at run-time are two integers. The stack output is a single integer. By the process of factoring, the example would be rewritten in Forth using a new definition called 'squared' that shares the common work of duplicating and multiplying a number. The first version was overly complex and illustrated the notorious line-noise aspect of Forth. Fortunately, by factoring we can make it much more readable and understandable.
: squared dup * ;
: sum-of-squares squared swap squared + ;
Good Forth programmers strive to write programs containing very short (often one-line), well-named word definitions and reused factored code segments. The ability to pick just the right name for a word is a prized talent. Factoring is so important that it is common for a Forth program to have more subroutine calls than stack operations. Writing a Forth program is equivalent to extending the language to include all functions needed to implement an application. Therefore, programming in Forth may be thought of as creating a Domain Specific Language. As Lispers well know, this paradigm, when coupled with a very quick edit/compile/test cycle, seems to significantly increase productivity.
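If you would like to convince yourself that the factored and unfactored definitions of sum-of-squares really do the same thing, here is a rough sketch of a toy interpreter with a dictionary and colon definitions. It is Python, not Forth, and the class TinyForth and its method names are invented for this example. It assumes every definition fits in a single string and knows only dup, swap, + and *.

```python
class TinyForth:
    """A toy interpreter: one data stack plus a dictionary of words."""

    def __init__(self):
        self.stack = []
        # The dictionary maps a word's name to something we can call.
        self.words = {
            "dup": self._dup,
            "swap": self._swap,
            "+": lambda: self._binary(lambda a, b: a + b),
            "*": lambda: self._binary(lambda a, b: a * b),
        }

    def _dup(self):
        self.stack.append(self.stack[-1])

    def _swap(self):
        self.stack[-1], self.stack[-2] = self.stack[-2], self.stack[-1]

    def _binary(self, fn):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(fn(a, b))

    def _lookup(self, token):
        """Return the word for a token; unknown tokens are treated as integers."""
        if token in self.words:
            return self.words[token]
        value = int(token)
        return lambda: self.stack.append(value)

    def run(self, source):
        tokens = source.split()
        i = 0
        while i < len(tokens):
            if tokens[i] == ":":
                # Colon definition: the next token is the name, the rest up
                # to ';' is the body, resolved to references right now.
                end = tokens.index(";", i)
                name = tokens[i + 1]
                body = [self._lookup(t) for t in tokens[i + 2:end]]

                def word(body=body):
                    for w in body:
                        w()

                self.words[name] = word
                i = end + 1
            else:
                self._lookup(tokens[i])()
                i += 1
        return list(self.stack)


unfactored = TinyForth()
unfactored.run(": sum-of-squares dup * swap dup * + ;")
print(unfactored.run("3 4 sum-of-squares"))  # [25]

factored = TinyForth()
factored.run(": squared dup * ;")
factored.run(": sum-of-squares squared swap squared + ;")
print(factored.run("3 4 sum-of-squares"))    # [25]
```

Once 'squared' is in the dictionary it is indistinguishable from a built-in word, which is the whole point of the "extending the language" view of factoring.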
Immediate Words (The Macros of Forth)
In comparison with more mainstream languages, Forth's compiler is completely backwards. Most traditional compilers are huge programs designed to translate any foreseeable, legal combination of available operators into machine language. In Forth, however, most of the work of compilation is done by a single definition, only a few lines long. As I have said before, special structures like conditionals and loops are not compiled by the compiler but by the words being compiled (IF, DO, etc.). You may recognize this sort of thing from Lisp and its macro system. Defining new, specialized compilers is as easy as defining any other word. As you know, when you've got an extensible compiler, you've got a very powerful language!
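Here is one way the "words compile themselves" idea could look, sketched in Python rather than Forth. Everything in it (the class MiniForth, the instruction names "call", "branch" and "0branch", the fixups list) is an assumption made for illustration and not how any particular Forth is implemented. The thing to watch is that define() knows nothing about conditionals: 'if', 'else' and 'then' are ordinary dictionary entries flagged as immediate, so they run at compile time and lay down the jumps themselves.

```python
class MiniForth:
    def __init__(self):
        self.stack = []   # data stack
        self.words = {}   # name -> Python callable or compiled instruction list
        # Immediate words run while a definition is being compiled.
        self.immediate = {"if": self._if, "else": self._else, "then": self._then}

    # Each immediate word receives the code being built and a compile-time
    # list of jump instructions that still need a target.
    def _if(self, code, fixups):
        fixups.append(len(code))
        code.append(["0branch", None])   # jump if the flag is false

    def _else(self, code, fixups):
        if_slot = fixups.pop()
        fixups.append(len(code))
        code.append(["branch", None])    # skip the else-part after the if-part
        code[if_slot][1] = len(code)     # a false flag lands here

    def _then(self, code, fixups):
        code[fixups.pop()][1] = len(code)

    def define(self, source):
        """Compile a definition of the form ': name body... ;'."""
        tokens = source.split()
        assert tokens[0] == ":" and tokens[-1] == ";"
        name, body = tokens[1], tokens[2:-1]
        code, fixups = [], []
        for token in body:
            if token in self.immediate:
                self.immediate[token](code, fixups)       # executed now
            else:
                code.append(["call", self.words[token]])  # compiled for later
        self.words[name] = code

    def execute(self, word):
        if callable(word):
            word()
            return
        pc = 0
        while pc < len(word):
            op, arg = word[pc]
            pc += 1
            if op == "call":
                self.execute(arg)
            elif op == "branch":
                pc = arg
            elif op == "0branch" and not self.stack.pop():
                pc = arg

    def run(self, name):
        self.execute(self.words[name])


log = []
hurry = {"value": True}
forth = MiniForth()
# Hypothetical primitives standing in for the words of the breakfast example.
forth.words["hurried?"] = lambda: forth.stack.append(hurry["value"])
for name in ("cereal", "eggs", "clean"):
    forth.words[name] = lambda name=name: log.append(name)

forth.define(": breakfast hurried? if cereal else eggs then clean ;")

forth.run("breakfast")
print(log)  # ['cereal', 'clean']

hurry["value"] = False
log.clear()
forth.run("breakfast")
print(log)  # ['eggs', 'clean']
```

Adding a new control structure in this sketch means adding one more entry to the immediate table; define() itself never changes. That is roughly the sense in which the compiler is extensible.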
Forths I Have Known
There are a couple of new Forths out there that are breaking with the Forth tradition somewhat and innovating in this area. One that I especially like is Factor. It's a very minimal but consistent Forth with some modern ideas like garbage collection, a CLOS-like object system, straightforward higher-order functions, etc. This Forth kind of bridges the gap between Lisp and Forth. A good traditional Forth is Gforth. It's reasonably fast, reasonably stable and implements the full ANSI spec. You really can't go wrong with Gforth if you want a good Forth that will stand the test of time. If you just want an easy, interesting Forth that will run just about anywhere I suggest you take a look at Ficl. It's an ANSI-compliant Forth written in C as a threaded interpreter. It is stupid simple and about as robust as you can get. This is a really nice Forth to start tinkering with. Now if you want a Forth that is really stretching the bounds of what is possible and desirable, take a look at Chuck Moore's latest Forth, ColorForth. It's the only language that I know of where color (yes, color) actually has semantic meaning. The custom 25-core chip that ColorForth is designed to run on shows that Mr. Moore is really pushing the boundaries. It's worthwhile taking a bit of time exploring ColorForth just because of its divergence from the mainstream.
So Why the Comparisons to Lisp?
I think that many more people have had some exposure to Lisp than Forth. Because Lisp and Forth share so much 'meta' philosophy, I thought it would be beneficial to draw comparisons between Lisp and Forth. You will have to be the judge of whether this was a successful approach or not.

18 comments:
Forth + Lisp + some Erlang premises
you should have a look at Factor (http://planet.factorcode.org/)
Good article.
A few corrections: the colon is not an immediate word (or macro). Technically, a macro is a word that is executed at compile-time; the colon word doesn't need to be a macro. However, it is indeed a bit special in the sense that it consumes the input string (the name of the word to be defined, which follows it). It then switches Forth to compile mode, where every word is compiled, except the macros, which are executed as I said. The semi-colon is one of them. It compiles a return instruction and switches Forth back to interpret mode.
I think the most difficult aspect of Forth with regard to the modern trend is the lack of garbage collection. Some "modern" variations on Forth like Factor or Cat feature it. However, in my opinion this is wrong: one gets a language that most people don't like anyway because of the RPN, and because of the vital need for good factoring (otherwise you get write-only code). And a typical Forth programmer doesn't like the idea of having a big-complex-resource wasting GC because many Forth programmers are working in the embedded computers field, a place where you often cannot afford a GC. However, as Samuel Falvo says, Forth is an hyperstatic environment and does provide some features in order to manage memory (the words FORGET or MARKER), and one has to learn to take advantage of them. See the video http://www.falvotech.com/blog/index.php?/archives/372-Over-the-Shoulder-1-Text-Preprocessing-in-Forth.html
Liked the article. A quick correction, though. I wrote the FICL 4 upgrade. FICL 4 is "switch threaded" to use Anton Ertl's terminology. I did some microbenchmarking before I wrote FICL 4, and I found switch threading was faster than subroutine threading. I'm assuming it's because it obviates all the function call overhead.
FICL was never purely subroutine-threaded; FICL 3.03 and before were struct-containing-pointer-to-subroutine-threaded. (Now you see why FICL 4 was 3x faster!)
larry
This comment comes from someone who fell in love with the ease of "scripting languages". I actually started with perl when i switched to Linux, then i switched to php (!) and used that for maybe 2-3 years.
But php clearly left a desire for more, the choice was between python and ruby. I picked ruby because of an interview with matz who emphasized the human aspect. And I like to think in "pure objects" which is why I settled for ruby. I am fine with python too though, I think python is a much better choice than perl (yes yes, so many C dinosaur coders still use perl)
Now I know some (old) people who love forth. That is totally fine. But coming from the ease of "scripting" languages, especially from ruby, I am totally ignorant of languages which I'd know I would not write a lot of code anyway, simply because there are too many things I would dislike (i dislike some things in ruby too, though i love it)
The only language I hate, but which I _would_ use, is C. Simply because it is so incredibly ubiquitous.
Note - I dont speak about concepts in programming languages. Concepts translate to ideas for me, and ideas are good.
I only speak about syntax AND using a language. I simply can not find a use case for a language like Forth for me.
"If you have been exposed to some of the old TI calculators [..]"
You mean HP calculators I guess.
Nice article! Just a note - Thinking Forth is available here (http://thinking-forth.sourceforge.net) under a Creative Commons license.
This is regarding your use of then.
then - at that time; immediately or soon afterward
than - used to introduce the second member of an unequal comparison; used to express choice or diversity
Forth was needed then but not now. (then => time)
I'd rather program using PHP than Forth. (than => choice)
You may also want to check out REBOL at http://www.rebol.com as it is grown from lisp, forth, and self.
Are you saying that we should learn Forth because we can write Forth-like programs in other languages?
Is this desirable? How do we know that the Forth metaphor of a data stack and an operations is useful in Java?
Jonathan Mark said...
...because we can write Forth-like programs in other languages? Is this desirable? How do we know that the Forth metaphor of a data stack and an operations is useful in Java?
It certainly won't be for _all_ programs, but you'd never be able to tell what the 'forth metaphor' might be good for if you don't even know what it is. It was probably useful for the people who wrote the JVM, since it's stack based, and for the people who wrote PS, and probably many others.
My first full-time programming job was in a Forth shop, 25 years ago, writing consumer apps for Apple ][ and IBM-PC.
: battlecry forth love not if dead drop rot and then ;
Thanks for reviving this discussion with a thoughtful article. Forth suffered from overhype and from the concerted efforts of some members of the community to prevent the development of a standard.
Consigned to the reliquary, it is still an extraordinarily useful little language - both as an example of how simple an interpreter can be, because of its refreshing transparency about memory, and because you can get stuff done really fast. It's great for wrapping a hardware layer - the Sun Open Boot ROM is a Forth dialect, for example. The FreeBSD boot loader uses Ficl.
Its strengths can be weaknesses if you apply the language to the wrong kind of problem - particularly anything that requires a lot of string handling or garbage collection. Please do not try to write a web server in Forth - it's like teaching a pig to read: it wastes your time and annoys the pig.
Ficl is available at ficl.sf.net
Use responsibly ;-)
Good article! I am going to start messing around with gforth. I was always a lisp fan, but never got to use it professionally other than in the subset of lisp implemented in AutoCAD (AutoLISP).
The only FORTH implementation I can think of in my current role would be customizing Sun's Open Boot Prom, but that could be interesting.
you should learn forth because it KICKS ASS!
i also have basically just use scripting languages professionally, but my love of learning different languages paid off recently.
i've been using a home-rolled forth on a super-simple virtual stack machine as a control language for a bot to work a complicated web-based interface. i had been casually reading about forth for a while.
the project was getting way out of hand until i decided to try forth as the basis for my control language. suddenly total code size shrunk to a fraction the original size and became more flexible and powerful.
the forth style is just a great way of approaching things, you need a new language feature, you just add it, no macros, meta-level libraries, compiling, just do it.
the program forth is executing is available at run time, you can add whatever features to the parser you like. i wanted a new quoting operator that behaves entirely differently from the one i built in, it was simple to add it, only a few lines.
next step, adding exception handling to my little forth DSL. this is a little harder, but i know it's possible, and i also know it won't require rewriting my forth DSL library, it requires nothing more than adding a few new words to the dictionary. likewise, other new control structures don't require redefining the language in the interpreter, the language is MADE to be redefined!
google for jonesforth, this was a big help to me "getting it".
If you program for the Palm, you may find Quartus Forth to be a much more productive environment than Codewarrior.
"Long before any one had ever heard of Agile Methodologies or Extreme Programing the idea of factoring was already hard at work in the Forth community."
It's called "refactoring", not factoring, and it predates the agile/xp goons by almost ten years. The Fowler book has nothing to do with Agile/Scrumm/etc, which is why it's good and useful.
@astrobe
Thanks for the correction