Experiencing Smalltalk

Posted on Feb 14, 2021

In late August 2020 I read The Dream Machine, and it was my favorite book of 2020. It is an incredible overview of a sliver of computer pioneers in the 1960s-1980s and how one man was instrumental in tying all their narratives together. One of the strands the book follows is the group at PARC (and other places) that firmly believed in the notion of interactive computing. Over the last year, I’ve repeatedly been introduced to this strand of thinking. There is Douglas Engelbart’s Augmenting Human Intellect¹ and his mother of all demos. The core notion is that computers should allow humans to think better, and that one way to do that is to allow humans to express and try out ideas better. Writing is a useful skill for forming coherent, but generally linear, streams of thought. Computing is the closest we can yet get to a non-linear exploration of a problem domain: freed from the limitations of time and space, we can recreate our thoughts in a malleable, explorable way. By applying a known process to known information, but turning a few billion cranks at a time, we can get to better solutions, faster. If we could have a computer (the physical device including the software) that allowed people to iteratively express their problems, get immediate sensory feedback (primarily visual, but nothing prevents using the other senses) and extend the computer as necessary, we could really liberate the human mind from some of its limitations. It is a very different vision from the consensus reality of vendors that deliver pre-packaged applications that all come with restrictions.

Smalltalk² is one of the outputs of this strand of thinking and has had a big impact on general purpose programming languages³. However, the focus on object orientation and test-driven development has robbed the world of what I think are its most important aspects:

programming as a dialogue with the computer, with meaningful feedback and assistance.
the environment being just as important to programming as the language, and empowering users to modify the environment to help solve their problem.
semantic simplicity as a necessity for easier comprehension in the face of the above points.

Aditya Siram has this excellent talk giving a taste of Smalltalk. If you take only one thing away from this post, it should be to go and watch that. I knew I had to try this thing out! I worked through the Pharo MOOC for a few days, but quickly lost interest in its examples. Advent of Code 2020 (AoC)⁴ seemed like a much better avenue to really try out Smalltalk. All my experiences are based on the Pharo variant, which seems to be the leading implementation right now.

While I was working my way through AoC, Mikel Evins wrote On repl-driven programming. I think he wasn’t fully able to convey the ideas of Smalltalk, because the term “REPL” is too strongly associated with Python/JavaScript/Ruby REPLs at this point. I hope this post does its bit of evangelizing to correct that. Alright, let’s jump into my experience.

❤️ What I really liked

A truly integrated development environment (IDE)

What do I mean? Well, Smalltalk really facilitates creating a program in an exploratory style, including test-driven development. As an example, for Day 5, I wrote the following:

Now, as soon as I “accept” this code (by pressing Ctrl+S), Smalltalk will ask me if I would like to define this AOCBoardingPassParser class. It will pop up a little window to let me do that. It will also detect I wrote a test case (because when I created a class for the test case, I inherited from TestCase) and the system browser and other elements of the interface will have these little gray/green/red dots next to every test case, letting me run the test and observe the results right away. Now, this is offered in modern IDEs like IntelliJ and PyCharm, but this was available 40 years ago in Smalltalk! But that is just the beginning.

I can now run this test (remember I’ve only defined the AOCBoardingPassParser class at this point) and it will encounter the missing method toSeat: on the class (think of this like a “static method” in Java/Python⁵). In a statically typed language, you would get a compilation error or have your IDE highlight it for you. In a dynamic language, you would just get an exception thrown when running the program (AttributeError in Python). Unless you explicitly ran the test in a debugger, that is all you would get, and now you have to go back to the editor, fix this, then re-run the test.

Not so in Smalltalk. First, this isn’t just an IDE in the sense of a language-aware editor. The IDE is also very aware of the runtime environment. It will break at the stack frame with the missing method, and pop up a window. You can create a method right there. This window has a full-fledged editor with code completion and all the other fanciness. I’m not aware of any other environment⁶ that does this. OK, so we quickly recover by creating the method. But I don’t know exactly what to put in there yet. Don’t worry, this is Smalltalk. We don’t need to go through this cycle of rerunning the entire test. The state of the running program is preserved up to the point where you hit the error, and you just tell the IDE “ok, I’ve fixed the immediate error, let’s try again” and it will resume execution from there! You can do this as many times as you wish, developing the code as a function of encountering errors and test failures. This very fast iteration cycle feels amazing! Imagine writing a much more complex program that was running for minutes to hours but then hit an error. In most languages you would have to fix it and start from the beginning, but in Smalltalk you would just fix it right there and proceed.
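The fix-and-continue idea can be loosely approximated in Python, which also lets you add a method to a live class. The sketch below is only an analogy under stated assumptions and not how Pharo's debugger works: Visitor, run and define_missing are hypothetical names, and define_missing stands in for the programmer typing a method into the debugger window.

```python
class Visitor:
    """Knows how to handle numbers, but nothing else yet."""

    def visit_number(self, value):
        return value * 2


repairs = []  # names of the methods that had to be added mid-run


def define_missing(cls, name):
    # Hypothetical stand-in for typing the method body into the debugger.
    repairs.append(name)
    setattr(cls, name, lambda self, value: value.upper())


def run(items, on_missing):
    """Process items in order; when a handler is missing, ask for a fix and
    carry on from the same item instead of starting over."""
    visitor = Visitor()
    done = []
    for kind, value in items:
        name = f"visit_{kind}"
        if not hasattr(visitor, name):
            on_missing(type(visitor), name)  # patch the live class
        done.append(getattr(visitor, name)(value))
    return done


items = [("number", 1), ("number", 2), ("text", "abc"), ("number", 3)]
results = run(items, define_missing)
print(results)  # [2, 4, 'ABC', 6]
print(repairs)  # ['visit_text']
```

The first two results survive the missing handler, and the run picks up at the third item once the method exists. The difference is that here the recovery had to be designed into the program up front, whereas in Smalltalk the environment offers it for any error.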

I think browser developer tools are the closest we can come to having this kind of interactivity in widespread computing environments. They don’t do that yet, but they could. But we need to acknowledge how much work had to be put into those tools vs Smalltalk. In addition, that implementation would likely need to be in C++ instead of JavaScript, which means users wouldn’t be able to extend and improve it as necessary. Which brings me to my next point.

IDE extension and visualization

Where most IDEs and language implementations are written in C/C++/Java, Smalltalk prides itself on bootstrapping from a very minimal native code core. After that nearly everything about the runtime and the IDE is written in Smalltalk. The program and the environment are not 2 separate things, they are one. When you execute your program, it is just another “thread” in the same environment that you are creating the program in. You can do bizarre things like remapping true to false or redefining what addition means and the entire system will reflect the change right away (to disastrous results!). The fact that the IDE and the program share a language means that any Smalltalk programmer can understand and extend the IDE itself. Would you like a menu with your most commonly used commands? Sure, just use the inspector/debugger to find out where the menus are defined and add an entry there. Boom!

This also offers the really powerful ability to customize the debugger for the task at hand. PetitParser2 is probably the best example of this. It is a parser-combinator library. As Aditya demonstrates in his talk, when you build up a parser, the parser provides several hooks into the Smalltalk debugger. It can show you the matches it has built up so far (including negative matches!) and you can run various inputs against the parser to understand its behavior. So you can start with a very simple parser and iteratively make it more complicated, observing the parse tree and getting the behavior you want.

Whereas a lot of AoC implementations on reddit used various regex and string operations for parsing inputs, I leaned on PetitParser for anything beyond simple whitespace-delimited numbers. It was just so easy and straightforward. Instead of having to find subtle bugs in string parsing, I could actually see a domain-specific debugging view as my parser encountered tokens.

You could similarly imagine having domain-specific drawings for your project. If I were writing a build system in Smalltalk, I would write a build graph viewer, and instead of a static graphviz image, it could be very interactive. While the build was running, it could update the graph with file modification times, or indicate whether it thinks a file needs to be rebuilt, and then I could easily debug why a particular file wasn’t being built.

The folks at Feenk are really leveraging this in their Glamorous Toolkit to fantastic effect! I highly recommend watching their videos.

Minor

I appreciated the built-in Array2D class for the various game-of-life implementations as well as Jurassic Jigsaw where it was easy to reason in terms of rows and columns, and quickly slice those out. Similarly, the Point class and special “syntax”⁷ made Day 12 really easy.
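For readers who have not seen it, the Point “syntax” is just an ordinary method on numbers. A rough Python analogue is sketched below, with the caveat that Python cannot add methods to the built-in int, so a hypothetical Number subclass is used to carry the @ operator. The Point class here is also made up for illustration and is not Pharo's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: int
    y: int

    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        return Point(self.x * factor, self.y * factor)


class Number(int):
    """An int whose @ operator builds a Point, mimicking Smalltalk's 1@1."""

    def __matmul__(self, other):
        return Point(int(self), int(other))


position = Number(0) @ 0
east = Number(1) @ 0
position = position + east * 10  # move 10 steps east
print(position)  # Point(x=10, y=0)
```

With points as values, moving around a grid becomes plain arithmetic instead of juggling separate x and y variables.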

😕 What could be improved

Of course, not everything was magical. The Smalltalk/Pharo community is very small, so this shouldn’t be considered a rant, just my disappointments.

Documentation

This was my biggest frustration. If we accept that documentation has 4 distinct types, Smalltalk does a great job with reference docs: everything can easily be inspected and browsed, so reading a method’s documentation while seeing its implementation is really easy. You can also easily inspect what an object is at the point of use, and then see what methods are available on it. However, there is still plenty of scope for improvement here, as not all non-trivial methods have docstrings. In addition, for such a powerful IDE, the lack of documentation tooltips on hover is very odd.

It really lags behind in the other categories. There is a wealth of great content in various free books and the MOOC that explains the language and getting around the IDE. However, all the books on various useful libraries are really out of date. It isn’t fair to expect someone to sit through various MOOC videos to find out what they are looking for. The use of PDFs instead of a good HTML reference is also unconventional. The libraries really lack tutorials and how-to guides. For example, how do I write a debugger extension like PetitParser’s? Where do I even start? I have no idea what the building blocks of something like GToolkit or Roassal are (Roassal2 has a book, but a lot of things changed in Roassal3). What is the conceptual framework upon which they are built? What’s the vocabulary? I’m spoiled by the generally high documentation standards of Rust and Python libraries. Since Pharo has very low usage, it is almost impossible to piece these things together from Stack Overflow etc., so explaining various concepts in the documentation and keeping it up to date is really important. It would be really valuable to have guides walking through these powerful libraries, considering various use cases and explaining how the APIs fit together. Since Smalltalk is dynamically typed, one can’t even make guesses from type signatures. GToolkit has the <gtExample> pragma, which lets you easily find and run examples, but without surrounding context to hang them on, they don’t really help.

I think there were several problems in Advent of Code where the ability to quickly write a visualization would have helped immensely. For example, writing one for Game of Life and having the state update as you step through each iteration would be really cool, but I had no idea how to start.

Similarly, if one wanted to contribute to Pharo there isn’t a public record of what the priorities of the project are, or something like RFCs/PEPs that lay out design decisions or upcoming changes. There isn’t a quick guide to setting up the development environment and how to make a simple change. There is scope for improvement here.

Non-native UI

Pharo’s GUI environment is quite alien on all platforms. It is all implemented from scratch which means there are no accelerator keys for various menus. I would sometimes experience a rendering glitch with comments in the custom text area. There is no concept of window management, and one can’t use system conventions or tools to do things like tiling or minimizing windows. When I was on a laptop, this meant constantly reaching for the mouse to move things around since I could only keep 1-2 windows visible at any time. I would really love to see either native UIs or at least something like Qt.

Lack of polish

There were just these tiny papercuts that made things annoying. For example, Pharo wouldn’t be able to push to GitHub even though I gave it the correct SSH key and passphrase. Similarly, it wasn’t clear how to reconcile state when I pulled changes from a git repo. I would always default to discarding everything in my image and reloading, but that wouldn’t have worked if I wasn’t the only one working in the repo. Sometimes the debugger would focus one frame below the frame the error happened in, and then I’d have to do a few extra clicks.

Minor

Dictionary access seems to be really slow⁸ in Pharo. Apart from that there were 2 things that felt very odd to me but are not criticisms of Smalltalk itself. First, the complete lack of static typing or type hints is jarring. Even in Python, I’ve always worked on a codebase validated by Mypy. I was really bitten by this when using collections, where at some point I was expecting a Sequence, but was actually getting a Set, leading to incorrect answers! I had no way of figuring out that was the problem without very careful debugging. Second, 1-based indexing is really hard when one has always worked with 0-based languages. That said, there were some AoC problems where 1-based indexing was actually good because it mapped directly to the problem statement, so this was net neutral.
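The Sequence-versus-Set mix-up is easy to reproduce in Python, which makes it a handy illustration of what a type checker buys you. The sketch below is a made-up example and not the actual bug I hit: total and readings are hypothetical names.

```python
from collections.abc import Sequence


def total(values: Sequence[int]) -> int:
    """Sum every value, duplicates included."""
    return sum(values)


readings = [3, 3, 4]
print(total(readings))       # 10
print(total(set(readings)))  # 7: the set silently dropped the duplicate 3
```

Both calls run without complaint, and only the answer is wrong. A checker such as Mypy would reject the second call before the program ever ran, because a set is not a Sequence.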

Closing thoughts

I really enjoyed using Smalltalk.

The “UNIX way of working” has dominated software development to the point where programmers really look down on environments that do not fit into that mold. “If your language is not in a plain text file and does not have command line based tooling, it isn’t a ‘real’ language”. “If I can’t use vim, tmux and ripgrep to hack on this, it must be for children”. On the other hand there are things like spreadsheets and various databases, but their uses are relatively constrained. There are various attempts to have no-code environments, but they all seem to apply to specific domains. This is unfortunate. Instead of expecting every person to install packages and learn arcane syntax we could have a general purpose computing environment built around a simple but powerful idea. Spreadsheets are not plain text files and Jupyter notebooks are not editable in Vim, but they unlock a crazy amount of power without requiring users to understand a terminal or use npm. Smalltalk can help people have a dialogue with computers, without throwing them into the tower of Babel that is industrial software development.

There is real value to having this kind of approachable programming environment accessible to people. In particular, I feel like the data science/analyst community has coalesced around Python, but Python has a lot of incidental complexity. Smalltalk’s simplicity combined with its visualization-driven principles would be really valuable there. However, the Smalltalk community is too small. Even if they can focus on great documentation, evangelization and large amounts of polish, it is still an uphill battle against the large Python ecosystem, particularly without a large corporate sponsor.

If we cannot have a world with Smalltalk, at a minimum we should focus on getting something like its environment into other languages. Jupyter notebooks and JavaScript debuggers are nothing compared to the Smalltalk IDE. Rust’s error messages are excellent, but they are messages, not “here let me help you fix this right away”. I for one will definitely reach for Smalltalk more, once I can find answers to some of the visualization questions.

Addendum: Thoughts on Advent of Code

Advent of Code is really fun. I spend too much of my professional life wading through large codebases where changing things is difficult and the system interacts with so many things that development can feel like a chore. AoC was a chance to just write simple, functional code and “throw it away” as soon as I got the answer. Most of the problems are variations on “standard” computer-science puzzles (like the Game of Life) and others are interesting as a reduced version of real problems (e.g. Jurassic Jigsaw). None of them are very mathematical (except the Chinese Remainder Theorem one, which I did not solve by coding) so they don’t require spending time learning about an entire domain. In general, I’m not competitive or obsessive about these kinds of challenges, so if anything required spending too much time researching or coding, I would just skip it. It was about receiving short dopamine hits, and not trying to get on the leaderboard. If you are already an experienced programmer, I highly recommend using AoC to learn a new language or paradigm, as the problems don’t need sophisticated libraries (the only interaction with the outside world is reading the input file, which one could also put in a string literal). You can focus on just learning the basic idioms and data structures of the language.

I have not read the original paper yet. ↩︎
Smalltalk has a wide variety of implementations at this point, in various stages of maintenance. ↩︎
Interestingly, not only has Smalltalk influenced language semantics, it has also influenced implementations. Smalltalk and Self drove significant innovations in just-in-time compilation, which trickled to Java (via HotSpot) and JavaScript (via V8 in Google Chrome). These are very likely the most widespread implementations of any language in history. ↩︎
With caveats. I still haven’t solved Day 17 (tedious), Day 19 part 2 (lost w/o non-greedy regular expressions), Day 20 part 2 (tedious), Day 24 (didn’t get to it) and Day 25 part 2 (need to solve the others to get this). In addition, I solved Day 13 part 2 using Wolfram Alpha, and Day 21 using z3 in Python. For a couple of stars, I did look at other people’s solutions before solving it in Smalltalk. I also sometimes ran people’s Python solutions to help surface bugs in my Smalltalk solutions. ↩︎
In reality, classes are also objects in Smalltalk, so this method is not “special” at all. ↩︎
I’m aware that LISP has something like this as well, but that is hardly traditional. ↩︎
Just a clever use of defining @ as a method on numbers, which takes another number and returns a point. So 1@1 is the point (1, 1), but it is just the @ method called on 1 with the argument 1, which then returns a Point object. ↩︎
Update on 2021-02-16: Thanks to Todd on the Pharo Discord for pointing out that pre-allocating the Dictionary and reducing the number of accesses helps a little. Here is the updated code. ↩︎