For the past week, I’ve been working on a lambda calculus interpreter (GitHub) to try to better understand the concepts in lambda calculus. It’s been a great experience, especially since I’ve been coding it in Python. Normally, if I were writing the interpreter in a language like SML or Haskell, I would rely heavily on pattern-matching.
Using pattern-matching is a pretty natural fit for creating an interpreter, given the operations involved in lexing, parsing, and reduction/evaluation: specifically, handling each block of input differently depending on the token or keyword/structure encountered. An example of this idea implemented in a Haskell lexer is below:
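Roughly, the idea looks like this (a simplified sketch rather than the exact listing from the repo; chunkWord just collects characters up to the next delimiter):

```haskell
lexer :: String -> [String]
lexer []         = []
lexer ('(' : cs) = "(" : lexer cs
lexer (')' : cs) = ")" : lexer cs
lexer ('.' : cs) = "." : lexer cs
lexer (' ' : cs) = " " : lexer cs
lexer cs         = word : lexer rest          -- a keyword or variable
  where (word, rest) = chunkWord cs

-- Collect a keyword or variable up to the next delimiter.
chunkWord :: String -> (String, String)
chunkWord = span (`notElem` "(). ")

-- e.g. lexer "(\\x. x x)" == ["(","\\x","."," ","x"," ","x",")"]
```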
The lexer takes a string and iterates through it character by character. Each character encountered has a specified action (even if the actions are simple in this case). A '(' character produces the action of consing the string "(" onto the result of running lexer on the rest of the string. An analogous action happens for ')', '.', and ' '. Anything else (which then has to be a keyword or variable in lambda calculus) gets chunked together as a string and consed onto the lexed rest of the string (sans the chunked characters). It’s not bad, though I imagine I’d probably be able to write this in an even more concise way if I knew Haskell better.
If I were to translate this code literally to Python, I would enter elif hell, since Python has neither pattern matching nor even case expressions. Although beauty is in the eye of the beholder, a massive block of elifs is not what I consider beautiful code. For such a small block of code it isn’t too bad, but even so, it looks verbose in Python.
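A literal translation would look roughly like this (again a sketch, using a first_letter index to mark where the current keyword or variable starts):

```python
def lexer(s):
    tokens = []
    first_letter = 0  # start index of the word currently being read
    for i, char in enumerate(s):
        if char == '(':
            if first_letter < i:
                tokens.append(s[first_letter:i])
            tokens.append('(')
            first_letter = i + 1
        elif char == ')':
            if first_letter < i:
                tokens.append(s[first_letter:i])
            tokens.append(')')
            first_letter = i + 1
        elif char == '.':
            if first_letter < i:
                tokens.append(s[first_letter:i])
            tokens.append('.')
            first_letter = i + 1
        elif char == ' ':
            if first_letter < i:
                tokens.append(s[first_letter:i])
            tokens.append(' ')
            first_letter = i + 1
    if first_letter < len(s):
        tokens.append(s[first_letter:])
    return tokens
```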
Again, it’s not that bad, given how short the set of cases is here (I’ll have to actually find a longer interpreter to give a properly horrifying example). In fact, taking into account the chunkWord gymnastics I had to do in Haskell to satisfy the type checker (though that might just be due to my ignorance of Haskell), it might even look nicer to some (… aside from the first_letter gymnastics required here because you can’t easily skip ahead characters in a Python for loop… a topic for another post). However, even here the specific case pattern looks annoyingly verbose and repetitive, at least to my eyes. Each elif takes up two lines at minimum, and they start to blur together when there are many, many elif statements. Alternatively, the code starts turning into spaghetti if you opt for using a dictionary to handle each case, with the instructions for each case separated from the conditional statements themselves, especially if you have to define the operations as functions separate from the dictionary (something like the sketch below).
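For illustration, that dictionary-dispatch shape might look something like this (a hypothetical sketch, not code from my interpreter):

```python
# Each delimiter's action lives in its own function, far away from the
# loop that decides when to call it.
def emit_open(tokens):
    tokens.append('(')

def emit_close(tokens):
    tokens.append(')')

def emit_dot(tokens):
    tokens.append('.')

def emit_space(tokens):
    tokens.append(' ')

ACTIONS = {'(': emit_open, ')': emit_close, '.': emit_dot, ' ': emit_space}

def lexer(s):
    tokens = []
    first_letter = 0  # start index of the word currently being read
    for i, char in enumerate(s):
        if char in ACTIONS:
            if first_letter < i:
                tokens.append(s[first_letter:i])
            ACTIONS[char](tokens)
            first_letter = i + 1
    if first_letter < len(s):
        tokens.append(s[first_letter:])
    return tokens
```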
However, this type of ugliness does come with an advantage when trying to learn a concept. The Haskell version actually has a lot of code repetition, which is made tolerable and even pleasant through pattern-matching. Being forced to think more cleverly about how to express one’s ideas in a succinct way to counter that ugliness can give one deeper insight into what’s actually going on. For example, the code below more explicitly acknowledges the code repetition and handles the for-loop skipping-ahead challenge in a cleaner way (to my eyes), by accumulating characters when the current character is not one of the special delimiters and clearing the accumulator when it reaches one of those delimiters.
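Roughly (another sketch, with the delimiters pulled out into a DELIMITERS set):

```python
DELIMITERS = {'(', ')', '.', ' '}

def lexer(s):
    tokens = []
    word = ''                  # accumulates the current keyword or variable
    for char in s:
        if char in DELIMITERS:
            if word:           # flush the accumulated word, if any
                tokens.append(word)
                word = ''
            tokens.append(char)
        else:
            word += char
    if word:                   # flush a trailing word
        tokens.append(word)
    return tokens
```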
Even though this can probably also be improved upon, being forced to think harder and attack the problem with multiple paradigms was helpful. The fact that Python is not great at handling multiple cases in a neat, clean way was actually beneficial for my understanding in this case. Given this experience, I think it might be helpful in the future to try to solve hard programming problems in languages that really don’t have good support for paradigms that these problems are most naturally expressed in.