Embedding Higher-order Stack Languages in Racket

It is well-known that stack-based languages can be embedded into functional languages. This has even been done in the typed setting by Chris Okasaki in his article Techniques for Embedding Postfix Languages in Haskell. Though Okasaki’s paper does not mention row-polymorphism, his typed embedding is essentially a way to sneak row-polymorphism into Haskell using nested tuples. We even get a taste of how existing type systems can handle higher-order functions over a stack in Sami Hangaslammi’s article Concatenative, Row-polymorphic Programming in Haskell.

One of the key insights at the beginning of Okasaki’s paper is that the functions used to model stack operations end up being partial continuations. Since I was tasked with trying to embed some basic higher-order stack combinators in Racket, one of my first thoughts was ‘How else might stack languages be embedded into a functional language?’ Okasaki’s embedding has the advantage of clarity and typability, but I’ll present another embedding that is made possible by Racket’s multiple return values feature.

The multiple-values feature of Racket is only briefly discussed in the documentation, but it should be clear enough to get a good intuition of how it can be used. We will also need Racket’s support for variadic functions. The end result will be an encoding of a subset of Brent Kerby’s calculus from the Theory of Concatenative Combinators, extended with some basic integer arithmetic. I’ve also built another version which isolates the semantics in one place, but the resulting embedded language is not as pretty to read.

To make things easy, we first encode the simple arithmetic language. We’ll only handle addition and subtraction, but adding other operations follows a similar template. We don’t want to make a separate combinator for each distinct constant, so we’ll make a generic push function as well.

(define (push c f)
  (lambda xs (call-with-values (lambda () (apply values c xs)) f)))
(define (add f)
  (lambda (a b . xs)
    (call-with-values (lambda () (apply values (+ a b) xs)) f)))
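
Subtraction follows the same template. Here’s a sketch, assuming the usual Forth-style operand order where 5 3 sub leaves 2 (second-from-top minus top):

(define (sub f)
  (lambda (a b . xs)
    ;; a is the top of the stack, b sits just beneath it,
    ;; so "5 3 sub" computes 5 - 3 = 2
    (call-with-values (lambda () (apply values (- b a) xs)) f)))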

We can’t test it just yet, because each of these combinators takes an argument which represents the rest of the computation. So we need a special word to mark the end of an expression, which just returns the values it receives as its input:

(define end (lambda xs (apply values xs)))

With this we can program in a very basic RPN-like notation that just has a few more parentheses:

(require rackunit)
;; equivalent to
;; 4 1 2 3 add add == 4 6
(check-equal?
 (call-with-values (push 4 (push 1 (push 2 (push 3 (add (add end)))))) list)
 (list 6 4))

Note that we must specify the result stack in reverse order, since Racket lists are generated from left to right, and the left-most element of a list corresponds to our ‘top of the stack’. This is all well and good, but it’s very simplistic, and using continuations for this purpose is complete overkill. Let’s add a few generic stack manipulation combinators:

(define (dup f)
  (lambda (a . xs)
    (call-with-values (lambda () (apply values a a xs)) f)))
(define (zap f)
  (lambda (a . xs)
    (call-with-values (lambda () (apply values xs)) f)))
(define (swap f)
  (lambda (a b . xs)
    (call-with-values (lambda () (apply values b a xs)) f)))

The names for each of these are fairly intuitive: dup duplicates the element on top of the stack, zap is a synonym for ‘pop’ which deletes the top element of the stack, and swap exchanges the top two elements of the stack.
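
A quick check in the same style as our first test; in postfix notation, 1 2 swap zap dup should leave 2 2:

;; equivalent to
;; 1 2 swap zap dup == 2 2
(check-equal?
 (call-with-values (push 1 (push 2 (swap (zap (dup end))))) list)
 (list 2 2))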

By themselves, these combinators don’t really add much power to the language. We could probably move numbers and math operators around to the point where we could transform any expression containing the above three into an equivalent expression without them. They will be necessary for a complete system by the time we are done, however. Next up, let’s increase the power of our language by adding anonymous functions:

(define (quot f n)
  (lambda xs
    (call-with-values (lambda ()
      (apply values
             (lambda ys (apply f ys))
             xs)) n)))

The quot function takes two parameters: the first is the expression that represents the body of the function, and the second is the expression representing the rest of the program. The name quot is short for quotation, which is what anonymous functions are often called in the concatenative language community. A quotation wraps a list of combinators up in a box and puts that box on the stack, to be duplicated, zapped, or executed at a later time. But this means that we have a function on top of the stack with no way to call it! In applicative languages, a function knows that it is supposed to execute based on its position in the expression. In the lambda calculus and functional languages based on it, this is the left-most part of the expression. In concatenative systems, we need to be more explicit, so we include a way to call a function that is on top of the stack:

(define (call n)
  (lambda (f . xs)
    (call-with-values (lambda () (apply f xs)) n)))
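
Before moving on, here’s a quick sanity check that quot and call compose the way we’d expect, in the same rackunit style as before:

;; equivalent to
;; 5 [dup add] call == 10
(check-equal?
 (call-with-values (push 5 (quot (dup (add end)) (call end))) list)
 (list 10))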

The call combinator is just a slightly clearer name for Kerby’s i combinator. It unwraps the quotation currently on top of the stack and executes the expression inside it. We can also add another combinator, dip, which calls the quotation on top of the stack while ignoring the stack item just below it. It may not seem like a useful idea at first glance, but dip, or a way to construct it, is actually a key part of what separates a complete concatenative system from an applicative one.

(define (dip n)
  (lambda (f x . xs)
    (call-with-values
     (lambda ()
       ;; run the quotation f on the rest of the stack, gather the
       ;; values it returns into a list, then push x back on top
       (apply values x (call-with-values (lambda () (apply f xs)) list)))
     n)))
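
And a small test of dip; in postfix notation, 1 2 [3 add] dip should add 3 to the 1 sitting underneath the 2:

;; equivalent to
;; 1 2 [3 add] dip == 4 2
(check-equal?
 (call-with-values
  (push 1 (push 2 (quot (push 3 (add end)) (dip end))))
  list)
 (list 2 4))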

We can make, store, and call functions now, but we’re not done quite yet. Two more combinators will give us enough of a basic system to construct all other possible combinators. The first, unit, wraps the item on top of the stack in a quotation (it takes the top element and creates a constant function that pushes that element onto the stack when called). The second, cons, can be thought of as a partial application combinator, sort of like currying.

(define (unit n)
  (lambda (f . xs)
    (call-with-values
     (lambda ()
       (apply values (lambda ys (apply values f ys)) xs)) n)))
(define (cons n)
  (lambda (f b . xs)
    (call-with-values
      (lambda ()
        (apply values
          (lambda ys (call-with-values (lambda () (apply values b ys)) f))
          xs))
      n)))
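
A couple of quick checks for the new combinators (note that our cons shadows Racket’s built-in cons within this module, which is fine as long as we don’t need the original):

;; equivalent to
;; 5 unit call == 5
(check-equal?
 (call-with-values (push 5 (unit (call end))) list)
 (list 5))
;; equivalent to
;; 3 4 [add] cons call == 7
(check-equal?
 (call-with-values (push 3 (push 4 (quot (add end) (cons (call end))))) list)
 (list 7))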

Factoid: only four of the combinators discussed, plus quot, are necessary to have a complete system! It’s a fun exercise to try and determine what they are. Hint: try to construct the remaining three from an arbitrary set of four.

And with that, we have a tiny but complete higher-order stack language embedded in Racket! Adding other combinators from Brent Kerby’s page should follow a similar pattern. There are plenty of places to go from here, but whether it is useful or efficient is perhaps the more pertinent question. Regardless, happy stacking!

Implementing a Concatenative Calculus

A Stack-based Interpreter

I’ve written a small interpreter for the concatenative calculus with a call operator described in one of my previous posts. I wrote it in Haskell, in the monadic style familiar from Martin Grabmüller’s excellent tutorial Monad Transformers Step by Step. I updated it to use the more recent Except monad and tried to simplify a few other things related to the stack-based internal model I chose to work with. All in all it seems like a fun little program and could serve as a good testbed for other concatenative ideas. The main files are Eval and Syntax, the rest are just there to build up the command line interpreter. This should work out of the box with a recent Haskell installation.

I don’t want to clutter my blog with the full Gist so you can find it here.

Concatenative Function Calls, without the Bang

Killing the Call Operator

In my last post, I introduced the idea of a concatenative calculus with variables, similar in spirit to the lambda calculus. One of the key differences between lambda and concatenative varieties was the presence of an explicit `calling` word (!). A couple of days after posting, I realized the call operator was not actually necessary, and was an artifact of my tendency to think of concatenative languages in terms of stack-based languages.

The key insight lies in the fact that the two evaluation rules introduced differ not only in the presence of the call operator, but also in the number of arguments each function expects. In the first evaluation rule (Application) the number of arguments to the function must be zero. In the second rule (Substitution) the number of arguments must be greater than zero (one or more).

Using this distinction, we can eliminate the call operator from the language entirely, and all functions that take zero arguments will reduce automatically. I don’t believe this either increases or decreases the power of the language, though I’m not sure. Rather, function calls in this new language could be regarded as `implicit`, rather than `explicit` as in the system with the call operator.

T :=
1. (Empty) => ε ϵ T
2. (Var)   => If v ϵ V and t ϵ T, then tv ϵ T
3. (Abs)   => If v1,...,vn ϵ V (with n >= 0) and t1, t2 ϵ T, then t1(v1...vn.t2) ϵ T.

This is the new syntax for our language. There are only three construction rules, rather than the four rules of the previous system. The concatenation operation on terms (++) is similarly modified to remove the case that mentions the call operator. Our reduction rules have changed only in the case of application, where the call operator (!) has been removed:

(Application) t1(.t2) ==> t1 ++ t2
(Substitution.Var) t1v(a1...an.t2) ==> t1 ++ (a1...an-1.t2[an/(εv)])
(Substitution.Abs) t1(b1...bn.t2)(a1...an.t3) ==> t1 ++ (a1...an-1.t3[an/(ε(b1...bn.t2))])

This seems better all around! Is it possible that we might ever want a language with an explicit call operator when this simpler formulation exists? My hypothesis is that `pure` languages would not need an explicit call operator due to the lack of side effects. However, a language with side effects should arguably keep the explicit call operator. For example, say there is a function which generates random numbers. This function takes no input, but produces output. In the new system, any mention of this function will be reduced only once. If we pass the function as an argument to another function, it will generate the random number, and then proceed to use only that number in all other calls in that function, which is clearly not what is intended!

Evaluation Order Curiosities

When I discovered the concatenative mailing list, there was a tiny language called `Peg` that had been discussed. Peg was unique because it claimed to be a lazy concatenative language. The developer seems to have changed the name since then, and the new language includes some strange semantics that I confess I don’t fully understand, but the idea of a lazy concatenative language remains tantalizing to me.

In a strict concatenative language, we try to apply an evaluation rule at the left-most possible step each time. How might a lazy concatenative language be different? One possible idea is to apply only substitutions at each left-most possible step until no more substitutions can be done, then apply the right-most thunk which has no arguments. We can see this difference in action in an example. Imagine that the language has been extended with numbers and a division operator:
Left-to-Right:
  2 1 (a.a 0 /) (c.)
  ==> 2 (.1 0 /) (c.)
  ==> error! attempted division by zero
Right-to-Left:
  2 1 (a.a 0 /) (c.)
  ==> 2 (.1 0 /) (c.)
  ==> 2 (.)
  ==> 2

In the first example, we try to apply an evaluation rule at the left-most possible spot at each step. So first we substitute 1 for a, then we try to evaluate (.1 0 /), which means we must evaluate the function body. This leads to a division by zero error.

But in the second example, we first substitute at the left-most possible spot at each step, then do an application at the very end. So 1 gets substituted for a, then (.1 0 /) gets substituted for c, then we try to evaluate (.) which succeeds, leaving only 2.

Again, I haven’t thought too much about these ideas yet, so take everything with a grain of salt, but at the very least this evaluation order seems to point in a promising direction for adding lazy semantics in a concatenative language.

Concatenative with a Side of Variables

Concatenative language theory, or at least what little there is of it, has generally been advanced in the form of various combinator languages, starting with Joy. Most concatenative enthusiasts seem to be happy with this state of affairs, and the goal of many concatenative languages is to enable and encourage a point-free style. So much so that concatenative programming is often thought of as a subset of point-free programming. However, the two are distinct concepts, despite the variable-free code present in so much Factor and Joy.

I will present a small, mostly informal concatenative calculus in the style of the lambda calculus, rather than in the style of combinatory calculi. For a great introduction to concatenative combinatory calculi, see Brent Kerby’s The Theory of Concatenative Combinators. Also note that I came up with the majority of this in a bar and haven’t worked on it much since, so it can probably be improved upon.

Syntax

First we will need to define the syntax of the language, enabling us to tell whether a sentence is a term in the concatenative calculus or not. Assume V is a countably infinite set of variables V = {a,b,c,...}. We can then inductively define the set of all terms in the concatenative calculus to be:

T :=
1. (Empty) => ε ϵ T
2. (Var)   => If v ϵ V and t ϵ T, then tv ϵ T
3. (App)   => If t ϵ T, then t! ϵ T
4. (Abs)   => If v1,...,vn ϵ V (with n >= 0) and t1, t2 ϵ T, then t1(v1...vn.t2) ϵ T.

The intuition behind this syntax is that a term is a list, which is why we need rule 1. The lambda calculus is structured as a binary tree in the application rule. That works for how application is defined there, but it makes specifying the reduction rules for concatenative languages a little too complicated in all the iterations I tried. To support a binary tree, some sort of flattening rule is needed. It is much easier to enforce the flatness of the syntax by modeling an `abstract syntax list` rather than the more usual `abstract syntax tree`, and we regain a simpler set of reduction rules.

There are also two major differences in the App and Abs constructions that may not be immediately clear. Note that the Abs construction allows multiple binding variables (`parameters` or `arguments`). This is actually not strictly necessary, but it matches better with the intuition that concatenative functions act on a set of inputs to generate a set of outputs. This is a crucial difference from the lambda calculus: a function in the concatenative calculus may return zero, one, or many values. This often results in the idea that concatenative functions take a stack and return a stack, but that is more of an implementation detail than a core concept, and in fact the basic reduction rules for our calculus will make no mention of a stack.

The App construction will become much clearer when the reduction rules are introduced. If it helps for now, think of the `!` as saying, “apply the function that precedes me to the terms that precede it”. That’s a high level view and good enough for hand waving, but as we’ll see, what it really does is a little bit different.

Moving forward, here are some example terms in our new syntax:

Example 1: εa
Example 2: ε(.ε)
Example 3: εa(.εb)a

In the rest of the post I will elide all instances of the empty string (ε). They are there implicitly; they are simply not written, for the sake of brevity. Here are Kerby’s eight basic combinators translated into our concatenative calculus with variables:

1. dup  := (a.aa)!
2. swap := (ba.ab)!
3. zap  := (a.)!
4. unit := (a.(.a))!
5. cons := (ba.(.ba!))!
6. i    := (a.a!)!
7. dip  := (ba.a!b)!
8. cat  := (ba.(.b!a!))!

Readers familiar with Kerby’s combinator paper may notice that the functions almost mirror the `stack effect` rewrite rules of Kerby’s basic combinators.

Reduction

So now, how do we evaluate terms in our new calculus? In the combinatory calculi, you need at least one rewrite rule for each of the combinators. In the lambda calculus, there is the almighty beta reduction. I found it easier in this calculus to split the equivalent of beta reduction into two steps, each of which are independent of the other: application and substitution.

But before that, we will need a way to combine two terms into one term. Any ideas? That’s right, we’ll use concatenation! We’ll define the concatenation operation on terms like so:

(t1 ++ t2) :=
1. If t2 is ε        => t1
2. If t2 is t3v      => (t1 ++ t3)v
3. If t2 is t3!      => (t1 ++ t3)!
4. If t2 is t3(vs.s) => (t1 ++ t3)(vs.s)

This is just regular list concatenation with the arguments flipped, complicated by the fact that we have three different `cons` constructs. Now that we can concatenate two terms, we can define function application:

(Application) t1(.t2)! ==> t1 ++ t2

Well this is a bit unusual! With this application rule, we can only apply functions which take no arguments! This seems highly counterproductive, but remember we haven’t yet seen the substitution rules:

(SubVar) t1v(a1...an.t2) ==> t1 ++ (a1...an-1.t2[an/(εv)])
(SubAbs) t1(b1...bn.t2)(a1...an.t3) ==> t1 ++ (a1...an-1.t3[an/(ε(b1...bn.t2))])

This is a lot to take in, but it’s really very similar to beta reduction, only with some nuances introduced by the fact that we’re working on a list rather than a tree. Both rules replace all occurrences in the function body of the last variable in the list of arguments with either the function or variable that immediately precedes the function. We splice that argument out of the list and concatenate the new function, which now has one less argument, with all the previous terms. When a function no longer has arguments, the substitution rules no longer apply, and only the application rule can be used to yield further reductions.

There is another subtlety that should be addressed: the substitution method doesn’t merely replace the variable with the substituted term, because the way we defined our terms doesn’t allow that to happen. Rather, it concatenates the substituted term with everything that preceded that occurrence of the variable in the function body.
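
To make the rules concrete, here is a minimal sketch of one-step reduction in Racket. The term representation and all of the names (subst, argument?, step) are just illustrative choices of mine, not part of the calculus itself; this assumes #lang racket, so racket/match is available:

;; Terms are lists read left to right. An item is a symbol (a variable),
;; the symbol !, or an abstraction (abs (v1 ... vn) body).

;; Splice the term repl (a list of items) in place of every occurrence of
;; the variable v in body, recursing into nested abstractions unless they
;; rebind v.
(define (subst body v repl)
  (append*
   (for/list ([item body])
     (match item
       [(== v) repl]
       [(list 'abs vars inner)
        (if (memq v vars)
            (list item)
            (list (list 'abs vars (subst inner v repl))))]
       [_ (list item)]))))

;; An item that may be consumed as an argument: a variable or an
;; abstraction, but not the call operator.
(define (argument? x)
  (or (and (symbol? x) (not (eq? x '!)))
      (and (pair? x) (eq? (car x) 'abs))))

;; Perform one left-most reduction step; returns #f if the term is in
;; normal form.
(define (step term)
  (let loop ([before '()] [rest term])
    (match rest
      ;; (Application) t1(.t2)! ==> t1 ++ t2
      [(list-rest (list 'abs '() body) '! after)
       (append (reverse before) body after)]
      ;; (SubVar)/(SubAbs): the last parameter is bound to the argument
      ;; immediately preceding the abstraction
      [(list-rest (? argument? arg) (list 'abs (list vs ... v) body) after)
       (append (reverse before)
               (list (list 'abs vs (subst body v (list arg))))
               after)]
      [(cons x more) (loop (cons x before) more)]
      ['() #f])))

;; Example 1 below, a(b.bc)! ==> (.ac)! ==> ac, step by step:
(step '(a (abs (b) (b c)) !))        ; '((abs () (a c)) !)
(step (step '(a (abs (b) (b c)) !))) ; '(a c)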

Some basic examples to (hopefully) make things a little clearer:

Example 1: a(b.bc)!
  ==> (.ac)!
  ==> ac
Example 2: a(bc.b)!
  ==> (b.b)!
Example 3: a(b.(.b))!
  ==> (.(.a))!
  ==> (.a)
Example 4: (.(a.aa)!(a.a!)!)(a.aa)!(a.a!)!
  ==> (.(.(a.aa)!(a.a!)!)(.(a.aa)!(a.a!)!))!(a.a!)!
  ==> (.(a.aa)!(a.a!)!)(.(a.aa)!(a.a!)!)(a.a!)!
  ==> (.(a.aa)!(a.a!)!)(.(.(a.aa)!(a.a!)!)!)!
  ==> (.(a.aa)!(a.a!)!)(.(a.aa)!(a.a!)!)!
  ==> (.(a.aa)!(a.a!)!)(a.aa)!(a.a!)!
  ==> to infinity and beyond...

The last example is the equivalent of the omega combinator in our concatenative calculus, written in the lambda calculus as (\x.xx)(\x.xx). It would be written in Kerby’s system as [dup i] dup i.

That’s all for now on this subject. I haven’t written any proofs about this system, and I don’t really intend to, since I’m not sure how useful it would be compared to how much work would be necessary. It seems pretty clear that this calculus shares many of the properties of the lambda calculus, but, until it’s proven, we can’t be certain. The usefulness of the system itself is debatable, but maybe it could provide a solid foundation for the concatenative languages of the future. In any case, I’m happy with how it turned out, though no doubt it could be improved. Maybe there’s a better solution than encoding terms as lists?

Shipped Up to Boston

PLT Stuff At Last!

Well, I wanted to blog more this summer. I can’t really claim that I was too busy, since I wasn’t. I did make a lot of progress on my Hearthstone card collection though. But I think I’m back now. I’ve read five books in the past three weeks, all to do with programming languages, so I better write while the mood lasts.

Wort

Alas, no progress on Wort. It’s not abandoned, but I’m not even close to done designing the language just yet. Compiling a dynamically typed concatenative language to JavaScript is also quite an efficiency challenge: the resulting code of a naive translation is ugly to read and slow to run. I’m not sure the problem can be solved without applying type inference to most words.

Yeast

I’ve been doing a lot of reading on Forth, however, and on how to efficiently compile it and other related languages. A lot of the literature for the JVM and CIL doesn’t apply as well here, since Forth can have divergent stack effects and its functions very often have multiple return values. Looking into the internals of the Go compiler could likely yield some valuable insight into efficient multiple return values.

But the problem of type inference remains. I think I’ve come up with a type inference algorithm for higher-order stack languages, and I’m going to try it out on a batch-compiled version of Forth. More information on this is forthcoming; I haven’t actually proved anything yet, but I’ve tested it by hand on several edge cases and it seems to be robust. Spoiler alert: the type system doesn’t support Forth’s divergent stack effects, but it is still more general than the system formalized by Christopher Diggins (in that it preserves the associativity of composition while allowing composition of functions like (A ~ B)(C ~ D)). The language will, like Forth, be super low level but capable of higher-level design patterns. Sticking with the ongoing alcohol-related naming theme, I’ve named the language Yeast for now.

Graduate School

Uh, well, for having a lot fewer classes, it’s taking more time than undergrad seemed to. Part of that might be adjusting to living outside of Idaho for the first time in my life. Boston’s a great city, definitely the most fun big city I’ve ever been in. It has a character all its own, independent of its living inhabitants, and the people have so far been great. Driving in the city is absolutely abominable though. I take the T.

I wasn’t a huge fan of the Northeastern campus when I first arrived, but it’s grown on me, and the library is exquisite. Most of the aforementioned books have been read while I lounge in one of the open grassy areas on campus, which are well stocked with benches and Adirondack chairs. It’s great, but not as great as seeing so many other people doing the same thing.

Forthcoming

I’m planning to work on the formalization of concatenative languages on this blog. Like I said, I think I have the type inference algorithm worked out, but I want to get a better foundation in PLT theory before I go too far afield. I’ve looked into the formalization of semantics as well, and it seems that concatenative languages lend themselves very well to term rewriting semantics. Look for posts in both of the above categories soon.

A New Blog

This again?

I managed to crank out a few blog posts on my old WordPress-based blog early last year before it fell into the Abyss of Neglect much like its predecessor before it. Not an entirely unexpected outcome, but a disappointing one nonetheless. In my defense, it was a rough couple semesters, but now my undergraduate career is complete! And I can begin again, this time with far less irritating homework distracting me from more important things, like writing miserable little blog posts.

Graduate school!

Just a few weeks ago I was accepted to Northeastern University’s Computer Science program, so I’ll be shipping up to Boston to get my Master’s come fall. I’m pretty excited about it, firstly because I’ve never lived outside of Idaho for any extended period of time, and secondly because Northeastern has a really fun sounding curriculum with a good emphasis on programming languages. Maybe I will finally get to take a class that at least touches on formal type theory, because I’m sure taking a long time to study it on my own. So yeah, I’m pumped.

A New Wort

As a summer side project, I’m going to pick up my little Wort programming language again. It started out as a homoiconic concatenative language that compiled to JavaScript, but I’ve made some changes to the syntax and abandoned homoiconicity in favor of performance, while keeping the dynamic type system. I like the language better as a result already, and it’s easy enough to write homoiconic cat-langs that I felt Wort had nothing that set it apart. Now that’s not the case. I’ll write some more posts on it coming up.

More than Programming

I’m hoping I can include things besides my obscure programming interests in this blog. I’m just not sure what that’ll be yet. I haven’t made any constructed languages in a while, and I don’t particularly want to focus on that while I’m working on grad school, so don’t expect too much of that. On the other hand I am very much interested in more mathematical things, so I might post some interesting mathy stuff every now and then.

I am also slowly becoming more comfortable with my own views on life and society, so I may occasionally offer such musings at the risk of limiting my readership. I’ll try to keep it as rational and calm as possible during such instances. But no promises.