A Minimal Concatenative Language with Variables

In one of my earlier posts I describe a syntax for concatenative languages with variables. The resulting language suffers from three apparent problems:

  1. The semantic separation between application and substitution is not particularly clear. It seems as though this ought to be the case in concatenative languages.
  2. The language did not look concatenative at all. Parentheses were everywhere! Why write in postfix if you still need a million parentheses?
  3. Variables can only be defined at the head of a function.

I’d like to remedy these three issues, but I’d also like this to be something of an introductory article, so I won’t refer back to my earlier system very much, and will instead start from the beginning.

Why not Stack-based Languages?

A few years ago there was a bit of strife on StackOverflow with regard to what ‘concatenative’ meant when applied to programming languages. This is pretty understandable, given the state of investigation at the time, and Norman Ramsey’s claims ring mostly true even now. But I do prefer the term to the colloquial alternative, stack-based languages, because a concatenative language being stack-based is an implementation detail, not necessarily a defining characteristic. It is quite possible to implement a concatenative language using something like graph reduction or term rewriting, and in some cases it may be preferable to do so to get the desired semantics, but that’s a topic for another day.

So what is the defining characteristic of concatenative languages if it is not operations that manipulate a stack? There’s something to be said for John Backus’s emphasis on function-level programming, and indeed many concatenative languages share an emphasis on tacit (or point-free) programming with Backus’s FP. But I think overemphasizing the combinatorial nature of concatenative languages is harmful in that without strict discipline it often leads to ugly, inscrutable code. Too often the ugliness of the point-free style seems plain unavoidable. Naming things is very useful, and substitution is arguably one of the most powerful operations ever invented.

So I tend to favor Jon Purdy’s definition. A concatenative language is a functional language in which the primary operation upon which programs are built is function composition, rather than function application as in the lambda calculus. Every expression denotes a function, even value names like numbers and strings. This definition almost gives us Backus’s emphasis on function-level programming (since everything is now a function) without restricting ourselves to point-free style.

A Minimal Concatenative Calculus

I will now present a minimal untyped concatenative calculus, similar to the untyped lambda calculus in that the only ‘values’ are function literals and the only operations are ‘calling’ a function and performing substitution of variables.

Let’s start with the syntax:

e ::= ϵ
    | e [e]
    | e call
    | e let x { e }
    | e x

There are no numbers, no strings, no booleans. Just function literals (represented by an expression enclosed in square brackets), a word for explicitly applying a function, and variable introduction and reference. The \epsilon production is there only to be a basis case for the syntax; it plays no role in reduction and can thus be elided when writing out expressions. Note also that this syntax already solves the last two of the problems I introduced above! Let’s finalize the system by solving the first problem.

The lambda calculus has one reduction rule: \beta-reduction. To get a system of equivalent power, we need two reduction rules:

  1. e_1\ [e_2]\ \text{call} \Rightarrow e_1 \cdot e_2
  2. e_1\ [e_2]\ \text{let}\ x\ \{ e_3 \} \Rightarrow e_1 \cdot ([e_2]/x)e_3

These rules rely on two metafunctions: substitution, which is defined much like substitution in the lambda calculus, and the \cdot operator, which works similar to a list append function in that it replaces the \epsilon in e_2 with e_1, thus conjoining the expressions so that e_1 ‘appears before’ e_2 if we write it from the most nested expression outward.

With these two reduction rules and metafunctions, we have a simple system of awesome power. But note that the way we’ve specified these rules neglected any notion of evaluation strategy: we can apply these rules at will anywhere inside the term. To get a stack-based evaluation strategy, we would apply these rules to the most deeply nested expression that is not inside square brackets at every step, always matching e_1 with \epsilon, and stop once we no longer find a match to one of the rules. This is not the only possible evaluation strategy, but it is probably the most familiar (and maybe simplest) to implement.


To show off that the tiny new language is still useful, we give some examples of how to do certain things. We start by defining Kerby’s eight basic combinators.

swap == let x { let y { x y } }
dup  == let x { x x }
zap  == let x { }

compose  == let f { let g { [g call f call] } }
partial  == let f { let g { [g f call] } }
constant == let f { [f] }

apply == call
dip   == let f { let x { f call x } }

I’ve renamed some of Kerby’s combinators to make it more clear just what they do. We can show an expression that never finishes reducing, proving that the system has the nonterminating property:

omega == [let x { x x } call] let x { x x } call

And one could imagine some of the functions above being written more succinctly if we added some syntactic sugar. Let’s imagine the language has ML/Haskell style lists and pattern matching (quite a leap, so familiary with Haskell or ML would be useful). We can then write higher-order functions over lists.

map []     f == []
map (x:xs) f == x f call [xs f map] dip Cons

fold []     i f == i
fold (x:xs) i f == x [xs i f fold] dip f call

Both of these functions assume that the function supplied to both map and fold consumes exactly the number of arguments it produces, but it need not be used that way since the language is untyped (so we could use map as a stack accumulator, for instance). This is really only recommended for motivated experts with a niche use-case, as the stack manipulation involved can quickly get hairy.


We’ve now defined a fairly intuitive, mostly readable, and noticeably concatenative minimal calculus. I find this core language to be immensely useful right now, and will likely use extended variants of it in my future concatenative endeavors.

As always, happy stacking!


Three Ways to Specify Your Concatenative Languages

This post is existential proof that even PL theory blogs can use those annoying listicle titles.

I’ve been thinking about how best to specify the syntax for concatenative languages. It would be nice if we could have a grammar as straightforward as the lambda calculus, but I don’t think we have that luxury. This is one of those places where dealing with concatenative languages is more annoying than dealing with standard applicative languages. I’ve come up with three solutions so far, each with its own set of good and bad properties.

Naive Style

We start with what I consider to be the ‘naive’ definition.

e ::= ϵ
    | [e]
    | call
    | let x in { e }
    | x
    | e e

This is the definition style we see in Christopher Diggins’s paper. What it has going for it is that it kinda looks like an extended lambda calculus (see that e\ e in there?). The first problem is that this is misleading, because concatenative languages do not behave quite like applicative languages: e\ e is not the syntax for function application, but instead for function composition. The second reason is because this definition complicates the reduction semantics because of the tree-like nature of the e\ e production. First, we have to add a bunch of noise rules for getting rid of \epsilon leaves that might have snuck in. But it’s more serious than that: suppose we have the following reduction rule for call:

(e_1\ [e_2])\ call \Rightarrow e_1\ e_2

We can reduce the below term just fine:

([\epsilon]\ [\epsilon])\ call \Rightarrow [\epsilon]\ \epsilon

But then imagine we are presented with this term:

[\epsilon]\ ([\epsilon]\ call)

Apart from the parentheses, this term looks very similar to the first. Given that postfix languages like ours don’t need parentheses, we’d like it to reduce to the same result as the first term. But the pattern of our reduction rule doesn’t match the structure of the term, so we can’t reduce it. We’re stuck! This problem can’t be solved simply by adding a rule that fits this structure, because the e\ e construct allows us to up aribtrary trees that, when flattened and stripped of any \epsilon, look like they have the pattern of our reduction rule. Because we want concatenation to be associative, we know we can flatten the tree into a list via a preorder traversal, but requiring a tree-flatten step before any reduction can take place means we would have to carefully design our reduction rules to maintain that invariant, or include some tree-flattening metafunction as part of the semantics. I don’t even want to bother, because we can actually enforce the flatness invariant within our grammar definition using the next style.

Nested Style

What we want is for trees to be ‘flat’ by definition, so that we don’t have to flatten an arbitrary binary tree before or during reduction. We structure the syntax in such a way that the resulting tree will be be guaranteed to only have one child at each step.

e ::= ϵ
    | e [e]
    | e call
    | e let x { e }
    | e x

Now we are guaranteed by the grammar to have flat expressions. The first benefit that we gain is that we no longer need to have a bunch of uninteresting reduction rules for getting rid of \epsilon. In this syntax, \epsilon can only appear at the ‘bottom’ of a tree, where it can be safely ignored by the reduction rules. The second benefit is that we no longer need to use parentheses when writing out our expressions. And the third, most important benefit is that we only have to write one reduction rule for call! Here it is:

e_1\ [e_2]\ call \Rightarrow e_1 \cdot e_2

Uh oh! What’s that e_1 \cdot e_2 syntax? This is the downside. We can’t write e_1\ e_2 since it’s not a part of our syntax, so to concatenate two programs we have to define a rather annoying \cdot binary operator, which is essentially a more verbose list append, replacing the bottom-most \epsilon in e_2 with e_1. This is still a more acceptable metafunction that tree-flattening, since it doesn’t really require us to be too careful.

List Style

Of course, using that type of representation in an actual implementation would be pretty cumbersome. We can take advantage of the predefined list functions available in many language standard libraries by splitting the syntax into two pieces:

e ::= ϵ
    | e t

t ::= [e]
    | call
    | let x { e }
    | x

Now the list-like nature of concatenative languages is apparent even from the grammar!

The major downside here is the mutual recursion, which can complicate proofs about properties of the reduction semantics (it certainly did for me in Coq, at least). But it makes sense as the abstract syntax in an implementation, since we don’t have to write our own append function and explicit recursion can often become higher-order functions on lists.

Of the three styles, I’d say the only the naive style is not suitable for use beyond defining the basic syntax of a language. The nested style and the list style both find applications in different settings.