Introduction to the λ-calculus

Meet your new favourite formal system

Tuesday, 7 April 2015

Most people think of Newton and Leibniz’s infinitesimal calculus when they hear the word calculus, but the term is actually more general than that. A calculus is a formal system of calculation or reasoning, usually involving the symbolic manipulation of expressions. The λ-calculus (lambda calculus) is one such system, and it is very important in computer science.

Expressions

Everything in the λ-calculus is an expression. There are three types of expressions: variables, abstractions, and applications:

A variable is represented by a letter such as $x .$ The letter has no purpose other than to distinguish one variable from another.
An abstraction has the form $λ x . e$ where $x$ must be a variable and $e$ can be any expression. You can think of it as an anonymous function of one parameter $x$ with a function body $e .$
An application has the form $h x$ where $h$ and $x$ are expressions. If $h$ is an abstraction, then this is like applying a function $h$ to an argument $x .$

Using this recursive definition, we can tell whether any string of characters is a valid λ-expression. For example, $λ x . λ$ is invalid because the part after the abstraction’s period should be an expression, but λ by itself is not an expression. On the other hand, $λ a . λ b . λ c . c$ is a well-formed expression.

To interpret λ-expressions unambiguously, we need a few conventions:

Outer parentheses are unnecessary: $(((x)))$ is the same as $x .$
Applications are left-associative: $((a b) c)$ is the same as $a b c .$ On the other hand, the parentheses in $a (b c)$ are necessary to preserve meaning.
Abstraction bodies extend as far to the right as possible: $λ x . (x x)$ is the same as $λ x . x x,$ which is different from the application $(λ x . x) x .$

Free variables

When we look at all the variables in a given λ-expression, we classify some as free variables. A variable by itself, such as $x,$ is always a free variable. However, the $x$ in $λ x . x$ is not free because the abstraction binds it; we call these variables bound variables. All variables are free until they are bound in an enclosing abstraction. Let’s look at some examples:

Expression	Free variables
$x$	$x$
$λ x . x x$
$λ x . y$	$y$
$λ y . λ x . y$
$a (b c) d$	$a, b, c, d$
$(λ a . a b) a$	$a, b$

Be careful with that last example. Clearly $b$ is a free variable, but what about $a ?$ It looks like $a$ occurs three times, but there are actually two distinct variables here! The $a$ that occurs in the abstraction is a bound variable, while the rightmost $a$ is a free variable. When a variable is bound, everything inside the abstraction refers to this new bound variable, regardless of what the variable means outside. In these cases, we say that the new bound variable shadows the free variable. To drive the point home, consider the expression $a λ a . (a λ a . (a λ a . a)) .$ Here we have four distinct variables – one free and three bound – and they all reuse the letter $a .$

One final thing to note is that a variable’s freedom depends on the scope being considered. If we consider $λ x . b x$ as a whole, then $x$ is a bound variable. Even so, it is equally correct to say that $x$ is a free variable in $b x$ of that expression. For this reason, we should always be clear if we are talking about the whole expression or just a part of it.

Reduction

The heart of the lambda calculus lies in the reduction of expressions. Reduction is a process guided by a set of simple yet powerful rules, and it is the essential component that allows us to call the λ-calculus a calculus. There are three methods of reducing expressions, named using the Greek letters alpha, beta, and eta.

Alpha-reduction allows us to rename parameters in abstractions. We do this by changing the parameter, including and all its occurrences in the body, to a new letter. For example, $λ x . x$ is alpha-equivalent to $λ n . n .$ The parameter is just a placeholder – it’s name doesn’t really matter. However, there are two restrictions on alpha-reduction. First, the new variable must not occur as a free variable in the body. Consider $λ a . a b,$ an abstraction that applies its argument to $b .$ If we rename $a$ to $b,$ we get $λ b . b b,$ an abstraction that applies its argument to itself – something went wrong here! This new expression has a different meaning because we inadvertently captured the free variable $b,$ making it a bound variable. Second, the old variable must not occur in an abstraction where the new variable is already bound. Consider $λ a . λ b . a;$ if we rename $a$ to $b,$ we get $λ b . λ b . b,$ which is different because $b$ now refers to the inner bound variable rather than the outer one. As long as we avoid these two cases, alpha-reduction always results in expressions that intuitively have the same meaning.

Beta-reduction is what really makes things happen. It only works on applications of abstractions, and you can think of it as function application. An expression of the form $(λ x . e) a$ beta-reduces to $e [a / x],$ which denotes $e$ with $a$ substituted for all occurrences of $x$ in a special way called capture-avoiding substitution. For example, $(λ x . x x) u$ is beta-equivalent to $u u .$ However, it is not so straightforward to reduce $(λ x . λ y . x) y$ because it should be an abstraction that always returns the free variable $y,$ but simple substitution yields $λ y . y,$ which returns the bound variable $y$ instead! We avoid this problem using capture-avoiding substitution:

$x [a / x] = a,$ just like simple substitution.
$y [a / x] = y$ if $x \neq y,$ also like simple substitution.
$(h n) [a / x] = (h [a / x] n [a / x]) :$ we recursively perform capture-avoiding substitution on the two expressions in the application.
$(λ x . b) [a / x] = λ x . b :$ the variable $x$ is already bound by the abstraction, so there are no occurrences of the free variable $x$ that can be substituted.
$(λ y . b) [a / x] = λ y . (b [a / x])$ if $x \neq y$ and $y$ is not a free variable in $a .$

That last rule subtly avoids the problem we observed earlier. It prevents us from substituting an expression containing free variables that would be unintentionally captured. If $y$ does occur as a free variable in $a,$ then we must alpha-reduce the abstraction so that its parameter does not occur as a free variable in $a$ before performing beta-reduction.

Eta-reduction converts $λ x . g x$ to $g$ where $g$ is any expression. This makes sense as there is no real difference between an abstraction that applies $g$ to an argument, and $g$ itself. However, we must ensure that $x$ does not occur as a free variable in $g .$ Consider $λ x . x x;$ we cannot eta-reduce this because it is clearly different from $x .$

Boolean algebra

The amazing thing about this symbol-shunting calculus is that it is incredibly powerful! It’s hard to believe at first, but the λ-calculus can compute anything that is computable in theory. This includes everything your computer can do, and much more. We’ll start with Boolean algebra.

Their are two values in Boolean algebra: True and False. We’ll represent True with $T ≔ λ x . λ y . x$ and False with $F ≔ λ x . λ y . y .$ For clarity, we’ll use the letters $T$ and $F$ in expressions, although to make a correct λ-expression you need to replace them with their values. Note that, if we perform the application $b x y$ where $b$ is a Boolean, then the result will be $x$ if $b$ is True, and $y$ if $b$ is False. A secondary interpretation of $T$ and $F,$ then, is that they select the first or second of two items.

There are three principle Boolean operations: conjunction (And), disjunction (Or), and negation (Not). Here are their implementations:

Operation	Expression
And	$λ a . λ b . a b F$
Or	$λ a . λ b . a T b$
Not	$λ a . a F T$

The operation And takes two Boolean parameters. It then applies $a$ to two arguments, which selects one based on the value of $a .$ If $a$ is True, it returns $b,$ and if $a$ is False, it returns False. Try doing the beta-reduction on paper for each of the four input combinations – you’ll see that it gives the correct answer every time! If you understand how And works, it shouldn’t be too hard to figure out Or and Not.

Church numerals

The Booleans we implemented are actually called Church Booleans because they use Church encoding, named after Alonzo Church. We can also encode the natural numbers in this way, producing Church numerals.

We represent zero with $Z ≔ λ h . λ x . x .$ One is $λ h . λ x . h x,$ two is $λ h . λ x . h (h x),$ five is $λ h . λ x . h (h (h (h (h x)))),$ and so on. To represent some number, we simply apply $h$ that many times. With Church numerals, we can recover all of arithmetic! I’ll implement the successor operator (adds one) as well as addition, multiplication, and exponentiation:

Operation	Expression
Succ	$λ n . λ h . λ x . h (n h x)$
Add	$λ n . λ m . λ h . λ x . n f (m f x)$
Mult	$λ n . λ m . λ h . n (m h)$
Pow	$λ n . λ m . m n$

If you taught someone the rules of the λ-calculus, you could have them compute sums, products, and powers, and they would have no idea they were doing it! It looks like mindless symbol-manipulation, but it corresponds directly to our usual arithmetic.

Conclusion

I’ve only scratched the surface of the λ-calculus in the article. There are many more data types we could Church-encode, including linked lists. It’s also possible to define higher order functions like maps and filters – before you know it, you’ll have a full-featured functional programming language. In fact, Lisp was deliberately designed on a λ-calculus core over 50 years ago! I encourage you to check out my project Lam if you get tired of doing the reductions by hand. I’d love to twist your brain into a pretzel by showing how you can make recursive abstractions using the Y combinator, but that will have to wait for its own article.