Saturday, September 12, 2009

Balanced brackets, finite sets and ordinal numbers

import Data.List
import Data.Set as Set
import Data.MultiSet as MultiSet -- not required

Balanced brackets

The Dyck language is the set of correctly parenthesised strings, like "(()(()()))".

data Dyck = Dyck [Dyck]

If you disregard the type constructor, an inhabitant looks like [[], [[], []]]. This type represents rooted trees. A rooted tree is a list of rooted trees. To convert a list of rooted trees to a rooted tree, you add the root. To convert back, you remove it.

Traversing the tree gives us the string:

string (Dyck x) = "(" ++ concatMap string x ++ ")"

To be honest, we're missing strings like ()() and the real Dyck type is [Dyck]. But it isn't important since these types are isomorphic as the picture above shows.

Hereditarily finite sets

If we use sets instead of lists, then we get hereditarily finite sets.

data HFS = HFS (Set HFS)

instance Eq HFS where
(==) (HFS x) (HFS y) = x == y

instance Ord HFS where
compare (HFS x) (HFS y) = compare x y

I was a bit surprised this works - to talk about Set HFS you have to implement comparision for HFS (since sets in Data.Set are binary search trees inside), but to implement comparision for HFS you have to define HFS first. Mutual recursion.

Here are A=\{\emptyset\} and B=\{\emptyset, \{\emptyset\}\}.

hfs = HFS . Set.fromList

setA = hfs [hfs []]
setB = hfs [hfs [], setA]

-- Every natural number can be written in binary:
data Binary = Natural -> Bool
-- We can write the exponents in binary too:
data Binary = Binary -> Bool
-- And, since a function into Bool represents some set,
data Binary = Set Binary

This is the same as the equation for HFS. To convert, recursively change a number to the set of exponents: 15 = {0,1,2,3} = {{},{0},{1},{0,1}} = {{}, {{}}, {{0}}, {{},{0}}} = {{}, {{}}, {{{}}}, {{},{{}}}}. It turns out there's a paper on this subject.

Ordinal numbers

Here's another example. If you don't know ordinal numbers skip this paragraph. I'll talk only about ordinals lesser than \epsilon_{0}. They can be written in the form


where a_i is a nonincreasing sequence of ordinal numbers. These a_i may be written recursively in that form too, and this recursion will always stop at some point. You can put coefficients after each \omega^{a_i} and require that the sequence is decreasing. This looks very similar to previous examples.

data ON = ON (MultiSet ON) -- (small) ordinal

instance Eq ON where
(==) (ON x) (ON y) = x == y

instance Ord ON where
compare (ON x) (ON y) = compare x y

on = ON . MultiSet.fromList

If you don't have multisets installed (cabal install multiset), you may use lists instead and think of them as multisets. Just be careful to always put the items on the list in nonincreasing order.

on [] is the empty sum - 0
on [on []] is \omega^0=1
on [on [on []]] is \omega^1=\omega
on [on [on []],on []] is \omega+1

It is interesting that the ordering given by instance Ord is exactly the standard ordering used in mathematics to compare ordinals. (It doesn't matter if you use multisets or sorted lists.) We got it out of nothing. Some other operations are also nicely implemented. The function \omega^x is simply f x = on [x]. The natural sum is sum of multisets (or merging of sorted lists). Normal addition isn't hard to write too.

Natural numbers

Peano's natural numbers can be written as

data Nat = Succ Nat | Zero


data Nat = Nat (Maybe Nat)

The common pattern

What we have done above was finding a type X such that X=F(X) for some type constructor F (lists, sets, multisets, Maybe). This is the fixed point of F!

data Y f = Y (f (Y f))
a = Y [Y [], Y [Y [], Y []]]

ghci> :t a
a :: Y []
ghci> :k Y
Y :: (* -> *) -> *

It works. However, we can't show it (deriving (Show) won't help).

Problem: Write an instance of Show. My solution is a workaround.


I do hope you tried and convinced yourself that something like

instance (Show (f a), Show a) => Show (Y f) where
show x = "Y " ++ (show x)

or similar won't work. What we want is, in pseudocode:

instance (forall a. Show a => Show (f a)) => Show (Y f) where
show x = "Y " ++ (show x)

In other words we require that f is a type constructor that conveys showability: if x is showable, then f x is showable too. I made a separate type class for such f.

class Show' f where
show' :: (Show a) => f a -> String

instance Show' [] where
show' = show

instance Show' Set where
show' = show

instance Show' Multiset where
show' = show

instance Show' f => Show (Y f) where
show (Y x) = "Y " ++ (show' x)

ghci> a
Y [Y [],Y [Y [],Y []]]

Much better solutions are given in comments.

The same boilerplate can be used for Eq and Ord to get two last examples as Y Set and Y Multiset. The instance of Ord gives you lexicographic order on the words in Dyck language (Y []), ordering of natural numbers written in binary (Y Set), ordering of natural numbers in unary (Y Maybe), and the normal ordering of ordinals (Y Multiset).


data H a = H (Maybe (Integer, a))

What is Y H? Write functions

convert :: that_type -> Y H
convert2 :: Y H -> that_type

If you're stuck, try to construct some value of type Y H.


Our fixed point equation is t = Maybe (Integer, t). I discussed this last time: these are lists. The conversion functions we are looking for are fold and unfold. I'll write another version of foldr I mentioned previously:

foldr2 :: (Maybe (a,b) -> b) -> [a] -> b
foldr2 f = foldr (curry (f . Just)) (f Nothing)

convert :: Y Helper -> [Integer]
convert2 :: [Integer] -> Y Helper

convert = unfoldr (\(Y (H x)) -> x)
convert2 = foldr2 (Y . H)

Many other recursive types like binary trees or infinite streams can be treated similarily.

I ought to add infinite lists are lurking here. Our balanced brackets type has other inhabitants, like [[],[],...] or [[[...]]]. You shouldn't worry about that. First, we don't have to use all values of the type: as long as we're careful, they won't show up unexpectedly and you may regard the type as the type of finite lists. Second, you can enforce that with strictness. Third and most important point, infinite lists appear quite naturally and they do have applications. I'll give an example later.

[Update: expanded on the hereditarily finite sets. Thanks to Ryan Ingram and Edward Kmett for great solutions of Show (Y f).]

[Update 17 Sep: I've found a paper (7 MB) covering very much of what I said. Haven't read it yet, but looks very promising.]

Saturday, August 22, 2009

Some thoughts and exercises on folding

I assume you know what are foldl and foldr and have basic experience with them - you know how to express functions like sum, maximum using these folds. Here are some very nice diagrams showing fold and related functions.

Warmup exercise:

compose :: [a -> a] -> a -> a
compose [f1, f2, ..., fn] = f1 . f2 . ... . fn

Write map and compose using foldr. Write foldr using map and compose.

Solution at the end.

A finite automaton is some device that has finitely many states (finite memory), reads a string, letter by letter, and changes its state according to some fixed transition rules. For example, it can remember if the number of a's was even or odd - it toggles a single bit after each 'a'. Here's an automaton that checks if the whole word is built of 'x' and 'y' only:

aut = foldl (\state a -> state && (a == 'x' || a == 'y')) True

Automata are really doing foldl. In this case we could have written this more easily using all, but in general the behaviour can be quite complex.


I described a deterministic automaton - it always worked in the same way.

deterministic :: (a -> b -> a) -> a -> [b] -> a
deterministic = foldl

Now, we have a nondeterministic one - it can choose from some list of possible states and go on. Return all possible states the automaton could finish with.

nondeterministic :: (a -> b -> [a]) -> a -> [b] -> [a]

Don't worry about removing duplicates.


You can regard a nondeterministic automaton as a deterministic one whose states are lists of states.

Our transition function is a -> b -> [a]. To use foldl, we need [a] -> b -> [a].

convert :: (a -> b -> [a]) -> [a] -> b -> [a]
convert f m a = concatMap (\x -> f x a) m

nondeterministic f s = foldl (convert f) [s]

Does the convert function remind you of something? We can easily change the code to work with Maybe - the automaton will be deterministic, but it will be able to abort the computation anytime. It could also ask the user about the next state using IO. It's the list monad.

convert' :: (Monad m) => (a -> b -> m a) -> m a -> b -> m a
convert' f m a = m >>= (\x -> f x a)

nondeterministic' :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a
nondeterministic' f s = foldl (convert' f) (return s)

This function is called foldM and is available in Control.Monad.

foldM :: (Monad m) => (a -> b -> m a) -> a -> [b] -> m a

Suppose you have types a,b such that b = 1 + a b. Then, by substituting this equality into itself, we get

b = 1 + ab = 1 + a(1 + ab) = 1 + a + aab = 1 + a + aa(1 + ab) = 1 + a + aa + aaa + aaab = ... = 1 + a + aa + aaa + ...

b has changed into 1 + a + a a + a a a + ... This is the same as a list, [a].

Equality of types x=y means that we can pair corresponding x's and y's. But we don't need that. Instead, we just need a map b -> 1 + a b that doesn't have to be an isomorphism. We can change all "equals" signs in above reasoning to ->:

b -> 1 + ab -> 1 + a(1 + ab) -> ... -> 1 + a + aa + ... -> [a]

and it stays OK. So, given a function (b -> 1 + a b), we get b -> [a]. This is known as unfolding. This name is very suggestive - we unfolded b into a list of a's. Since 1 + a is called in Haskell Maybe a, we have a higher order function

unfold :: (b -> Maybe (a, b)) -> (b -> [a])

Exercise: Write it.


unfold f x = case f x of
Nothing -> []
Just (a,x') -> a:(unfold f x')

Data.List calls this function unfoldr.

If you invert this reasoning, given a function (f :: 1 + a b -> b) you get [a] -> b. This is folding. f Nothing is our initial value, and f $ Just (a,b) tells how to combine a and b.

Fold vs unfold
Some simple exercises. Recursion is not allowed directly, use only foldl, foldr and unfold.

1. Write factorial.

2. Write functions

toBinary :: Integer -> [Bool]
fromBinary :: [Bool] -> Integer

that convert between a natural number and its binary representation.
For example, fromBinary [False, True, True] == 6.

3. Now you have binary numbers. Write exponentiation by squaring.

power :: (Monoid a) => a -> Integer -> a

Exponent will be nonnegative.

A natural number is a list
An important degenerate case of lists are natural numbers. A list of () is really a natural number (its length). This is unary numeral system.

foldl :: (a -> b -> a) -> a -> [b] -> a

If we take b = (), we get

foldl' :: (a -> a) -> a -> Integer -> a

This is n-th power of a function.

Other resources
If you're interested in fold, check out these resources. They are really worth reading and will give you much insight and many intereresting ideas.

* A tutorial on the universality and expressiveness of fold
* The Monad.Reader 6
* Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire
* Proof methods for structured corecursive programs

Solution to the warmup exercise:

map f = foldr (\a x -> f x : a) []
compose = foldr (.) id
foldr f a xs = compose (map f xs) a

Here's another version of compose, by roconnor:

compose l x = foldr ($) x l