Monday, April 16, 2012

Descending the level ladder

The following is a way to compute 2+1 using the GHC type system. The ":kind!" command tells GHCi to display normalized version of a type.
{-# LANGUAGE TypeFamilies #-}
data Zero
data Succ a

type family Add x y
type instance Add Zero x = x
type instance Add (Succ x) y = Succ (Add x y)

*Main> :kind! Add (Succ (Succ Zero)) (Succ Zero)
Add (Succ (Succ Zero)) (Succ Zero) :: *
= Succ (Succ (Succ Zero))

Notice we did not use Haskell values anywhere. Not even undefined. This way of computing 2+1 uses types only.

Imagine you are a malicious professor teaching Haskell to freshmen. You tell them that this is the normal way to compute 2+1 in Haskell. The students later code Fibonacci, multiplication etc. They get a Turing complete language.

They have no idea that Haskell has what we call values. To them, "values" are things like Succ Zero, what we call "types". They think Haskell has a type system, that prohibits nonsense such as Zero (Succ Succ). In reality, that type system is kinds.

Later, the students discover the truth; their values are really types, and their types are really kinds.

In this post I will argue we might live in such shifted world as well. There is a layer below Haskell values.

What is it? By Curry-Howard, we can think of values as proofs for types. For example, the identity function is a proof that p -> p. Analogously, we can think of "undervalues" as proofs for values. So I will call this level "proofs". We have
kinds > types > values > proofs
where A>B means that A is a type system for B.

One way to see "undervalues" is to check what are values in above scenario. If we add to the initial code the datatype
data Equal a b where
  Eq :: Equal a a
then a value of type
com :: forall (m :: *) (n :: *). Equal (Add m n) (Add n m)
could work as a proof that addition is commutative.

So, in the shifted scenario, we have a kind * (a poor man's Naturals), types Zero and Succ, and a value of type Equal (Add m n) (Add n m).

Shifting down, we have a type Natural, values Zero and Succ, and a proof of value m + n == n + m, which cannot be seen in normal Haskell.

data Natural = Zero | Succ Natural 

add :: Natural -> Natural -> Natural
add Zero x = x
add (Succ x) y = Succ (add x y)

*Main> add (Succ (Succ Zero)) (Succ Zero)
Succ (Succ (Succ Zero))

-- com :: forall (m :: Natural) (n :: Natural). add m n == add n m
-- cannot be written

When you write a type T in Haskell, you can check its correctness from above:

Is there a K such that type(T)=K?
(i.e. Is there a valid kind for T?)

but you can also check it from below:

Is there a V such that type(V)=T?
(i.e. Is there a valid value for T? Is T inhabited?)

Analogously, when you write a value V in Haskell, you can check its correctness from above:

Is there a T such that type(V)=T?
(i.e. Is there a valid type for V?)

but you can also check it from below:

Is there a P such that type(P)=V?
(i.e. Is there a valid proof for V? Is V inhabited?)

Checking from above corresponds to checking syntax, and checking from below to checking semantics.

The type Equal Int Bool is not invalid. You can use it in type-level computation. But it is uninhabited, and this is a sign it might not make sense. The same thing happens when you write an instance of Monoid which is not a monoid. You used an uninhabited value. Since syntax is OK, you can run your code. But you will run into trouble, since semantics is wrong. There is no proof of your monoid instance, just like there is no value of the type Equal Int Bool.

Encoding equational reasoning

A lot of functional literature contains proofs. Here is an example of what I mean - product of two associative operations is associative. Suppose (a,b)+(c,d)=(a+b,c+d). (The + symbol is overloaded and means 3 different things, beware.)

(x + y) + z =
             [use x = (fst x, snd x)]
((fst x, snd x) + (fst y, snd y)) + (fst z, snd z) =
             [Definition of product operation]
((fst x + fst y, snd x + snd y) + (fst z, snd z) =
             [Definition of product operation]
((fst x + fst y) + fst z, (snd x + snd y) + snd z) =
             [Associativity of + and +]
(fst x + (fst y + fst z), snd x + (snd y + snd z)) =
             [Definition of product operation]
(fst x, snd x) + (fst y + fst z, snd y + snd z) =
             [Definition of product operation]
(fst x, snd x) + ((fst y, snd y) + (fst z, snd z)) =
             [use (fst x, snd x) = x]
x + (y + z)

This should remind you of function composition:
    f     g     h
 A --> B --> C --> D
but on a lower level: types A,B,C,D become values, and values f,g,h become proofs.

It has always bugged me that such proofs are not easily executable in Haskell despite their routine feel.

This is because we cannot access the proof level. We can attempt to do so, by shifting everything up. I will adapt the proof and show that the product of two monoids is a monoid.

We start by encoding values as types. For simplicity, we will not transfer their types to complicated kinds.

The (:=) datatype encodes reasoning rules for equality.

{-# LANGUAGE GADTs, KindSignatures, RankNTypes, TypeOperators, TypeFamilies,
             ScopedTypeVariables #-}
data Fst x
data Snd x
data ProdOp (f :: * -> * -> *) (g :: * -> * -> *) x y

data a := b where 
  Refl :: a := a
  Sym :: a := b -> b := a
  Comp :: a := b -> b := c -> a := c
  Arg :: a := b -> c := d -> f a c := f b d
  PairC :: a := (Fst a, Snd a)
  DefProd :: ProdOp f g (a, b) (c, d) := (f a c, g b d)

Now, we can write proofs of associativity and left/right identity. The functions neutrLProd and neutrRProd take proofs that f,g have neutral elements m,n and show that their product has neutral element (m,n). The function assocProd does the same with associativity.
type NeutralL (+) e = forall x. e + x := x
neutrLProd :: NeutralL m e -> NeutralL n f -> NeutralL (ProdOp m n) (e,f)
neutrLProd e1 e2 =
                        -- (e,f) + x
   Arg Refl PairC
   `Comp`               -- (e,f) + (fst x, snd x)
   `Comp`               -- (e + fst x, f + snd x)
   Arg e1 e2
   `Comp`               -- (fst x, snd x)
   Sym PairC
                        -- x

type NeutralR (+) e = forall x. x + e := x
neutrRProd :: NeutralR m e -> NeutralR n f -> NeutralR (ProdOp m n) (e,f)
neutrRProd e1 e2 =
                        -- x + (e, f)
   Arg PairC Refl
   `Comp`               -- (fst x, snd x) + (e, f)
   `Comp`               -- (fst x + e, snd x + f)
   Arg e1 e2
   `Comp`               -- (fst x, snd x)
   Sym PairC
                        -- x

type Assoc (+) = forall x y z. ((x + y) + z) := (x + (y + z))
assocProd :: Assoc f -> Assoc g -> Assoc (ProdOp f g)
assocProd a1 a2 =
                        -- (x + y) + z
   Arg (Arg PairC PairC) PairC
   `Comp`               -- ((fst x, snd x) + (fst y, snd y)) + (fst z, snd z)
   Arg DefProd Refl
   `Comp`               -- (fst x + fst y, snd x + snd y) + (fst z, snd z)
   `Comp`               -- ((fst x + fst y) + fst z, (snd x + snd y) + snd z)
   Arg a1 a2
   `Comp`               -- (fst x + (fst y + fst z), snd x + (snd y + snd z))
   Sym DefProd
   `Comp`               -- (fst x, snd x) + (fst y + fst z, snd y + snd z)
   Arg Refl (Sym DefProd)
   `Comp`               -- (fst x, snd x) + ((fst y, snd y) + (fst z, snd z))
   Sym (Arg PairC (Arg PairC PairC))
                        -- x + (y + z)

(Exercise for the reader: Write the proof that product of two commutative operations is commutative.)

This is the normal monoid type class in Haskell, with unenforceable laws:

class Monoid m where
  (<>) :: m -> m -> m
  mempty :: m
  -- law (a <> b) <> c == a <> (b <> c)
  -- law mempty <> a == a
  -- law a <> mempty == a

Now we shift everything: types become kinds, values become types, and proofs become values. Since we do not have kind classes yet, we will simulate them with an additional parameter. (We are doing this trick, only on kind level.)

data Proxy i = Proxy

class Monoid' i where
  type Mappend i :: * -> * -> *
  type Mempty i :: *
  assoc :: Proxy i -> Assoc (Mappend i)
  neutrL :: Proxy i -> NeutralL (Mappend i) (Mempty i)
  neutrR :: Proxy i -> NeutralR (Mappend i) (Mempty i)

instance (Monoid' m, Monoid' n) => Monoid' (m,n) where
  type Mappend (m,n) = ProdOp (Mappend m) (Mappend n)
  type Mempty (m,n) = (Mempty m, Mempty n)

  assoc (Proxy :: Proxy (m,n)) = assocProd (assoc (Proxy :: Proxy m))
                                           (assoc (Proxy :: Proxy n))
  neutrL (Proxy :: Proxy (m,n)) = neutrLProd (neutrL (Proxy :: Proxy m))
                                             (neutrL (Proxy :: Proxy n))
  neutrR (Proxy :: Proxy (m,n)) = neutrRProd (neutrR (Proxy :: Proxy m))
                                             (neutrR (Proxy :: Proxy n))

That Proxy is needed to avoid ambiguity errors in type families. If I could, I would write assoc :: Assoc (Mappend i), assoc = assocProd assoc assoc.

We can write a monoid and prove it satisfies the monoid laws. If the language had a support for proofs, we could do that on value level as well.

In category theory, equations sometimes change to morphisms:

  • A monoid is a type m with a product of type (+) :: m -> m -> m and a proof of value (a + b) + c = a + (b + c),
    A monoidal category is a kind k with a product of kind (+) :: k -> k -> k and a value of type a + (b + c) -> (a + b) + c [and other things],
  • A monoid homomorphism between monoids m,n is a value of type m -> n with proofs of values f(m)*f(n)=f(m*n) and e=f(e);
    An applicative functor between monoidal categories m, n is a type f :: m -> n with values of type f m * f n -> f (m * n) and () -> f ().

Interestingly, some languages support "universe polymorphism", which allows to unify such declarations.

Final thoughts
Equational reasoning is a very powerful tool for program transformations. GHC uses them to make strong optimizations. It is also the base of hlint. It can express many folklore ideas - free theorems/parametricity, stream fusion, fold properties etc. I hope some day those things will become an integral part of the language or standard library. When you are adding a rewrite rule to GHC or a suggestion to hlint, you could write a proof that the rewriting is a consequence of laws. When you are writing an instance of a type class, you can prove required laws.

A well-known Haskell saying is that if you think a lot about types, then the values will write themselves. In the same way, if you think a lot about values, their proofs should - in theory - be easy. Perhaps a tool like Djinn could find the proof automatically in limited cases.

Haskellers often write types, and leave values as undefined. In the same way, you could write values, and worry about supplying proofs later. So at least in theory, adding proofs should not be a revolution. Currently, whatever you write in Haskell, your code will be full of intangible assumptions and invisible undefined proofs!

So there is an invisible world - the world of proofs - hidden below values. Is there an even lower level below proofs? If you take proofs to be objects, then arrows between them could mean their transformations.

This is the path taken by dependently typed languages such as Agda. However, as far as I know dependent types are not a requirement for proofs: we have only added an additional layer, we do not allow types to depend on values.


Andrés Sicard-Ramírez said...

It is possible to write the term

com :: forall (m :: *) (n :: *). Equal (Add m n) (Add n m)


I don't see how to use induction in your scenario.

g said...

I agree, I think it is not possible. At the very least, * should be changed to a kind for natural numbers.