monoidal

Tuesday, January 29, 2013

On safety and abstraction

Haskell does not have full-fledged dependent types. For example, there is no way to create naturals as a subset of integers. One way to sidestep this restriction is creating an abstract datatype. Define

newtype Natural = N Integer

and add a collection of methods (addition, printing etc.). Clients can interact with this type only via these methods. They cannot construct a pathological value such as N (-1), which could violate some invariants and crash the program.

Furthermore, we can change the implementation later on (for example, to data Natural = Zero | Succ Natural) and the clients will not notice, as long as we do not change the protocol.

It seems that abstract datatypes give us two things at once: safety from invalid values and modularity. I think this is a bad conflation. These two concerns should be separated.

Subsets

Without support of dependent types we can at least label where they should be. Define

newtype Subset a = Subset a

fromSubset :: Subset a -> a
fromSubset (Subset a) = a

unsafeToSubset :: a -> Subset a
unsafeToSubset = Subset

and change the definition of naturals to

newtype Natural = N (Subset Integer)

Here Subset a is an unspecified, context-dependent, subset of type a.

This way safety and modularity is separated. If you match on N it means that you break the module abstraction: your code depends on the implementation details. If, furthermore, you use Subset, you have to guarantee you're not constructing pathological values.

The same thing happens in restricted IO from GHC's manual, which can be defined with:

newtype RIO a = RIO (Subset (IO a))

Quotients

In naturals, unconstrained use of constructor N might lead to nonsensical objects. Sometimes there's a dual situation: unconstrained use of pattern matching might lead to invalid results.
For example, rationals are pairs of integers p/q, but we want 1/2 and 2/4 to be indistinguishable - the "numerator" operation breaks abstraction. In this post I talked about this phenomenon with functions represented by algorithms.

newtype Quotient a = Quotient a

toQuotient :: a -> Quotient a
toQuotient = Quotient

unsafeFromQuotient :: Quotient a -> a
unsafeFromQuotient (Quotient a) = a

Note that the direction of functions is reversed. Going from an object to quotient is well-defined, but when you go the other way you have to guarantee consistent behavior for equivalent answers.

Free groups (see sigfpe's post) can be defined with
newtype FreeGroup x = FreeGroup (Quotient [Either x x])
which means that an element of a free group is represented by a list, but with some identifications: [Right a, Left a] is equivalent to [] since aa^-1=1.

What about rationals? An explicit version of
data Rational = Rational Integer Integer
is
data Rational = Rational (Quotient (Integer, Subset Integer))

Here there are both mechanisms: equivalent pairs must be equal, and the denominator must be nonzero. Both constructing and deconstructing rationals is unsafe in full generality; the type tells you exactly in what manner.

Final thoughts

In my opinion marking quotients and subsets should become a custom, even in absence of dependent types. It makes things explicit where they should be.

If you think those newtypes are littering too much, you can at least make them transparent with

type Subset a = a
type Quotient a = a

and let your type definitions speak more.

Finally, while both Subset a and Quotient a are ambiguous (they do not specify the subset or the partition), you can make an indication with a phantom tag:

data NonNegative
newtype Subset' p a = Subset' a
type Natural = Subset' NonNegative Integer

but handling tags gets troublesome rather fast (there is no automatic way to reason on them).

Monday, September 17, 2012

A kind for regions

With a new powerful kind system in GHC, there are many lurking opportunities for expressive programming. Here's one.

What is the kind of the ST type constructor? Currently, it's ST :: * -> * -> *.

It should not be. In ST s a, the parameter s is special. It is not meant to be a "real" type like Int. It is used only to control the scope of references. One day I'd like to see a separate kind for s.

kind Region
ST :: Region -> * -> *
STRef :: Region -> * -> *
runST :: (forall (s :: Region). ST s a) -> a

This way it's clear that in ST s a the types s and a are of different nature. Attempting to write ST s s is a kind error.

Thursday, May 24, 2012

A toy version of Safe Haskell

The idea of Safe Haskell is to control unsafe operations. Here is an alternative idea of implementation. Define a class

> {-# LANGUAGE FlexibleContexts, Rank2Types #-}
> class Unsafe a where

The "a" parameter is unneccessary, but it is required by GHC.

Add the unsafe constraint to each operation.

> unsafePerformIO :: Unsafe () => IO a -> a
> unsafePerformIO = undefined  -- implementation skipped

> unsafeCoerce :: Unsafe () => a -> b
> unsafeCoerce = undefined

That's it. You can now code with unsafe functions. The constraint "Unsafe" is contagious: everything that internally uses unsafe* will be tagged. Old code does not have to be changed except type signatures.

The compiler only has to prohibit users from defining instance Unsafe (). If you do this, you are releasing the brakes, and back to normal Unsafe Haskell.

Safe Haskell has "trustworthy" modules, which can use unsafe features, but their authors claim their interfaces are safe.

This idea can be implemented if there was a special function allowed in trustworthy modules removing the constraint:

> unsafe :: forall a. (Unsafe () => a) -> a
> unsafe = undefined

Of course this falls short of real Safe Haskell, which must forbid Template Haskell, overlapping instances etc. I like the fact that unsafeness of unsafePerformIO is expressed in its type.

I hope this demonstrates zero-parameter type classes might be something reasonable. For now, you can simulate them with -XConstraintKinds and -XDataKinds:

{-# LANGUAGE ConstraintKinds, KindSignatures, DataKinds #-} class Unsafe' (a :: ()) type Unsafe = Unsafe' '()

Edit:

I just realized you can have unsafe instances:

instance (Unsafe (), Show a) => Show (IORef a) where                          
  show = show . unsafePerformIO . readIORef

Monday, April 16, 2012

Descending the level ladder

The following is a way to compute 2+1 using the GHC type system. The ":kind!" command tells GHCi to display normalized version of a type.

{-# LANGUAGE TypeFamilies #-}
data Zero
data Succ a

type family Add x y
type instance Add Zero x = x
type instance Add (Succ x) y = Succ (Add x y)

*Main> :kind! Add (Succ (Succ Zero)) (Succ Zero)
Add (Succ (Succ Zero)) (Succ Zero) :: *
= Succ (Succ (Succ Zero))

Notice we did not use Haskell values anywhere. Not even undefined. This way of computing 2+1 uses types only.

Imagine you are a malicious professor teaching Haskell to freshmen. You tell them that this is the normal way to compute 2+1 in Haskell. The students later code Fibonacci, multiplication etc. They get a Turing complete language.

They have no idea that Haskell has what we call values. To them, "values" are things like Succ Zero, what we call "types". They think Haskell has a type system, that prohibits nonsense such as Zero (Succ Succ). In reality, that type system is kinds.

Later, the students discover the truth; their values are really types, and their types are really kinds.

In this post I will argue we might live in such shifted world as well. There is a layer below Haskell values.

What is it? By Curry-Howard, we can think of values as proofs for types. For example, the identity function is a proof that p -> p. Analogously, we can think of "undervalues" as proofs for values. So I will call this level "proofs". We have

kinds > types > values > proofs

where A>B means that A is a type system for B.

One way to see "undervalues" is to check what are values in above scenario. If we add to the initial code the datatype

data Equal a b where
  Eq :: Equal a a

then a value of type

com :: forall (m :: *) (n :: *). Equal (Add m n) (Add n m)

could work as a proof that addition is commutative.

So, in the shifted scenario, we have a kind * (a poor man's Naturals), types Zero and Succ, and a value of type Equal (Add m n) (Add n m).

Shifting down, we have a type Natural, values Zero and Succ, and a proof of value m + n == n + m, which cannot be seen in normal Haskell.

data Natural = Zero | Succ Natural 

add :: Natural -> Natural -> Natural
add Zero x = x
add (Succ x) y = Succ (add x y)

*Main> add (Succ (Succ Zero)) (Succ Zero)
Succ (Succ (Succ Zero))

-- com :: forall (m :: Natural) (n :: Natural). add m n == add n m
-- cannot be written

Intuition
When you write a type T in Haskell, you can check its correctness from above:

Is there a K such that type(T)=K?
(i.e. Is there a valid kind for T?)

but you can also check it from below:

Is there a V such that type(V)=T?
(i.e. Is there a valid value for T? Is T inhabited?)

Analogously, when you write a value V in Haskell, you can check its correctness from above:

Is there a T such that type(V)=T?
(i.e. Is there a valid type for V?)

but you can also check it from below:

Is there a P such that type(P)=V?
(i.e. Is there a valid proof for V? Is V inhabited?)

Checking from above corresponds to checking syntax, and checking from below to checking semantics.

The type Equal Int Bool is not invalid. You can use it in type-level computation. But it is uninhabited, and this is a sign it might not make sense. The same thing happens when you write an instance of Monoid which is not a monoid. You used an uninhabited value. Since syntax is OK, you can run your code. But you will run into trouble, since semantics is wrong. There is no proof of your monoid instance, just like there is no value of the type Equal Int Bool.

Encoding equational reasoning

A lot of functional literature contains proofs. Here is an example of what I mean - product of two associative operations is associative. Suppose (a,b)+(c,d)=(a+b,c+d). (The + symbol is overloaded and means 3 different things, beware.)

(x + y) + z =
             [use x = (fst x, snd x)]
((fst x, snd x) + (fst y, snd y)) + (fst z, snd z) =
             [Definition of product operation]
((fst x + fst y, snd x + snd y) + (fst z, snd z) =
             [Definition of product operation]
((fst x + fst y) + fst z, (snd x + snd y) + snd z) =
             [Associativity of + and +]
(fst x + (fst y + fst z), snd x + (snd y + snd z)) =
             [Definition of product operation]
(fst x, snd x) + (fst y + fst z, snd y + snd z) =
             [Definition of product operation]
(fst x, snd x) + ((fst y, snd y) + (fst z, snd z)) =
             [use (fst x, snd x) = x]
x + (y + z)

This should remind you of function composition:

    f     g     h
 A --> B --> C --> D

but on a lower level: types A,B,C,D become values, and values f,g,h become proofs.

It has always bugged me that such proofs are not easily executable in Haskell despite their routine feel.

This is because we cannot access the proof level. We can attempt to do so, by shifting everything up. I will adapt the proof and show that the product of two monoids is a monoid.

We start by encoding values as types. For simplicity, we will not transfer their types to complicated kinds.

The (:=) datatype encodes reasoning rules for equality.

{-# LANGUAGE GADTs, KindSignatures, RankNTypes, TypeOperators, TypeFamilies,
             ScopedTypeVariables #-}
data Fst x
data Snd x
data ProdOp (f :: * -> * -> *) (g :: * -> * -> *) x y

data a := b where 
  Refl :: a := a
  Sym :: a := b -> b := a
  Comp :: a := b -> b := c -> a := c
  Arg :: a := b -> c := d -> f a c := f b d
  PairC :: a := (Fst a, Snd a)
  DefProd :: ProdOp f g (a, b) (c, d) := (f a c, g b d)

Now, we can write proofs of associativity and left/right identity. The functions neutrLProd and neutrRProd take proofs that f,g have neutral elements m,n and show that their product has neutral element (m,n). The function assocProd does the same with associativity.

type NeutralL (+) e = forall x. e + x := x
neutrLProd :: NeutralL m e -> NeutralL n f -> NeutralL (ProdOp m n) (e,f)
neutrLProd e1 e2 =
                        -- (e,f) + x
   Arg Refl PairC
   `Comp`               -- (e,f) + (fst x, snd x)
   DefProd
   `Comp`               -- (e + fst x, f + snd x)
   Arg e1 e2
   `Comp`               -- (fst x, snd x)
   Sym PairC
                        -- x

type NeutralR (+) e = forall x. x + e := x
neutrRProd :: NeutralR m e -> NeutralR n f -> NeutralR (ProdOp m n) (e,f)
neutrRProd e1 e2 =
                        -- x + (e, f)
   Arg PairC Refl
   `Comp`               -- (fst x, snd x) + (e, f)
   DefProd
   `Comp`               -- (fst x + e, snd x + f)
   Arg e1 e2
   `Comp`               -- (fst x, snd x)
   Sym PairC
                        -- x

type Assoc (+) = forall x y z. ((x + y) + z) := (x + (y + z))
assocProd :: Assoc f -> Assoc g -> Assoc (ProdOp f g)
assocProd a1 a2 =
                        -- (x + y) + z
   Arg (Arg PairC PairC) PairC
   `Comp`               -- ((fst x, snd x) + (fst y, snd y)) + (fst z, snd z)
   Arg DefProd Refl
   `Comp`               -- (fst x + fst y, snd x + snd y) + (fst z, snd z)
   DefProd
   `Comp`               -- ((fst x + fst y) + fst z, (snd x + snd y) + snd z)
   Arg a1 a2
   `Comp`               -- (fst x + (fst y + fst z), snd x + (snd y + snd z))
   Sym DefProd
   `Comp`               -- (fst x, snd x) + (fst y + fst z, snd y + snd z)
   Arg Refl (Sym DefProd)
   `Comp`               -- (fst x, snd x) + ((fst y, snd y) + (fst z, snd z))
   Sym (Arg PairC (Arg PairC PairC))
                        -- x + (y + z)

(Exercise for the reader: Write the proof that product of two commutative operations is commutative.)

This is the normal monoid type class in Haskell, with unenforceable laws:

class Monoid m where
  (<>) :: m -> m -> m
  mempty :: m
  -- law (a <> b) <> c == a <> (b <> c)
  -- law mempty <> a == a
  -- law a <> mempty == a

Now we shift everything: types become kinds, values become types, and proofs become values. Since we do not have kind classes yet, we will simulate them with an additional parameter. (We are doing this trick, only on kind level.)

data Proxy i = Proxy

class Monoid' i where
  type Mappend i :: * -> * -> *
  type Mempty i :: *
  assoc :: Proxy i -> Assoc (Mappend i)
  neutrL :: Proxy i -> NeutralL (Mappend i) (Mempty i)
  neutrR :: Proxy i -> NeutralR (Mappend i) (Mempty i)

instance (Monoid' m, Monoid' n) => Monoid' (m,n) where
  type Mappend (m,n) = ProdOp (Mappend m) (Mappend n)
  type Mempty (m,n) = (Mempty m, Mempty n)

  assoc (Proxy :: Proxy (m,n)) = assocProd (assoc (Proxy :: Proxy m))
                                           (assoc (Proxy :: Proxy n))
  neutrL (Proxy :: Proxy (m,n)) = neutrLProd (neutrL (Proxy :: Proxy m))
                                             (neutrL (Proxy :: Proxy n))
  neutrR (Proxy :: Proxy (m,n)) = neutrRProd (neutrR (Proxy :: Proxy m))
                                             (neutrR (Proxy :: Proxy n))

That Proxy is needed to avoid ambiguity errors in type families. If I could, I would write assoc :: Assoc (Mappend i), assoc = assocProd assoc assoc.

We can write a monoid and prove it satisfies the monoid laws. If the language had a support for proofs, we could do that on value level as well.

In category theory, equations sometimes change to morphisms:

A monoid is a type m with a product of type (+) :: m -> m -> m and a proof of value (a + b) + c = a + (b + c),
A monoidal category is a kind k with a product of kind (+) :: k -> k -> k and a value of type a + (b + c) -> (a + b) + c [and other things],
A monoid homomorphism between monoids m,n is a value of type m -> n with proofs of values f(m)*f(n)=f(m*n) and e=f(e);
An applicative functor between monoidal categories m, n is a type f :: m -> n with values of type f m * f n -> f (m * n) and () -> f ().

Interestingly, some languages support "universe polymorphism", which allows to unify such declarations.

Final thoughts
Equational reasoning is a very powerful tool for program transformations. GHC uses them to make strong optimizations. It is also the base of hlint. It can express many folklore ideas - free theorems/parametricity, stream fusion, fold properties etc. I hope some day those things will become an integral part of the language or standard library. When you are adding a rewrite rule to GHC or a suggestion to hlint, you could write a proof that the rewriting is a consequence of laws. When you are writing an instance of a type class, you can prove required laws.

A well-known Haskell saying is that if you think a lot about types, then the values will write themselves. In the same way, if you think a lot about values, their proofs should - in theory - be easy. Perhaps a tool like Djinn could find the proof automatically in limited cases.

Haskellers often write types, and leave values as undefined. In the same way, you could write values, and worry about supplying proofs later. So at least in theory, adding proofs should not be a revolution. Currently, whatever you write in Haskell, your code will be full of intangible assumptions and invisible undefined proofs!

So there is an invisible world - the world of proofs - hidden below values. Is there an even lower level below proofs? If you take proofs to be objects, then arrows between them could mean their transformations.

This is the path taken by dependently typed languages such as Agda. However, as far as I know dependent types are not a requirement for proofs: we have only added an additional layer, we do not allow types to depend on values.

Thursday, October 27, 2011

Kind polymorphism - a draft

I extended the post on kind polymorphism to Omega, and I'm rather satisfied - I captured 10 different senses of monoids, used level polymorphism and type functions on the way.

sumMonoid      :: Monoid Hask Int                 -- monoid
maybeMonad     :: Monoid EndHask Maybe            -- monad
envComonad     :: Monoid CoEndHask (Env s)        -- comonad
haskCat        :: Monoid Graph (->)               -- category
ixState        :: Monoid Indexed IxState          -- indexed monad
ixStore        :: Monoid CoIndexed IxStore        -- indexed comonad
intRing        :: Monoid Ab Int                   -- ring
sumCMonoid     :: Monoid Monoids Int              -- commutative monoid
complexAlgebra :: Monoid (VSpaces Float) Complex  -- R-algebra
EndHask''      :: Monoid' Cats (*0 ~> *0)         -- monoidal category

A draft is available here and Omega source code is available here. There are still things that can be improved and researched on.

Also, see a new paper on kind system: Giving Haskell a Promotion.

Wednesday, September 1, 2010

Algorithms and functions

Consider the following two functions:

f :: () -> ()
f = const ()

g :: () -> ()
g = id

Is there a way to write a function that tells them apart? Given an infinite loop, first will stop and the second will hang. But you can do better, give an exception and check if it was thrown:

data TestException = TestException deriving (Show, Typeable)

instance Exception TestException

test :: (() -> ()) -> Bool
test k = unsafePerformIO $ catch (k (throw TestException) `seq` return True)
                                 (\TestException -> return False)

Is this a safe trick?

On the one hand, it looks rather innocently: a function is given some box, and if it opens it, it's caught red-handed. Such function would hang given infinite loop. A function that doesn't touch the box, yet finishes, is const (). Such 'test' is a nice feature to have.

On the other hand, something is wrong, since f is more defined than g, and test f is not at least as defined as test g. This contradicts monotonicity. By giving two exceptions to (+), you can check which argument is evaluated first:

throw A + throw B

That means flip (+) is not the same as (+). Addition is not commutative!

Which of these points is correct?

The representation of a->b
Internally, the (->) type is a list of instructions - force a thunk, case, perform a primitive instruction like adding integers. In other words, it's an algorithm. You could conceivably write an algebraic datatype that encompassed all those options.

data Instruction a b = Force Thunk | Return Value...
type Algorithm a b = [Instruction a b]

Haskell has an evaluator that takes an algorithm and runs it step by step.

($) :: Algorithm a b -> a -> b

Having access to internal source code of an algorithm, you can write a "debugger" that stops if the function forces its argument. In a sense, this is what the function test is doing.

Lazy IO uses this trick. When a thunk is forced evaluation is stopped momentarily, and IO is performed.

Still, the denotational semantics argument seems disturbing.

The distinction

The solution to the dilemma is:

There are two different ways of interpreting values of type a -> b.

functions that assign a value of b to each value of a.

algorithms that are recipes how to turn a into b.

In Haskell, the function view is used, but in this post I'll use both to illustrate differences. I'll call the algorithm view "operational" and function view "denotational".

Representing algorithms is possible using ADT, as you seen above. Functions are represented using algorithms:

data Function a b = Function (Algorithm a b)

You can turn an algorithm into a function:

evaluate :: Algorithm a b -> Function a b
evaluate = Function

but the reverse conversion is ill-defined - one function has many representations. evaluate turns an algorithm into a "black box".

Think about rational numbers. You represent them as pairs of integers, even though a rational number is not a pair. Then, you write operations like addition, which don't break 'internal consistency'. Not every operation on pairs can be lifted to operation on rationals. Different pairs may represent same rationals. It's the same with functions stored as algorithms.

Questions:
The questions focus on differences between algorithms and functions.

1. Can you compare values of type a -> b for equality?
Answer

You can compare algorithms, since they are lists of instructions. You cannot in general compare functions because of the halting problem.

For example, the following two are different algorithms:

mergesort :: Ord a => [a] -> [a]
insertionsort :: Ord a => [a] -> [a]

But they are the same functions: evaluate mergesort is the same as evaluate insertionsort. In denotational view asymptotic complexity is discounted.

In operational view, \x -> 4 and \x -> 2+2 are two different values. Equational reasoning is not allowed, since there's "quoting" after lambda.

2. How would you show an algorithm? A function?
Answer

3. How would you read an algorithm? A function?
Answer

4. What abstract models of computation correspond to algorithms and functions?
Answer

5. How does semantic order on values a -> b look like?
Answer

6. Is the following function

f :: (() -> ()) -> Bool
"f g = [return True if evaluation of g () halts within 100 evaluation steps]"

well-defined?

Answer

7. What about this?

lub :: (Bool -> ()) -> ()
lub f = [return () if evaluation of f False or f True halts]

Answer

8. Since a->b in Haskell is seen as function type, not algorithm type, anything dependent of "internals" is ambiguous. How does Haskell deal with it?

Answer

By using the IO monad.

For example, catching exceptions is allowed only in IO, since they can be used to inspect evaluation order. Another example is:

Control.Concurrent.mergeIO :: [a] -> [a] -> IO [a]

You can deem lazy I/O as safe, if you take the 'algorithm' view. In the denotational view, lazy IO doesn't make sense unless you consider IO as nondeterministic.

In my opinion this is not a good approach, and it might be better if Haskell had separate types for computations and values. (A computation can be thought as an algorithm returning a value. For example, 2+2 and 4 are two different computations.)

Summary
In most languages, 'functions' are algorithms. In Haskell, the emphasis is on functions as in mathematics and referential transparency.

Since Haskell is running on a computer, operational semantics (algorithms) describe how things work on the lower level and you can't do away with them. Things like "debugging", "unsafePerformIO", "measuring time/space complexity of a function", "order of evaluation" are relevant to operational semantics and break the abstraction. Things like "referential transparency" or "functional reactive programming" are valid for functions.

I think this is what makes Haskell different from imperative languages. FP is about functions, not algorithms. This has advantages, like better support for composability - and disadvantages, like more difficult time/space reasoning.

Thursday, July 29, 2010

The universal space

Take (or imagine) a blank sheet of paper. This is a plane. You can put points and vectors on it. A vector connects two points, but it is "movable": if you can translate one vector into another, they are deemed equal. A vector doesn't really have a beginning - you can position it wherever you want.

In this setting, some operations make sense, like

vector + point = point:
Position the vector at a point, and its end will be the result.

point - point = vector:
Join the points to get a vector from A to B. (It's inverse of vector + point = point.)

vector + vector = vector:
Put the beginning of one vector to the end of the second.

But some don't, like adding two points, or subtracting vector - point. With added origin, you can do anything using coordinates, as in (2,1) + (3,0) = (5,1). In fact, with given origin you can consider every point as a vector from the origin. But without it, it's like physical motion - you cannot tell if something is moving without a frame of reference.

Although addition point + point doesn't make sense, (point + point) / 2 should - it's middle between the points. 0.2 point + 0.8 point is somewhere in the 4/5 of the segment. This is known as "affine combination".

An expression like vector + point - point can be computed in two different ways: (vector + point) - point and vector + (point - point) and it turns out the result will be the same. There must be some structure here.

Define mass of a vector as 0, and mass of point as 1. Then,

vector + point = point: 0 + 1 = 1

point - point = vector: 1 - 1 = 0

vector + vector = vector: 0 + 0 = 0

(point + point) / 2 = point: (1+1) / 2 = 1

In fact you can enlarge the space. Define an object to be either a vector with mass 0, or a pair (p,m) which is a point p with a nonzero mass m. I'll write a point as m*p. Then multiplication streches a vector, or changes a point's mass: a*(m*p) = (a*m)*p. Addition is done with:

m p + v = m (p + \frac{v}{m})

- translation

m_{1}p_{1} + m_{2}p_{2} = (m_{1}+m_{2}) (\frac{m_{1}}{m_{1}+m_{2}} p_{1} + \frac{m_{2}}{m_{1}+m_{2}} p_{2})

- weighted mean of points

m_{1}p_{1} - m_{1} p_{2} = m_{1} (p_{1} - p_2})

- a vector joining two points

In this space, points and vectors are equal partners, and anything can be added and multiplied by a number. It's a vector space with dimension one more than the original.

In the original space, you assign a meaning to a sum like 0.2a + 0.8b, but not to the summands. In this space, summands make sense and 0.2a + 0.8b is really a sum of 0.2a and 0.8b.

In mathematical terms, this is an assignment of a vector space to every affine space. This construction is known as the universal space. Now, vector spaces and affine spaces have their own meaning of functions. A linear map respects linear combinations: f(cx+dy) = c f(x)+d f(y). An affine map is something that respects affine combinations: f(cx+dy) = cf(x)+ df(y), where c+d=1. It's easy to present a linear map using a matrix. How to present an affine map? The answer is, given affine map, you can extend it to a linear map between universal spaces by f(m*p)=m*f(p) for all m. And that gives a matrix.

For category theory lovers, this construction is the left adjoint of a functor that assigns an affine space to a vector space by forgetting the origin.