# My First GADT

## GADT tutorial part 1 of 3

GADTs in OCaml… You probably don’t need them, but –who knows– maybe one day you will. Also they can make for fun puzzles to figure out. So here’s a small tutorial to get you started.

The primary use of GADTs is to delegate the maintenance of some invariants of your code to the OCaml compiler’s type checker. In other words, you can use GADTs to encode custom additional typing rules which relate to some aspect of your program’s logic (as opposed to the built-in typing rules which relate to the runtime safety of all programs).

## Prologue

A few notes before we start.

**You probably don’t need GADTs.** That’s because you
can use some other OCaml features to encode many of of the invariants
that you might care about. For example, you can use a module with a
small interface around an abstract or private type. You can also use
runtime checks and return `option`

or `Either.t`

values.

**What even are GADTs?** “GADT” stands for Generalised
Algebraic Data Type. I don’t know why it’s *Generalised*, but the
*ADT* part is essentially your sum types and product types. Sums
and products being the stuff of algebra, we have ADTs.

There are two syntaxes for sum types in OCaml, only the more recent
one can be used in GADTs. Here’s a comparison of the two syntaxes
for the `option`

type:

```
(* Classic, terse syntax *)
type 'a option =
| None
| Some of 'a
```

```
(* Modern, verbose, GADT-compatible syntax*)
type _ option =
| None : 'a option
| Some : 'a -> 'a option
```

There are two kinds of product types in OCaml: tuples and records. This tutorial uses a mix of them just like most of the code out there.

**This tutorial has several parts.** This first part
starts with a simple GADT definition and shows how to manipulate values
of that type. It doesn’t delve into anything too advanced.

A second part focuses on different techniques to use when making GADTs. A third part focuses on real-world examples.

**All the code is available to download and play with.**
Files are linked to at different points of the tutorial.

## My Very First GADT

For your very first GADT, you will make a type which is analogous to
`int list`

but with a type parameter indicating whether the
list is empty or not. In other words, you make a type where the possible
emptiness of a list (of integers) can be tracked by the type system.

First, declare some types which represent the property that you want to track.

```
type empty = Empty
type nonempty = NonEmpty
```

Then, declare a type `int_list`

with a single type
parameter which is instantiated with the property types.

```
type _ int_list =
| Nil : empty int_list
| Cons : int * _ int_list -> nonempty int_list
```

Note how each constructor (`Nil`

and `Cons`

)
instantiates the type parameter differently:

`Nil : empty int_list`

`Cons : … -> nonempty int_list`

Based on this information, the type checker is able to assign different property types for the parameter of different values.

```
let nil : empty int_list = Nil
let onetwo : nonempty int_list = Cons (1, Cons (2, Nil))
(* you don't need the type annotations,
the compiler can infer them. *)
```

That’s it, your very first GADT.

## Using the GADT: a specialised function

Because you can enforce emptiness/fullness, you can write exception
free version of, say, `Stdlib.List.hd`

:

```
let hd
: nonempty int_list -> int
= fun (Cons (x, _)) -> x
```

This function’s type restrict the input to only
`nonempty int_list`

. This has two effects: First, it prevents
callers from passing an `empty int_list`

value. In other
words, you encoded an additional rule for the type system to track.
Second, it allows you to only consider the `Cons`

constructor
within the body of the function.

You can try to break a few things: call `hd Nil`

, add a
`Nil`

case to the function definition, change the type
declaration of the function. Try it!

## Using the GADT: a generic function

You can also write functions that are generic and accept both
`empty`

and `nonempty`

parameters.

```
let is_empty
: type e. e int_list -> bool
= function
| Nil -> true
| Cons _ -> false
```

Note the `type e.`

prefix inside the type annotation of
the function. This is needed because the type parameter `e`

is different in different parts of the function. Without it the compiler
gives an error. Try it!

You can also write generic recursive functions. The type annotation uses the same prefix for the same reason.

```
let rec iter
: type e. (int -> unit) -> e int_list -> unit
= fun f l ->
match l with
| Nil -> ()
| Cons (x, xs) -> f x; iter f xs
```

Try to define a few useful functions for this type or check out the
file `veryfirst.ml`

.

## Using the GADT: existential constructor

You can write generic functions, but you cannot create a generic data-structure which hosts values of either property indiscriminately. If you try, you get an error message. Try the following value declaration:

```
(* broken code: type error *)
let kvs = [
("a", Nil);
("b", Cons (0, Nil));
("c", Cons (1, Cons (4, Nil)));
]
```

The error makes sense because the type of this value would be
`(string * _ int_list) List.t`

and what does that
`_`

even is in this context? Another way to think about the
error is to think about the return type for the following
function:

`let find_value key = List.assoc key kvs`

When you need to store values with different property parameters into
a single data-structure, you need to introduce an existential
constructor. It’s a new type with a new constructor which wraps the GADT
so that the property type parameter disappears. By convention, it’s
often call `Any`

(because it wraps *any* of the GADT
value) or `E`

(for “exists”).

`type any_int_list = Any : _ int_list -> any_int_list`

You can use it to host values with different property parameters into a single data-structure.

```
let kvs = [
("a", Any Nil);
("b", Any (Cons (0, Nil)));
("c", Any (Cons (1, Cons (4, Nil))));
]
```

You can recover the wrapped/hidden type parameter locally by matching:

```
let iter_any f xs =
match xs with
| Any Nil -> ()
| Any (Cons _ as xs) ->
(* in this branch only, [xs] has type [nonempty int_list] *)
iter f xs
```

## Kinda useless

This GADT is sort of useless. Sure you can write exception-less
versions of `List.hd`

and such, but it’s not practical.

The main issue is that `int_list`

is a new type: you
cannot use functions from the `Stdlib`

or any existing
library. You have to rewrite everything you need yourself. (You could
convert to `int Seq.t`

but then you lose the type
information.)

Note that because of the didactic intent behind
`int_list`

, I cut some corners in making it. It can be made
somewhat practical. But as is it is not.

A better alternative is to use a private type alias. See `altveryfirst.ml`

for an example.

## Part 2

Now that you know the basics of how to define and use a GADT, check
out the next part of this tutorial:

GADT Tips and Tricks.