Composition and Monads

PPL 2023

We have introduced a taste of functional programming using TypeScript and identified the following practices:

Focus on expressions - avoid statements
Use immutable variables and data structures - avoid mutations
Use functions as the primary abstraction method
Use higher-order functions to define high-level patterns
Define types using set operations (union, product, intersection, disjoint union)
Define recursive data types on the basis of disjoint union of base type and recursive type
Define functions over types with the pattern that the structure of the function reflects the structure of the type

This unit introduces a new functional programming practice which focuses on composition. It defines the monad design pattern which inherited its name from Category Theory.

It summarizes material presented in the following introductory lectures:

Absolute Best Intro to Monads for Software Engineers (Video 25mn)
Bartosz Mikewski - Category Theory 3.2: Kleisli category (Video 1h10). This is a more mathematical description of the same material.
Scott Waschlin - The Power of Composition (Video 1h05)

The TypeScript code associated to this unit is in monads/.

Motivating Example: Logging

We have seen how to define complex types and how to compose types with OR and AND operators:

AND: tuple or array or map
OR: union (discriminated union) usually product types

Given complex values, the following scenario become obstables to “easy composition” of functions:

Functions with multiple parameters: for example: replace(oldValue: string, newValue: string, inputStr: string) => string - it is difficult to use this function inside a pipe of functions because the previous function would supply only one value and it expects three. In general, the solution to this obstacle is to use currying (define a function replace(oldValue)(newValue)(inputStr) combined with partial functions - functions where some of the parameters are fixed.
Functions with one input but multiple possible outputs (the return type is a union). In this case, the next function in the pipe must be able to deal with the multiple options of the union.
Different types of inputs - multiple possible outputs
Multiple inputs - One output
Multiple inputs - Multiple outputs

A general solution to these problems of composition is to wrap unions and products of values into complex types and to override the composition operator for each of these complex types. We will demonstrate this general approach over multiple examples, and then regroup to identify the commonality across these examples.

Consider simple full functions [number => number]:

const square = (x: number): number => x * x;
const inc = (x: number): number => x + 1;

It is easy to compose them anyway we want:

inc(square(2));

pipe(square, inc)(2);

Consider a new requirement: We want to trace the calls to all these functions. This may be needed for debugging a system in production or for auditing reasons.

We do not want to introduce the I/O operation as part of the computation to avoid breaking separation of concerns and introducing a global dependency on a logger which would make it difficult to run these functions in concurrent runtime environment (this would require serialization of the logging operations).

Instead, we settle for this protocol:

We wrap the result in a map and add a logs field
The logs field accumulates the logs as operations are performed.

inc(square(2)) --> 
{
	result: 5,
	logs: [
		"Squared 2 to get 4",
		"Added 1 to 4 to get 5"
	]
}

First Attempt: Not Composable

To support this approach, we introduce a wrapper type which keeps track of the logs list.

type NumberWithLogs = {
  result: number;
  logs: string[];
};

// First attempt:
const square1 = (x: number): NumberWithLogs => ({
  result: x * x,
  logs: [`Squared ${x} to get ${x * x}`],
});

// the input param is a wrapper which "remembers" what has already been performed.
// the operation concatenates a new line to the logs field
const inc1 = (x: NumberWithLogs): NumberWithLogs => ({
  result: x.result + 1,
  logs: [...x.logs, `Added 1 to ${x.result} to get ${x.result + 1}`],
});

// { result: 5, logs: [ 'Squared 2 to get 4', 'Added 1 to 4 to get 5' ] }
console.log(inc1(square1(2)));

We now have a problems these functions are not composable!

inc1(5) –> bad type
square1(square1(2)) –> bad type
square1(inc1(2)) –> bad type

In general - we observe that “uniform” function types [N => N] became “non-uniform” [N => NwL] and [NwL => NwL].

To resolve this lack of compasability - let us:

disentangle the construction and maintenance of the logs list from the computation.
align all operations to the same type

Aligning Function Types to Lifted Functions

Let’s first create a “constructor” for the new type (we will call this a wrapper).

// The wrapper moves the initial parameter into the "NumberWithLogs" domain
const wrapNumberWithLogs = (x: number): NumberWithLogs => ({
  result: x,
  logs: [],
});

We then align all the functions to the same shape [NumberWithLogs => NumberWithLogs]:

const square2 = (x: NumberWithLogs): NumberWithLogs => ({
  result: x.result * x.result,
  logs: [...x.logs, `Squared ${x.result} to get ${x.result * x.result}`],
});

// This mechanism is now composable when we use the wrapper where needed
console.log(square2(square2(wrapNumberWithLogs(2))));
console.log(inc1(wrapNumberWithLogs(5)));

pipe(wrapNumberWithLogs, inc1, square2)(5);

Graphically - we think of two planes:

the “plain type” (numbers in our example)
the “embellished type” (NumberWithLogs in our example)

The last pipe can be described as this route:

NumberWithLogs       NwL[5] -inc1-> NwL[6] --square2--> NwL[36]
                 /
              wrap
              /
Number          5

In this diagram:

wrap is a diagonal operator (from normal type to embellished type)
inc1 and square2 are lifted operators (instead of [number=>number] - they are [NumberWithLogs => NumberWithLogs]

Disentangling Logs from Functions Logic

When we observe the resulting code, we realize it contains repeated code in the lifted functions: We see in all functions of type [NumberWithLogs => NumberWithLogs] the same code will appear:

 logs: [...x.logs, `...abc`]

This repetition is a bad smell of something wrong - we want to abstract it away. We also want to fix the violation of “separation of concern”: the new lifted functions (inc1, square2) must “know about log concatenation” - we want the function to only know about how to compute (increment, square).

// =================================================
// Remove the duplicated code:

// Just an intermediary step towards a solution:
// Separate log concatenation logic from core of function
const square3 = (x: NumberWithLogs): NumberWithLogs => {
  // Code that is specific to 'square'
  const result = {
    result: x.result * x.result,
    logs: [`Squared ${x.result} to get ${x.result * x.result}`],
  };
  // Code is always the same for all [NwL => NwL] functions
  return {
    result: result.result,
    logs: [...x.logs, ...result.logs],
  };
};

Let us now “abstract away” the repeated code in a separate function. We will write the application of functions in the “NumberWithlogs domain” as:

// Invoke a function on a plain type value:
inc(5);

// Invoke a function on a NumberWithLogs value
runWithLogs(wrapWithLogs(5), inc);

In our intermediary version, we used inc1(wrapWithLogs(5)). Now, runWithLogs will deal with log concatenation and the function will only do the part that is specific to the transformation it computes.

// Infer the type of runWithLogs:
const runWithLogs = (
  x: NumberWithLogs,
  transform: (y: number) => NumberWithLogs
) => {
  // transform is the "parametric" transformation
  // different for each function
  // It is a diagonal operator
  const newNumberWithLogs = transform(x.result);
  // The constant part is abstacted here
  return {
    result: newNumberWithLogs.result,
    logs: [...x.logs, ...newNumberWithLogs.logs],
  };
};

The signature of the “transform” functions is simplified:

Take a simple number as parameter
Return a NumberWithLogs with a single log message

Such transform functions all have the following structure:

Diagonal: from “Normal type” (number) to “Embellished Type” (NumberWithLogs)
They do not “know” about how to “compute” the embellished type (concatenate logs)

const square4 = (x: number): NumberWithLogs => ({
  result: x * x,
  logs: [`Squared ${x} to get ${x * x}`],
});

const inc4 = (x: number): NumberWithLogs => ({
  result: x + 1,
  logs: [`Added 1 to ${x} to get ${x + 1}`],
});

// Usage: we can now combine the calls in any combination
//        we only need to start with a "wrapped" value in the embellished type
const a = wrapNumberWithLogs(5);
const b = runWithLogs(a, inc4);
const c = runWithLogs(b, square4);
const d = runWithLogs(c, square4);
console.log(d);
/*
{
  result: 1296,
  logs: [
    'Added 1 to 5 to get 6',
    'Squared 6 to get 36',
    'Squared 36 to get 1296'
  ]
}
*/

If we want to use pipe, we realize the regular pipe function does not fit:

NumberWithLogs       NwL[5]  NwL[6]  NwL[36] NwL[1296]
                     /   |   /  |   /    |   /
                   wrap  | inc  | square | square
                   /     |/     |/       |/
number            5      5      6       36

If we want to “connect” diagonal operators, we must add the logic of the runWithLogs operator to consume the value returned by a function and pass it to the next. a new version of pipe is needed - that knows about the logic of this type with its running protocol.

Let us define pipeWithLogs which combines a sequence of diagonal operators and “overrides” the composition operator by using runWithLogs:

const isEmpty = <T>(l: T[]): boolean => l.length === 0;
// @Precondition: l is non-empty
const first = <T>(l: T[]): T => l[0];
const rest = <T>(l: T[]): T[] => l.slice(1);

const pipeWithLogs = (
  ...funcs: ((x: number) => NumberWithLogs)[]
): ((x: number) => NumberWithLogs) =>
  isEmpty(funcs)
    ? wrapNumberWithLogs
    : (x: number) => runWithLogs(first(funcs)(x), pipeWithLogs(...rest(funcs)));

const e = pipeWithLogs(inc4, square4, square4)(5);
console.log(e);

Generalization: The Monad Design Pattern

The design pattern we just described is called a Monad. Its aim is to facilitate composition of complex functions.

Monads have 3 components:

Wrapper Type (in our example NumberWithLogs)
Wrap Function (in our example wrapNumberWithLogs)
Run function: runs a diagonal transformation on a monadic value (runWithLogs)

To be convinced of the generality of this approach, let us examine another well known monad: Option.

The Option Monad

Option is used to encapsulate the fact that a parameter may have a concrete value or be undefined. For example:

x: number
y: Option<number> can be a number OR nothing
Option<User> can be a User OR nothing

We implement Option with the disjoint union pattern:

type Option<T> = Some<T> | None;
type None = { tag: "none" };
type Some<T> = { tag: "some"; value: T };

// Type predicate
const isSome = <T>(x: any): x is Some<T> => x.tag === "some";
const some = <T>(x: T): Option<T> => ({ value: x, tag: "some" });
const none = (): None => ({ tag: "none" });
const isNone = <T>(x: any): x is None => x.tag === "none";

// Wrap factory
const wrap = <T>(x: T | undefined): Option<T> =>
  x === undefined ? none() : some(x);

// Override application operator for Option
const bind = <T1, T2>(
  input: Option<T1>,
  transform: (x: T1) => Option<T2>
): Option<T2> => (isSome(input) ? transform(input.value) : input);

// Override composition operator for Option - composition uses bind
const pipeOption2 =
  <T1, T2, T3>(f1: (x: T1) => Option<T2>, f2: (x: T2) => Option<T3>) =>
  (x: T1) =>
    bind(f1(x), f2);

const pipeOption3 =
  <T1, T2, T3, T4>(
    f1: (x: T1) => Option<T2>,
    f2: (x: T2) => Option<T3>,
    f3: (x: T3) => Option<T4>
  ) =>
  (x: T1) =>
    bind(bind(f1(x), f2), f3);

Example Code Without Option

Let us hypothesize a common situation: we invoke an API call that returns information about the current user, which contains an optional field of type Pet, which also contains an optional nickName field. All the functions could return an “undefined” value when the object is missing.

type User = { name: string; pet: Pet | undefined };
type Pet = { nickName: string | undefined };

const getCurrentUser1 = (): User | undefined => ({
  name: "Michael",
  pet: { nickName: "doggy" },
});

const getPet1 = (user: User): Pet | undefined => user.pet;

const getNickName1 = (pet: Pet): string | undefined => pet.nickName;

Such functions are difficult to compose because they return a union of two incompatible values (true value or undefined). We need to add “guards” before passing the return value to another function. This leads to the usage of many if statements.

We would like to write a simple composition:

getNickName1(getPet1(getCurrentUser1()));
// or:
pipe(getCurrentUser1, getPet1, getNickName1)();

Instead we need to write a complex functions with many if statements:

const getPetNickName1 = (): string | undefined => {
  const user: User | undefined = getCurrentUser1();
  if (user === undefined) return undefined;

  const userPet: Pet | undefined = getPet1(user);
  if (userPet === undefined) return undefined;

  const userPetNickName: string | undefined = getNickName1(userPet);
  if (userPetNickName === undefined) return undefined;
  return userPetNickName;
};

console.log(getPetNickName1());

Code With Option

Let us now encapsulate all the cases of potential undefined use with the Option monad.

const getCurrentUser2 = (): Option<User> =>
  some({ name: "Michael", pet: { nickName: "doggy" } });

// No mention of "undefined" - no if
const getPetNickName2 = (): Option<string> => {
  const user: Option<User> = getCurrentUser2();
  const userPet: Option<Pet> = bind(user, (user: User) => wrap(user.pet));
  const userPetNickName = bind(userPet, (pet: Pet) => wrap(pet.nickName));
  return userPetNickName;
};

console.log(getPetNickName2());

This function hides all the potential undefined cases inside the bind operator of the Option monad.

We could compose the bind calls to highlight the composition nature of the code:

const getPetNickName3 = (): Option<string> =>
  bind(getCurrentUser2(), (user: User) =>
    bind(wrap(user.pet), (pet: Pet) => wrap(pet.nickName))
  );

console.log(getPetNickName3());

Code With “pipe functional composition”

Operators that can be composed in a bind chain must have type T1 => Option. They are "diagonal" from "normal types" to "option types". The resulting composition is also "diagonal".

We define a pipeOption composition function which overrides function composition - and instead of compute f(g(x)) (as the normal pipe does), will use bind(g(x), f).

const getPet2 = (user: User): Option<Pet> => wrap(user.pet);

const getNickName2 = (pet: Pet): Option<string> => wrap(pet.nickName);

// Can be composed as:
const getPetNickName4 = pipeOption(getCurrentUser2, getPet2, getNickName2);

console.log(getPetNickName4(undefined));
// => { value: 'doggy', tag: 'some' }

The List Monad

So far we have described two distinct monads:

NumberWithLogs: encapsulate logs manipulation around numeric computations
Option: encapsulate the possibility that computations may return a real value or an undefined value.

We now describe the well-known List container as a monad - that is, using the terminology of a monadic container type, a wrap constructor and a bind application operator. Based on these definitions, we also created a monad-specific pipe operator for generalized composition of diagonal operators.

We will then extend the general interface of monads to include the map operator and fold operators which are general to all monads.

// Observe the similarity of the functions between two monads: Option and List

// Recursive Type definition
export type List<T> = Empty | NonEmptyList<T>;
export type Empty = [];
export type NonEmptyList<T> = [T, ...Array<T>];

// Type predicates
export const isEmpty = <T>(x: Array<T>): x is Empty => x.length === 0;
export const isNonEmpty = <T>(x: Array<T>): x is NonEmptyList<T> => x.length > 0;

// Type accessors
export const first = <T>(x: NonEmptyList<T>): T => x[0];
export const rest = <T>(x: NonEmptyList<T>): List<T> => <List<T>>x.slice(1);

// Wrapper
export const wrapList = <T>(x: T): NonEmptyList<T> => [x];

// Bind (for lists, it is also nnown as flatMap)
// Since List is a recursive type, bind is a recursive function
export const bindList = <T1, T2>(
  l: List<T1>,
  f: (x: T1) => List<T2>
): List<T2> => (isEmpty(l) ? l : [...f(first(l)), ...bindList(rest(l), f)]);

Let us experiment with these functions:

// =================================================
// Examples

const l1: Empty = [];
const l2: NonEmptyList<number> = [1];
const l3: NonEmptyList<number> = [1, 2];

deepStrictEqual(first(l2), 1);
deepStrictEqual(first(l3), 1);

deepStrictEqual(rest(l2), []);
deepStrictEqual(rest(l3), [2]);

bindList([], (x: number) => wrapList(x));
// -> []

bindList([1,2], (x: number) => wrapList(x));
// -> [1, 2]
  
bindList([1,2], (x: number) => [x, x+1]);
// -> [1,2,2,3]
  
bindList([1,2], (x: number) => x%2 === 0 ? [x/2] : [x/2, x/2 + 1]);
// -> [0.5, 1.5, 1]

bindList operates in a way slightly similar to map: it takes as parameter a list and a diagonal function f - from T1 => List<T2>, it applies f to all the elements in the list and instead of returning the list of all values (like map would do), it returns the concatenation of all the values (because all the values are lists). For this reason, it is also known as flatMap.

pipeList composes a list of diagonal functions using the bindList (flatMap) mechanism.

  
// Compose and apply
deepStrictEqual(
  pipeList(
    () => [],
    (x: number) => wrapList(x * 2),
    (x: number) => wrapList(1 / x)
  )(),
  []
);

deepStrictEqual(
  pipeList(
    () => [1],
    (x: number) => [x * 2],
    (x: number) => [1 / x]
  )(),
  [0.5]
);

deepStrictEqual(
  pipeList(
    () => [1, 2], //                           [1,    2   ]
    (x: number) => [x - 1, x], //              [0, 1, 1, 2]
    (x: number) => (x === 0 ? [] : [1 / x]) // [   1, 1, 1/2]
  )(),
  [1, 1, 0.5]
);

deepStrictEqual(
  pipeList(
    () => [1, 2], //                  [1,               2]
    (x: number) => [x * 2, x * 4], // [2,      4,       4,       8]
    (x: number) => [1 / x, x * x] //  [1/2, 4, 1/4, 16, 1/4, 16, 1/8, 64]
  )(),
  [0.5, 4, 0.25, 16, 0.25, 16, 0.125, 64]
);

pipeList expands a tree of computations, where each operator is applied to each of the previous results and can return 0 values ([]), 1 value ([x]) or multiple values ([x, x+1]).

Fold: unlift from Monad to Regular Values

// fold: unlift from the List monad back to normal types

// foldList is very close to the function we have met under the name reduce
// Like bindList (flatMap), foldList is recursive to fold the recursive type into a simple value.
// foldList(l:List<T1>, ()=>T2, (firstVal:T1, restFolded: T2)=>T2)
// Example:
// foldList([], () => 0, (firstVal: number, restFolded: number) => firstVal + restFolded) -> 0
// foldList([1,2,3], () => 0, (firstVal: number, restFolded: number) => firstVal + restFolded) -> 6
export const foldList = <T1, T2>(
  l: List<T1>,
  handleEmpty: () => T2,
  handleNonEmpty: (firstVal: T1, restFolded: T2) => T2
): T2 =>
  isEmpty(l)
    ? handleEmpty()
    : handleNonEmpty(first(l), foldList(rest(l), handleEmpty, handleNonEmpty));

// Examples
deepStrictEqual(
  foldList(
    l1,
    () => 0,
    (val: number, acc) => val + acc
  ),
  0
);

deepStrictEqual(
  foldList(
    l2,
    () => 0,
    (val, acc) => val + acc
  ),
  1
);

deepStrictEqual(
  foldList(
    l3,
    () => 0,
    (val, acc) => val + acc
  ),
  3
);

In the same way we defined foldList, we can define foldOption:

// foldOption(o:Option<T1>, ()=>T2, (val:T1)=>T3)
// Example:
// foldOption(none(), () => undefined, (val: number) => val) -> undefined
// foldOption(some(1), () => undefined, (val: number) => val) -> 1
export const foldOption = <T1, T2, T3>(
  o: Option<T1>,
  handleNone: () => T2,
  handleSome: (val: T1) => T3
) => (isSome(o) ? handleSome(o.value) : handleNone());

// Examples
deepStrictEqual(
  foldOption(
    none(),
    () => undefined,
    (val) => val
  ),
  undefined
);

deepStrictEqual(
  foldOption(
    some(1),
    () => undefined,
    (val) => val
  ),
  1
);

Observe how similar the two fold functions are: foldList and foldOption are both used to unlift a monadic value into a simple type. They have as parameters a monadic value, and a handler function that matches each of the component subtypes of the monad type (none and some for Option, empty and non-empty for List).

Map: Lift Functions

The curried mapList operator takes as input a flat function [T1 => T2] and returns a lifted function [List<T1> => List<T2>].

// mapList
// Known as map
export const mapList =
  <T1, T2>(f: (x: T1) => T2): ((y: List<T1>) => List<T2>) =>
  (l: List<T1>) =>
    bindList(l, (x: T1) => wrapList(f(x)));

// Examples
deepStrictEqual(mapList((x: number) => x * 2)([]), []);
deepStrictEqual(mapList((x: number) => x * 2)([1]), [2]);
deepStrictEqual(mapList((x: number) => x * 2)([1,2]), [2, 4]);

Similarly, we can define a mapOption operator for the Option monad:

// mapOption: flat [T1=>T2] to lifted [Option<T1>=>Option<T2>]
export const mapOption =
  <T1, T2>(f: (x: T1) => T2): ((y: Option<T1>) => Option<T2>) =>
  (y: Option<T1>) =>
    bindOption(y, (x: T1) => wrapOption(f(x)));

// Examples
deepStrictEqual(mapOption((x: number) => x * 2)(none()),  none());
deepStrictEqual(mapOption((x: number) => x * 2)(some(1)), some(2));

Observe that the map functions for List and Option are defined in an almost identical manner: we compose bind, wrap and f in a specific manner, according to the general logic of monads.

Chain: Lift Diagonal Functions

In some cases, we are interested in combining a sequence of diagonal operators in which case the specialized pipeOption or pipeList functions will work. But in other cases, we would like to combine diagonal and flat operators (‘[T1 => Option]` and `[T1 => T2]`). In this case, neither `pipe` nor `pipeOption` are appropriate.

For such cases, we define the chain operator which lifts a diagonal operator into a lifted operator:

export const chainList =
  <T1, T2>(f: (x: T1) => List<T2>): ((y: List<T1>) => List<T2>) =>
  (y: List<T1>) => bindList(y, f);

export const chainOption =
  <T1, T2>(f: (x: T1) => Option<T2>): ((y: Option<T1>) => Option<T2>) =>
  (y: Option<T1>) => bindOption(y, f);

Here is an example of a lifted composition:

// All the functions are lifted in the monadic domain [Option<T> => Option<T>]
// We can now combine safe and unsafe functions in a single pipe.
pipe(
	wrapOption,                                       // lift the parameter 
	mapOption((x: number) => x * x),                  // safe function - always succeeds, map it
	chainOption((x) => x === 0 ? none() : some(1/x)), // unsafe function - chain it
	mapOption((x) => x * x)                           // safe function - map it
)(5);

The Monad Interface

We can now summarize the key operators defined consistently over any monad M:

bind: apply a diagonal transform f: [T1 => M[T2]] to monadic value M[t1]. This is the most basic operator, which overrides function application for normal types. Instead of calling f(x) we use bind(x, f).
wrap: wraps a plain value into a monadic value. wrap: T => M[T]. wrap is the unit (neutral) element of composition: bind(x, wrap) -> x and pipeM() returns wrap.
map: lifts horizontal transform from values to monadic values: for f: T1 => T2, we have map(f): M[T1] => M[T2]
chain: lifts diagonal transforms to lifted transform: chain: (T1 => M[T2]) => (M[T1] => M[T2])
pipeM: specialized composition of a sequence of diagonal operators - its value is itself a diagonal operator.
fold: convert a monadic value to a plain value with handlers for each sub-type of the monadic type. fold: (x: M[T1], handleA: () => T2, handleB: (curr: T1, acc: T2) => T2) => T2

The most useful monads are:

Option<T> = None | Some<T>: encapsulates the logic of having a value or none
Result<T> = Failure | Ok<T>: which we use extensively in the rest of the semester, encapsulates the logic of succeeding a computation or failing with an error
Either<T1, T2> = Left<T1> | Right<T2>: encapsulates the logic of two divergent results for a computation
List<T> = Empty | NonEmpty<T>: encapsulates the logic of computations that may return 0, 1 or more results.
Task<T>: encapsulate the logic of asynchronous computations that may fail or timeout
State<T>: encapsulate stateful computations which may update a shared state and return a value.

Principles of Programming Languages

Course Website