Type Inference System

Practical Session - Week #8

The Type Inference System is a TypeScript Implementation of the algorithm for Type Checking and Inference using Type Equations.

System Architecture

The layers of the system are:

Layer 1: Languages Definitions

The Type Inference algorithm maps expressions in one language (L5 expressions) into expressions in another language (Type Expressions).

The lower level of the Type Inference system implements the Abstract Syntax interfaces for expressions in these two languages.

L5-AST

Implements parser and abstract syntax for the following BNF concrete syntax:

/*
<program> ::= (L5 <exp>+)                  / Program(exps:List(exp))
<exp> ::= <define> | <cexp>                / DefExp | CExp
<define> ::= ( define <var-decl> <cexp> )  / DefExp(var:VarDecl, val:CExp)
<var> ::= <identifier>                     / VarRef(var:string)
<cexp> ::= <number>                        / NumExp(val:number)
        |  <boolean>                       / BoolExp(val:boolean)
        |  <string>                        / StrExp(val:string)
        |  <var-ref>
        |  ( lambda ( <var-decl>* ) <cexp>+ ) | ( lambda ( <var-decl>* ) : <TExp> <cexp>+ ) / ProcExp(args:VarDecl[], body:CExp[], returnTE: TExp))
        |  ( if <cexp> <cexp> <cexp> )     / IfExp(test: CExp, then: CExp, alt: CExp)
        |  ( quote <sexp> )                / LitExp(val:SExp)
        |  ( <cexp> <cexp>* )              / AppExp(operator:CExp, operands:CExp[]))
        |  ( let ( <binding>* ) <cexp>+ )  / LetExp(bindings:Binding[], body:CExp[]))
        |  ( letrec ( binding*) <cexp>+ )  / LetrecExp(bindings:Bindings[], body: CExp)
        |  ( set! <var> <cexp>)            / SetExp(var: varRef, val: CExp)
<binding>  ::= ( <var-decl> <cexp> )            / Binding(var:VarDecl, val:Cexp)
<prim-op>  ::= + | - | * | / | < | > | = | not |  eq? | string=?
                 | cons | car | cdr | list? | number?
                 | boolean? | symbol? | string?
                 | display | newline
<num-exp>  ::= a number token
<bool-exp> ::= #t | #f
<var-ref>  ::= an identifier token         / VarRef(var)
<var-decl> ::= an identifier token | (var : TExp) / VarRef(var, TE: TExp) ##### L5
<sexp>     ::= symbol | number | bool | string | ( <sexp>* )              ##### L3
*/

Note that we introduce optional type annotations in the L5 syntax in 2 places only:

TExp-AST

/*
Type checking language
Syntax with optional type annotations for var declarations and function return types.

<texp>         ::= <atomic-te> | <compound-te> | <tvar>
<atomic-te>    ::= <num-te> | <bool-te> | <void-te> | <str-te>
<num-te>       ::= number   // num-te()
<bool-te>      ::= boolean  // bool-te()
<str-te>       ::= string   // str-te()
<void-te>      ::= void     // void-te()
<compound-te>  ::= <proc-te> | <tuple-te>
<non-tuple-te> ::= <atomic-te> | <proc-te> | <tvar>
<proc-te>      ::= [ <tuple-te> -> <non-tuple-te> ] // proc-te(param-tes: list(te), return-te: te)
<tuple-te>     ::= <non-empty-tuple-te> | <empty-te>
<non-empty-tuple-te> ::= ( <non-tuple-te> *)* <non-tuple-te> // tuple-te(tes: list(te))
<empty-te>     ::= Empty
<tvar>         ::= a symbol starting with T // tvar(id: Symbol, contents; Box(string|boolean))
*/

The definition of the TExp data types, the parser of TExp is provided in file TExp.ts

Examples of Types Expressions:

// Purpose: Compute the type of an expression
// Traverse the AST and check the type according to the exp type.
export const typeofExp = (exp: A.Parsed, tenv: E.TEnv): Result<T.TExp> =>
    A.isNumExp(exp) ? makeOk(T.makeNumTExp()) :
    A.isBoolExp(exp) ? makeOk(T.makeBoolTExp()) :
    A.isStrExp(exp) ? makeOk(T.makeStrTExp()) :
    A.isPrimOp(exp) ? TC.typeofPrim(exp) :
    A.isVarRef(exp) ? E.applyTEnv(tenv, exp.var) :
    A.isIfExp(exp) ? typeofIf(exp, tenv) :
    A.isProcExp(exp) ? typeofProc(exp, tenv) :
    A.isAppExp(exp) ? typeofApp(exp, tenv) :
    A.isLetExp(exp) ? typeofLet(exp, tenv) :
    A.isLetrecExp(exp) ? typeofLetrec(exp, tenv) :
    A.isDefineExp(exp) ? typeofDefine(exp, tenv) :
    A.isProgram(exp) ? typeofProgram(exp, tenv) :
    makeFailure(`Unknown type: ${JSON.stringify(exp, null, 2)}`);

Layer 2: Substitution and Equation ADTs

The Type Inference algorithm is defined in terms of 2 basic formal tools:

Equations are the tool we use to represent a typing statement in the algorithm.

Substitution ADT

export interface Sub {tag: "Sub"; vars: TVar[]; tes: TExp[]; }
export const isSub = (x: any): x is Sub => x.tag === "Sub";

The interface of the Substitution ADT includes:

Equation ADT

export interface Equation {left: T.TExp, right: T.TExp}
export const makeEquation = (l: T.TExp, r: T.TExp): Equation => ({left: l, right: r});

Layer 3: Exp-to-Pool, Exp-to-Equations, Solve

The key logic of the algorithm lies in the solveEquations algorithm – which turns a list of Type Equations into a single coherent Type Substitution which satisfies all the constraints.

solveEquations relies on the computation of unification between two type expressions – as shown in the following fragment:

export const solveEquations = (equations: Equation[]): Res.Result<S.Sub> =>
    solve(equations, S.makeEmptySub());

// Purpose: Solve the equations, starting from a given substitution.
//          Returns the resulting substitution, or error, if not solvable
const solve = (equations: Equation[], sub: S.Sub): Res.Result<S.Sub> => {
    const solveVarEq = (tvar: T.TVar, texp: T.TExp): Res.Result<S.Sub> =>
        Res.bind(S.extendSub(sub, tvar, texp), sub2 => solve(rest(equations), sub2));

    const bothSidesAtomic = (eq: Equation): boolean =>
        T.isAtomicTExp(eq.left) && T.isAtomicTExp(eq.right);

    const handleBothSidesAtomic = (eq: Equation): Res.Result<S.Sub> =>
        T.isAtomicTExp(eq.left) && T.isAtomicTExp(eq.right) && T.eqAtomicTExp(eq.left, eq.right)
        ? solve(rest(equations), sub)
        : Res.makeFailure(`Equation with non-equal atomic type ${JSON.stringify(eq, null, 2)}`);

    if (isEmpty(equations)) {
        return Res.makeOk(sub);
    }

    const eq = makeEquation(S.applySub(sub, first(equations).left),
                            S.applySub(sub, first(equations).right));

    return T.isTVar(eq.left) ? solveVarEq(eq.left, eq.right) :
           T.isTVar(eq.right) ? solveVarEq(eq.right, eq.left) :
           bothSidesAtomic(eq) ? handleBothSidesAtomic(eq) :
           T.isCompoundTExp(eq.left) && T.isCompoundTExp(eq.right) && canUnify(eq) ?
                solve(R.concat(rest(equations), splitEquation(eq)), sub) :
           Res.makeFailure(`Equation contains incompatible types ${JSON.stringify(eq, null, 2)}`);
};

Compound type expressions are unified by unifying their components one by one.

This is implemented by these 2 functions:

// Signature: canUnify(equation)
// Purpose: Compare the structure of the type expressions of the equation
const canUnify = (eq: Equation): boolean =>
    T.isProcTExp(eq.left) && T.isProcTExp(eq.right) &&
    (eq.left.paramTEs.length === eq.right.paramTEs.length);

// Signature: splitEquation(equation)
// Purpose: For an equation with unifyable type expressions,
//          create equations for corresponding components.
// Type: [Equation -> List(Equation)]
// Example: splitEquation(
//            makeEquation(parseTExp('(T1 -> T2)'),
//                         parseTExp('(T3 -> (T4 -> T4))')) =>
//            [ {left:T2, right: (T4 -> T4)},
//              {left:T3, right: T1)} ]
// @Pre: isCompoundExp(eq.left) && isCompoundExp(eq.right) && canUnify(eq)
const splitEquation = (eq: Equation): Equation[] =>
    (T.isProcTExp(eq.left) && T.isProcTExp(eq.right)) ?
        R.zipWith(makeEquation,
                  cons(eq.left.returnTE, eq.left.paramTEs),
                  cons(eq.right.returnTE, eq.right.paramTEs)) :
    [];

Exercise: Extend the Type Inference system to support if expressions

The part of the Type Inference System which “knows” L5 Expressions is the function makeEquationsFromExp(exp, pool).

The structure of this function is a case for each type of L5 expression.

The code we provide supports:

From Typing Rule for if-expressions the type equations derived from a composite if-expression: (if _p _c _a) we derive 3 equations:

// Purpose: compute the type of an if-exp
// Typing rule:
//   if type<test>(tenv) = boolean
//      type<then>(tenv) = t1
//      type<else>(tenv) = t1
// then type<(if test then else)>(tenv) = t1
const typeofIf = (ifExp: A.IfExp, tenv: E.TEnv): Result<T.TExp> => {
    const testTE = typeofExp(ifExp.test, tenv);
    const thenTE = typeofExp(ifExp.then, tenv);
    const altTE = typeofExp(ifExp.alt, tenv);
    const constraint1 = bind(testTE, (testTE: T.TExp) => checkEqualType(testTE, T.makeBoolTExp(), ifExp));
    const constraint2 = bind(thenTE, (thenTE: T.TExp) => 
                            bind(altTE, (altTE: T.TExp) => 
                                checkEqualType(thenTE, altTE, ifExp)));
    return bind(constraint1, (_c1: true) =>
                bind(constraint2, (_c2: true) => thenTE));
};

Substitution

// Implementation of the Substitution ADT
// ========================================================
// A substitution is represented as a 2 element list of equal length
// lists of variables and type expression.
// The empty substitution is [[], []]

export interface Sub {tag: "Sub"; vars: TVar[]; tes: TExp[]; }
export const isSub = (x: any): x is Sub => x.tag === "Sub";

// Constructors:
// Signature: makeSub(vars, tes)
// Purpose: Create a substitution in which the i-th element of 'variables'
//          is mapped to the i-th element of 'tes'.
// Example: makeSub(
//             map(parseTE, ["x", "y", "z"]),
//             map(parseTE, ["number", "boolean", "(number -> number)"])
//          => {tag: "Sub", vars: [x y z], [numTexp, boolTexp, ProcTexp([NumTexp, NumTexp])]}
//          makeSub(map(parseTE, ["x", "y", "z"]),
//                  map(parseTE, ["number", "boolean", "(z -> number)"]))
//          => error makeSub: circular substitution
// Pre-condition: (length variables) = (length tes)
//                variables has no repetitions (set)
export const makeSub = (vars: TVar[], tes: TExp[]): Result<Sub> =>
    mapv(zipWithResult(checkNoOccurrence, vars, tes), _ => ({ tag: "Sub", vars: vars, tes: tes }));

export const makeEmptySub = (): Sub => ({tag: "Sub", vars: [], tes: []});

Whenever creating a substitution, we verify that the invariant holds that when a TVar is associated to a TExp, it does not occur inside the TExp.

This is enforced by this function:

// Purpose: when attempting to bind tvar to te in a sub - check whether tvar occurs in te.
// Return error if a circular reference is found.
export const checkNoOccurrence = (tvar: TVar, te: TExp): Result<true> => {
    const check = (e: TExp): Result<true> =>
        isTVar(e) ? ((e.var === tvar.var) ? bind(unparseTExp(te), up => makeFailure(`Occur check error - circular sub ${tvar.var} in ${JSON.stringify(up, null, 2)}`)) : 
                                            makeOk(true)) :
        isAtomicTExp(e) ? makeOk(true) :
        isProcTExp(e) ? bind(mapResult(check, e.paramTEs), _ => check(e.returnTE)) :
        makeFailure(`Bad type expression ${e} in ${JSON.stringify(te, null, 2)}`);
    return check(te);
};

Note on how we compare two substitutions to perform tests In order to compare two substitutions, we “normalize” them by sorting their string representation and then compare the strings.

This in effect achieves set equality:

export const subToStr = (sub: Sub): Result<string> =>
    bindResult(zipWithResult((v, t) => bindResult(unparseTExp(t), up => makeOk(`${v.var}:${up}`)), sub.vars, sub.tes),
               (vts: string[]) => makeOk(vts.sort().join(", ")));

Example of applySub from L5-substitutions-ADT-tests.ts:

export const applySub = (sub: Sub, te: TExp): TExp =>
    isEmptySub(sub) ? te :
    isAtomicTExp(te) ? te :
    isTVar(te) ? subGet(sub, te) :
    isProcTExp(te) ? makeProcTExp(map((te) => applySub(sub, te), te.paramTEs), applySub(sub, te.returnTE)) :
    te;

describe('applySub', () => {
        it('applies a substitution on a type expression', () => {
            const sub1 = sub(["T1", "T2"], ["number", "boolean"]);
            const te1 = parseTE("(T1 * T2 -> T1)");
            const unparsed = bind(sub1, (sub: S.Sub) =>
                                bind(te1, (te: TExp) =>
                                    unparseTExp(S.applySub(sub, te))));
            expect(unparsed).toEqual(makeOk("(number * boolean -> number)"));
        });
    });

Example of combineSub From L5-substitutions-ADT-tests.ts:

    describe('combineSub', () => {
        it('combines substitutions (the o operator)', () => {
            // {T1:(number -> S1), T2:(number -> S4)} o {T3:(number -> S2)} =>
            // {T1:(number -> S1), T2:(number -> S4), T3:(number -> S2)}
            const sub11 = sub(["T1", "T2"], ["(number -> S1)", "(number -> S4)"]);
            const sub12 = sub(["T3"], ["(number -> S2)"]);
            const expected1 = bind(sub(["T1", "T2", "T3"], ["(number -> S1)", "(number -> S4)", "(number -> S2)"]), subToStr);
            const res1 = bind(safe2(S.combineSub)(sub11, sub12), subToStr);
            expect(res1).toEqual(expected1);
        });
    });