2.2 - Higher Order Functions in Scheme and Local Variables

PPL 2023

In the previous lecture, we defined a programming language (which is a subset of Scheme) by introducing the elements of the programming language incrementally. We first defined \(L1\) which consists of primitive atomic data types (number and boolean), the possibility to define global variables (with the define special form) and recursive combinations of primitive operations.

We then introduced conditonal expressions (the if and cond forms) and user defined procedures (the lambda expressions which evaluate into Closures) in \(L2\). We observed that \(L2\) is already capable of describing infinite computations and supports the definition of recursive procedures.

We finally extended \(L2\) into \(L3\) by introducing compound data values - with the Pair and List data constructors and the corresponding procedures (cons, car, cdr) and primitive values (empty list and literal notation for pairs and list values).

We now complete the definition of the language on which we will work - by studying how it supports higher order functions (this does not require any modification to \(L3\)) and supporting local variables (this does require an addition to the language).

Higher Order Procedures in \(L3\)

Can we define higher-order functions in \(L3\)?

For all the procedures we will define in \(Li\) from this point on, we will attach a contract in the form of formatted comments. The contract includes at least:

;; Signature:
;; Type:
;; Purpose:

Within these comments, we use a type langage to define the type of expressions which is extremely close to the TypeScript type language - but restricted to the data types that we have defined in our language - Pair and List instead of Arrays and Maps. For function types, we write [T1 * T2 -> T3] for a function taking 2 parameters of type T1 and T2 and returning a value of type T3. We denote type variables by using type names that start with a capital T. In the next Chapter, we will formalize this type language and implement meta-programs operating over it (type checker and type inference systems).

Map

Let us try to define our favorite higher-order functions:

;; Signature: map(f,l)
;; Type: [ (T1->T2) * List(T1) -> List(T2)]
;; Purpose: Apply f to all elements in l and return the list of the results.
(define map
  (lambda (f l)
    (if (empty? l)
        '()
        (cons (f (car l))
              (map f (cdr l))))))

The key part to verify according to the evaluation rules - is that the invocation of (f (car l)) will be evaluated as expected according to the evaluation rules of \(L3\).

Verify it!

(map (lambda (x) (* x x)) '(1 2 3 4 5)) ;--> '(1 4 9 16 25)

(map (lambda (x) (> x 2)) '(1 2 3 4))   ;--> '(#f #f #t #t)

Verify that the type we specified for map is correct by checking that all the sub-expressions used in the body of the function map are well typed.

Filter

;; Signature: filter(pred,l)
;; Type: [ (T->Boolean) * List(T) -> List(T) ]
;; Purpose: Return the list of elements in l that satisfy pred.
(define filter
  (lambda (pred l)
    (if (empty? l)
        '()
        (if (pred (car l))
            (cons (car l) (filter pred (cdr l)))
            (filter pred (cdr l))))))
(filter even? '(0 1 2 3 4))              ;--> '(0 2 4)
(filter (lambda (x) (> x 2)) '(1 2 3 4)) ;--> '(3 4)

Reduce

;; Signature: reduce(reducer, init, l)
;; Type: [(T1 * T2 -> T2) * T2 * List(T1) -> T2]
;; Purpose: Combine all the values of l using reducer
;;   (reduce + 0 '(1 2 3)) --> (+ 1 (+ 2 (+ 3 0)))
(define reduce
  (lambda (reducer init l)
    (if (empty? l)
        init
        (reducer (car l) 
                 (reduce reducer init (cdr l))))))

Compare this first definition of reduce with this one:

;; Signature: reduce2(reducer, init, l)
;; Type: [(T1 * T2 -> T2) * T2 * List(T1) -> T2]
;; Purpose: Combine all the values of l using reducer
;;   (reduce2 + 0 '(1 2 3)) --> ?
(define reduce2
  (lambda (reducer init l)
    (if (empty? l)
        init
        (reduce2 reducer
                 (reducer (car l) init)
                 (cdr l)))))

Are the two functions reduce and reduce2 equivalent?

Procedures as Parameters

Let us observe the process of defining functional abstractions through abstraction of repeated patterns of code.

Consider a procedure that computes the sum of the numbers in a range [a, b]:

;; Signature: sum-integers(a,b)
;; Type: (Number*Number)->Number
;; Purpose: compute the sum of all integers a to b
(define sum-integers 
  (lambda (a b)
    (if (> a b)
        0
        (+ a (sum-integers (+ a 1) b)))))

Let us then define a procedure to compute the sum of the cubes of all the numbers between a and b:

(define cube (lambda (x) (* x x x)))

;; Signature: sum-cubes(a,b)
;; Type: (Number*Number)->Number
;; Purpose: compute the sum of the cube of all integers a to b
(define sum-cubes 
  (lambda (a b)
    (if (> a b)
        0
        (+ (cube a) (sum-cubes (+ a 1) b)))))

Consider yet a third function - used to compute a numerical approximation of \(\pi\) using the formula: \(1/a\times(a+2) + 1/(a+4)\times(a+6) + 1/(a+8)*(a+10) + \ldots\)

This formula converges to \(\pi/8\) when starting with \(a=1\).

;; Signature: pi-sum(a,b)
;; Type: (Number*Number)->Number
;; Purpose: compute the sum of 1/a*(a+2) + ... + 1/(a+4n)*(a+4n+2) 
;;          s.t. a+4n <= b < a + 4(n + 1)
(define pi-sum
  (lambda (a b)
    (if (> a b)
        0
        (+ (/ 1 (* a (+ a 2)))
           (pi-sum (+ a 4) b)))))

Let us now try to generalize the structure of these 3 procedures - that is, define a functional abstraction that describes the commonality between these 3 functions.

We observe the following repeated pattern:

(define <name>
  (lambda (a b)
    (if (> a b)
        0
        (+ (<term> a)
           (<name> (<next> a) <b>)

Based on this observation, we define a functional abstraction:

;; Signature: sum(term, a, next, b)
;; Type: [ [Number->Number] * Number * [Number->Number] * Number -> Number]
;; Purpose: Compute the sum: (term a) + (term (next a)) + .... + (term n) 
;;          where n = (next (next (... (next a)))) <= b and (next n) > b.
(define sum
  (lambda (term a next b)
    (if (> a b)
        0
        (+ (term a)
           (sum term (next a) next b)))))

We can now redefine the 3 procedures above using the new sum abstraction:

(define sum-integers2
  (lambda (a b)
    (sum identity a add1 b)))
    
(define sum-cubes2
  (lambda (a b)
    (sum cube a add1 b)))
    
(define pi-sum2
  (lambda (a b)
    (sum pi-term a pi-next b)))
    
(define pi-term 
  (lambda (x)
    (/ 1.0 (* x (+ x 2)))))
    
(define pi-next
  (lambda (n) (+ n 4)))

Given this new functional abstraction sum - we can define other functions. For example, a numerical approximation of the definite integral of a numerical function using the formula:

\[\int_{a}^{b} f(x) dx = [\sum_{n=1}^{\left \lceil{b-a/dx}\right \rceil } f(a+ n\times dx + dx/2)]\times dx\]

for a small value of \(dx\).

As we spot a sum in this formula, the sum abstraction is relevant for the implementation:

(define dx 0.001)
(define add-dx (lambda (x) (+ x dx)))

;; Signature: integral(f,a,b)
;; Type: [ [Number->Number] * Number * Number -> Number]
;; Purpose: Compute an approximation of the definite integral of f between a and b.
(define integral
  (lambda (f a b)
    (* (sum f (+ a (/ dx 2.0)) add-dx b)
       dx)))

Example:

(integral cube 0 1) ;--> 0.249999875000001 (True value is 0.25)

Local Variables

Up to this point, we only defined global variables using the define special form.

Local variables are a programming language feature which encourages programmers to limit the scope within which variables are known, and to avoid hidden dependencies between global variables and functions that refer to them.

When defining a local variable, we must provide:

Up to this point, we observed variables in only two contexts:

We first define the notions of scope, and bound and free variable occurrences before introducing local variables in the language.

Parameters, Scope, Bound and Free Variable Occurrences

A lambda form includes parameters and a body. Within a lambda expression, the body is the scope of the parameters - this means that all the variables that occur within the body are intended to be bound to the parameters. This is the case even if there is preceding definition of a variable with the same name.

For example:

(define x 1)

(lambda (x) (* x x)) ;--> occurrences of x in the body refer 
                     ;to the parameter variable x 
                     ;and NOT to the global variable x.

In such a case, we say that the occurrences of x in the body of the lambda expression are bound occurrences. Other variables are said to occur free.

For example, in the expression:

(lambda (f a b dx)
  (* (sum f
          (+ a (/ dx 2.0))
          (lambda (x) (+ x dx))
          b)
     dx))

the variables f, a, b, dx all occur bound - while the variable sum occurs free.

Deriving the Let Shortcut Notation

Consider the computation of the following function:

\[f(x,y) = x(1+xy)^2+y(1-y)+(1+xy)(1-y)\]

Let us implement a Scheme procedure to compute this function:

(define f
  (lambda (x y)
    (+ (* x 
          (square (+ 1 (* x y))))
       (* y
          (- 1 y))
       (* (+ 1 (* x y))
          (- 1 y)))))

We observe that the same sub-expressions are repeated: (+ 1 (* x y)) and (- 1 y).

This not only makes it more difficult to read the code - it also makes the computation slower - because these expressions will actually be computed twice each time the function is invoked.

One way to avoid these repetitions is to abstract away the repeated sub-expressions - in exactly the same way we abstracted away the repeated sub-expressions when defining the sum abstraction.

Our abstraction mechanism involves defining new functions:

(define f1 
  (lambda (x y) (+ 1 (* x y))))
  
(define f2
  (lambda (y) (- 1 y)))

Using these two helper functions, we can rewrite f as:

(define f
  (lambda (x y)
    (+ (* x (square (f1 x y)))
       (* y (f2 y))
       (* (f1 x y) (f2 y)))))

The problem is that we need to define names for these helpers - and define new functions for each one.

Alternatively - we can proceed as follows:

Let \(a = 1+xy\), \(b = 1-y\), \(f(x,y) = xa^2+yb+ab\)

Which in Scheme is implemented as:

(define f
  (lambda (x y)
  
    ((lambda (a b)
        (+ (* x (square a))
           (* y b)
           (* a b)))
           
      ;; Values of a and b which depend on x and y
      (+ 1 (* x y))
      (- 1 y))
    ))

This structure is convenient - it allows us to define local variables a and b which are defined within the scope of x and y and are only used within a single expression.

The Let abbreviation

The syntactic form of the definition of local variables is not readable - because the value which initializes the local variables (a and b) are far away from their declaration.

The Scheme language defines a syntactic abbreviation which is internally turned into the same lambda application form called the let form to encourage programmers to use this construct.

The structure of the let expression is:

(let ( (<var1> <exp1>)
       (<var2> <exp2>)
       ...
       (<varn> <expn>) )
   <body>)

Internally, this form is replaced by the equivalent syntactic form:

( (lambda (<var1> ... <varn>) <body>)
  <exp1> ... <expn> )

Hence its evaluation rule is already defined as per the rules of \(L3\) evaluation.

Still, it is useful to remember the way let forms are evaluated:

Procedures as Returned Values

We have seen so far examples of higher order procedures that receive procedures as parameters. Let us now review cases where procedures return computed procedures.

Returning a Closure

Let us first work out the mechanics of procedures returning procedure values:

We first start from an expression that has a concrete value (a number) and abstract it away on each of the variables that occur in the expression:

(define x 3)
(define y 0)

(+ x y y) ;--> 3
(lambda (x) (+ x y y)) ;--> #<Closure (x) (+ x 0 0)>
(lambda (y) (lambda (x) (+ x y y))) ;--> #<Closure (y) (lambda (x) (+ x y y))>

Now, let us apply these closure values:

( (lambda (x) (+ x y y)) 5 ) ;--> 5
( (lambda (y) (lambda (x) (+ x y y))) 2) ;--> #<Closure (x) (+ x 2 2)>
( ((lambda (y) (lambda (x) (+ x y y))) 2) 5) ;--> 9

The important point to notice is that when a Closure value is constructed, the value of the free variables is substituted in the body of the closure. For example, in the second expression above, the closure value has no free variable y - but the occurrences of y have been substituted by 2.

Derivative Example

We design a function which given a numeric function as a parameter, returns a new function which computes a numerical approximation of the derivative of the function - using the formula: \(f'(x) = [f(x+dx)-f(x)]/dx\)

;; Signature: derive(f, dx)
;; Type: [ (Number->Number) * Number -> (Number->Number) ]
;; Purpose: Construct a function that computes a numerical approx
;;          of the derivative of f with resolution dx.
(define derive
  (lambda (f dx)
    (lambda (x)
      (/ (- (f (+ x dx))
            (f x))
         dx))))

When this function is evaluated, it returns a Closure value which remembers the value of dx that was used.

(let [(f1 (derive square 0.001))
      (f2 (derive square 0.1)) ]
  (display (f1 2))  ;--> 4.000999999999699 (Real value is 4)
  (f2 2)) ;--> 4.100000000000001 

We can iterate this procedure to compute the nth derivative of a numerical function:

;; Signature: nth-deriv(f,n)
;; Type: [ (Number->Number) * Number -> (Number->Number) ]
;; Purpose: construct a function that computes a numerical approximation of the nth derivative of f
(define nth-deriv
  (lambda (f n)
    (lambda (x)
      (if (= n 0)
          (f x)
          ( (nth-deriv (derive f 0.0001) (- n 1)) x)))))

Consider an alternative definition:

(define nth-deriv-early
  (lambda (f n)
    (if (= n 0)
        f
        (derive (nth-deriv-early f (- n 1)) 0.0001))))

The time at which the closures are created is different. Trace the behavior and comment on which version is more desirable.

Summary

  1. To define a programming language, we provide:
    1. The syntax of the language - which defines the set of all possible expressions in the language.
    2. The set of all possible values that can be computed by the language.
    3. Computation rules for each type of expression.
  2. Scheme is a functional programming language which has a minimalistic definition. We defined a subset of Scheme characterized by the following syntax:
    1. Atomic syntactic expressions: literal numbers, literal booleans, a set of variables bound to primitive functions.
    2. All compound expressions are built using a list notation - (e1 ... en) - with spaces between the sub-expressions.
    3. Compound expressions are grouped in 2 families: special forms and non-special forms.
    4. The special forms we defined are:
      • define
      • lambda
      • if
      • cond
      • let
  3. The computed values are the union of the following types:
    • number
    • boolean
    • primitive function
    • closure
    • pair
    • list
  4. Computation rules define how to compute recursively the value of each type of expression in the syntax. They define the semantics of the language.
  5. Compound values in the Scheme subset we consider are of type Pair(T1,T2) and List(T).
    • Pair(T1,T2) denotes the cartesian product of types T1 and T2.
    • The value constructor for values of type Pair is the cons primitive function.
    • The accessors for values of type Pair are car and cdr and type predicate pair?.
    • List(T) is defined inductively as:
      • Either empty-list (denoted ‘()).
      • Or (<val> . <list>) where <val> is in T and <list> is in List(T).
    • List(T) is implemented in Scheme using Union(Pair(T,List(T)), EmptyList).