2.2 - Higher Order Functions in Scheme and Local Variables
PPL 2023
In the previous lecture, we defined a programming language (which is a subset of Scheme) by introducing the elements of the programming language incrementally. We first defined \(L1\) which consists of primitive atomic data types (number and boolean), the possibility to define global variables (with the define special form) and recursive combinations of primitive operations.
We then introduced conditonal expressions (the if
and cond
forms) and user defined procedures (the lambda
expressions which evaluate into Closures) in \(L2\). We observed that \(L2\) is already capable of describing infinite computations and supports the definition of recursive procedures.
We finally extended \(L2\) into \(L3\) by introducing compound data values - with the Pair and List data constructors and the corresponding procedures (cons, car, cdr) and primitive values (empty list and literal notation for pairs and list values).
We now complete the definition of the language on which we will work - by studying how it supports higher order functions (this does not require any modification to \(L3\)) and supporting local variables (this does require an addition to the language).
Higher Order Procedures in \(L3\)
Can we define higher-order functions in \(L3\)?
For all the procedures we will define in \(Li\) from this point on, we will attach a contract in the form of formatted comments. The contract includes at least:
;; Signature:
;; Type:
;; Purpose:
Within these comments, we use a type langage to define the type of expressions which is extremely close to the TypeScript type language - but restricted to the data types that we have defined in our language - Pair and List instead of Arrays and Maps.
For function types, we write [T1 * T2 -> T3]
for a function taking 2 parameters of type T1
and T2
and returning a value of type T3
. We denote type variables by using type names that start with a capital T. In the next Chapter, we will formalize this type language and implement meta-programs operating over it (type checker and type inference systems).
Map
Let us try to define our favorite higher-order functions:
;; Signature: map(f,l)
;; Type: [ (T1->T2) * List(T1) -> List(T2)]
;; Purpose: Apply f to all elements in l and return the list of the results.
(define map
(lambda (f l)
(if (empty? l)
'()
(cons (f (car l))
(map f (cdr l))))))
The key part to verify according to the evaluation rules - is that the invocation of (f (car l))
will be evaluated as expected according to the evaluation rules of \(L3\).
Verify it!
(map (lambda (x) (* x x)) '(1 2 3 4 5)) ;--> '(1 4 9 16 25)
(map (lambda (x) (> x 2)) '(1 2 3 4)) ;--> '(#f #f #t #t)
Verify that the type we specified for map
is correct by checking that all the sub-expressions used in the body of the function map
are well typed.
Filter
;; Signature: filter(pred,l)
;; Type: [ (T->Boolean) * List(T) -> List(T) ]
;; Purpose: Return the list of elements in l that satisfy pred.
(define filter
(lambda (pred l)
(if (empty? l)
'()
(if (pred (car l))
(cons (car l) (filter pred (cdr l)))
(filter pred (cdr l))))))
(filter even? '(0 1 2 3 4)) ;--> '(0 2 4)
(filter (lambda (x) (> x 2)) '(1 2 3 4)) ;--> '(3 4)
Reduce
;; Signature: reduce(reducer, init, l)
;; Type: [(T1 * T2 -> T2) * T2 * List(T1) -> T2]
;; Purpose: Combine all the values of l using reducer
;; (reduce + 0 '(1 2 3)) --> (+ 1 (+ 2 (+ 3 0)))
(define reduce
(lambda (reducer init l)
(if (empty? l)
init
(reducer (car l)
(reduce reducer init (cdr l))))))
Compare this first definition of reduce
with this one:
;; Signature: reduce2(reducer, init, l)
;; Type: [(T1 * T2 -> T2) * T2 * List(T1) -> T2]
;; Purpose: Combine all the values of l using reducer
;; (reduce2 + 0 '(1 2 3)) --> ?
(define reduce2
(lambda (reducer init l)
(if (empty? l)
init
(reduce2 reducer
(reducer (car l) init)
(cdr l)))))
Are the two functions reduce
and reduce2
equivalent?
Procedures as Parameters
Let us observe the process of defining functional abstractions through abstraction of repeated patterns of code.
Consider a procedure that computes the sum of the numbers in a range [a, b]:
;; Signature: sum-integers(a,b)
;; Type: (Number*Number)->Number
;; Purpose: compute the sum of all integers a to b
(define sum-integers
(lambda (a b)
(if (> a b)
0
(+ a (sum-integers (+ a 1) b)))))
Let us then define a procedure to compute the sum of the cubes of all the numbers between a and b:
(define cube (lambda (x) (* x x x)))
;; Signature: sum-cubes(a,b)
;; Type: (Number*Number)->Number
;; Purpose: compute the sum of the cube of all integers a to b
(define sum-cubes
(lambda (a b)
(if (> a b)
0
(+ (cube a) (sum-cubes (+ a 1) b)))))
Consider yet a third function - used to compute a numerical approximation of \(\pi\) using the formula: \(1/a\times(a+2) + 1/(a+4)\times(a+6) + 1/(a+8)*(a+10) + \ldots\)
This formula converges to \(\pi/8\) when starting with \(a=1\).
;; Signature: pi-sum(a,b)
;; Type: (Number*Number)->Number
;; Purpose: compute the sum of 1/a*(a+2) + ... + 1/(a+4n)*(a+4n+2)
;; s.t. a+4n <= b < a + 4(n + 1)
(define pi-sum
(lambda (a b)
(if (> a b)
0
(+ (/ 1 (* a (+ a 2)))
(pi-sum (+ a 4) b)))))
Let us now try to generalize the structure of these 3 procedures - that is, define a functional abstraction that describes the commonality between these 3 functions.
We observe the following repeated pattern:
(define <name>
(lambda (a b)
(if (> a b)
0
(+ (<term> a)
(<name> (<next> a) <b>)
Based on this observation, we define a functional abstraction:
;; Signature: sum(term, a, next, b)
;; Type: [ [Number->Number] * Number * [Number->Number] * Number -> Number]
;; Purpose: Compute the sum: (term a) + (term (next a)) + .... + (term n)
;; where n = (next (next (... (next a)))) <= b and (next n) > b.
(define sum
(lambda (term a next b)
(if (> a b)
0
(+ (term a)
(sum term (next a) next b)))))
We can now redefine the 3 procedures above using the new sum
abstraction:
(define sum-integers2
(lambda (a b)
(sum identity a add1 b)))
(define sum-cubes2
(lambda (a b)
(sum cube a add1 b)))
(define pi-sum2
(lambda (a b)
(sum pi-term a pi-next b)))
(define pi-term
(lambda (x)
(/ 1.0 (* x (+ x 2)))))
(define pi-next
(lambda (n) (+ n 4)))
Given this new functional abstraction sum
- we can define other functions. For example, a numerical approximation of the definite integral of a numerical function using the formula:
for a small value of \(dx\).
As we spot a sum in this formula, the sum
abstraction is relevant for the implementation:
(define dx 0.001)
(define add-dx (lambda (x) (+ x dx)))
;; Signature: integral(f,a,b)
;; Type: [ [Number->Number] * Number * Number -> Number]
;; Purpose: Compute an approximation of the definite integral of f between a and b.
(define integral
(lambda (f a b)
(* (sum f (+ a (/ dx 2.0)) add-dx b)
dx)))
Example:
(integral cube 0 1) ;--> 0.249999875000001 (True value is 0.25)
Local Variables
Up to this point, we only defined global variables using the define
special form.
Local variables are a programming language feature which encourages programmers to limit the scope within which variables are known, and to avoid hidden dependencies between global variables and functions that refer to them.
When defining a local variable, we must provide:
- A way to name the variable
- How to initialize the variable
- A way to decide what is the scope of the variable - that is, what is the part of the program where this variable is visible.
Up to this point, we observed variables in only two contexts:
- In
define
statements(define <var> <exp>)
- In
lambda
expressions(lambda (<var> ...) <exp>)
We first define the notions of scope, and bound and free variable occurrences before introducing local variables in the language.
Parameters, Scope, Bound and Free Variable Occurrences
A lambda
form includes parameters and a body.
Within a lambda
expression, the body is the scope of the parameters - this means that all the variables that
occur within the body are intended to be bound to the parameters. This is the case even if there is preceding definition of a variable with the same name.
For example:
(define x 1)
(lambda (x) (* x x)) ;--> occurrences of x in the body refer
;to the parameter variable x
;and NOT to the global variable x.
In such a case, we say that the occurrences of x
in the body of the lambda
expression are bound occurrences.
Other variables are said to occur free.
For example, in the expression:
(lambda (f a b dx)
(* (sum f
(+ a (/ dx 2.0))
(lambda (x) (+ x dx))
b)
dx))
the variables f
, a
, b
, dx
all occur bound - while the variable sum
occurs free.
Deriving the Let Shortcut Notation
Consider the computation of the following function:
\[f(x,y) = x(1+xy)^2+y(1-y)+(1+xy)(1-y)\]Let us implement a Scheme procedure to compute this function:
(define f
(lambda (x y)
(+ (* x
(square (+ 1 (* x y))))
(* y
(- 1 y))
(* (+ 1 (* x y))
(- 1 y)))))
We observe that the same sub-expressions are repeated:
(+ 1 (* x y))
and (- 1 y)
.
This not only makes it more difficult to read the code - it also makes the computation slower - because these expressions will actually be computed twice each time the function is invoked.
One way to avoid these repetitions is to abstract away the repeated sub-expressions - in exactly the same way we abstracted away the repeated sub-expressions when defining the sum
abstraction.
Our abstraction mechanism involves defining new functions:
(define f1
(lambda (x y) (+ 1 (* x y))))
(define f2
(lambda (y) (- 1 y)))
Using these two helper functions, we can rewrite f
as:
(define f
(lambda (x y)
(+ (* x (square (f1 x y)))
(* y (f2 y))
(* (f1 x y) (f2 y)))))
The problem is that we need to define names for these helpers - and define new functions for each one.
Alternatively - we can proceed as follows:
Let \(a = 1+xy\), \(b = 1-y\), \(f(x,y) = xa^2+yb+ab\)
Which in Scheme is implemented as:
(define f
(lambda (x y)
((lambda (a b)
(+ (* x (square a))
(* y b)
(* a b)))
;; Values of a and b which depend on x and y
(+ 1 (* x y))
(- 1 y))
))
This structure is convenient - it allows us to define local variables a
and b
which are defined within the scope of x
and y
and are only used within a single expression.
The Let abbreviation
The syntactic form of the definition of local variables is not readable - because the value which initializes
the local variables (a
and b
) are far away from their declaration.
The Scheme language defines a syntactic abbreviation which is internally turned into the same lambda application form called the let
form to encourage programmers to use this construct.
The structure of the let
expression is:
(let ( (<var1> <exp1>)
(<var2> <exp2>)
...
(<varn> <expn>) )
<body>)
Internally, this form is replaced by the equivalent syntactic form:
( (lambda (<var1> ... <varn>) <body>)
<exp1> ... <expn> )
Hence its evaluation rule is already defined as per the rules of \(L3\) evaluation.
Still, it is useful to remember the way let
forms are evaluated:
- The let variables have scope over the body of the let only.
- Each
<vari>
is bound to the value of each<expi>
. - All the bindings are performed simultaneously - they do not depend on each other.
- The initial values
<expi>
are computed outside the scope of the let. - The evaluation of the let form involves the creation of closure value and its application to the initial values.
Procedures as Returned Values
We have seen so far examples of higher order procedures that receive procedures as parameters. Let us now review cases where procedures return computed procedures.
Returning a Closure
Let us first work out the mechanics of procedures returning procedure values:
We first start from an expression that has a concrete value (a number) and abstract it away on each of the variables that occur in the expression:
(define x 3)
(define y 0)
(+ x y y) ;--> 3
(lambda (x) (+ x y y)) ;--> #<Closure (x) (+ x 0 0)>
(lambda (y) (lambda (x) (+ x y y))) ;--> #<Closure (y) (lambda (x) (+ x y y))>
Now, let us apply these closure values:
( (lambda (x) (+ x y y)) 5 ) ;--> 5
( (lambda (y) (lambda (x) (+ x y y))) 2) ;--> #<Closure (x) (+ x 2 2)>
( ((lambda (y) (lambda (x) (+ x y y))) 2) 5) ;--> 9
The important point to notice is that when a Closure value is constructed, the value of the free variables is substituted
in the body of the closure. For example, in the second expression above, the closure value has no free variable y
- but the occurrences of y
have been substituted by 2.
Derivative Example
We design a function which given a numeric function as a parameter, returns a new function which computes a numerical approximation of the derivative of the function - using the formula: \(f'(x) = [f(x+dx)-f(x)]/dx\)
;; Signature: derive(f, dx)
;; Type: [ (Number->Number) * Number -> (Number->Number) ]
;; Purpose: Construct a function that computes a numerical approx
;; of the derivative of f with resolution dx.
(define derive
(lambda (f dx)
(lambda (x)
(/ (- (f (+ x dx))
(f x))
dx))))
When this function is evaluated, it returns a Closure value which remembers the value of dx
that was used.
(let [(f1 (derive square 0.001))
(f2 (derive square 0.1)) ]
(display (f1 2)) ;--> 4.000999999999699 (Real value is 4)
(f2 2)) ;--> 4.100000000000001
We can iterate this procedure to compute the nth derivative of a numerical function:
;; Signature: nth-deriv(f,n)
;; Type: [ (Number->Number) * Number -> (Number->Number) ]
;; Purpose: construct a function that computes a numerical approximation of the nth derivative of f
(define nth-deriv
(lambda (f n)
(lambda (x)
(if (= n 0)
(f x)
( (nth-deriv (derive f 0.0001) (- n 1)) x)))))
Consider an alternative definition:
(define nth-deriv-early
(lambda (f n)
(if (= n 0)
f
(derive (nth-deriv-early f (- n 1)) 0.0001))))
The time at which the closures are created is different. Trace the behavior and comment on which version is more desirable.
Summary
- To define a programming language, we provide:
- The syntax of the language - which defines the set of all possible expressions in the language.
- The set of all possible values that can be computed by the language.
- Computation rules for each type of expression.
- Scheme is a functional programming language which has a minimalistic definition. We defined a subset of Scheme characterized by the following syntax:
- Atomic syntactic expressions: literal numbers, literal booleans, a set of variables bound to primitive functions.
- All compound expressions are built using a list notation -
(e1 ... en)
- with spaces between the sub-expressions. - Compound expressions are grouped in 2 families: special forms and non-special forms.
- The special forms we defined are:
- define
- lambda
- if
- cond
- let
- The computed values are the union of the following types:
- number
- boolean
- primitive function
- closure
- pair
- list
- Computation rules define how to compute recursively the value of each type of expression in the syntax. They define the semantics of the language.
- Compound values in the Scheme subset we consider are of type Pair(T1,T2) and List(T).
- Pair(T1,T2) denotes the cartesian product of types T1 and T2.
- The value constructor for values of type Pair is the
cons
primitive function. - The accessors for values of type Pair are
car
andcdr
and type predicatepair?
. - List(T) is defined inductively as:
- Either empty-list (denoted ‘()).
- Or
(<val> . <list>)
where<val>
is in T and<list>
is in List(T).
- List(T) is implemented in Scheme using Union(Pair(T,List(T)), EmptyList).