TypeScript: Complex Data Types, JSON, FP processing of JSON
PPL 2023
On the basis of the distinction atomic types / compound types, we can define more complex data types - that embed recursively maps and arrays. We introduce Javascript Object Notation (JSON) which has become a common language to exchange complex data values across systems (for example, between a client and a server or to store values in databases).
We present the different types of JSON values and finally, we describe how higher-order functional patterns make processing of JSON values convenient and expressive.
Literal Expressions in Programming Languages
In general, programming languages provide ways to specify values as part of expressions. Literal expressions are the syntactic representation of values that appear as part of expressions.
For example, consider the expression:
let v = 42;
(v > 0) ? "ok" : "nok";
In this compound expression, the sub-expressions 42
, 0
, "ok"
and "nok"
are all syntactic representations of
primitive values. These expressions are string encodings of values each according to the rules of the
syntax of the language and to the type of the value (number, string, boolean).
Distinguish between literal expressions and the values they represent:
- Literal expressions are part of the syntax of a programming language; they are evaluated.
- Values are the result of the evaluation; they are part of the semantics of the language.
All programming languages provide syntax to write literal expressions for atomic values.
For compound values, languages like Java and C++ provide no or very limited ways to specify literal expressions. For example, in Java to bind a list to its value, the following syntax must be used:
ArrayList<String> names = new ArrayList<String>(Arrays.asList("avi", "bob", "chloe"));
In contrast, in JavaScript, one would write directly a compound array values as:
let names = ["avi", "bob", "chloe"];
The key difference is that the Java expression is an invocation of constructor methods, while in JavaScript compound values can be written directly. This difference becomes more significant as we consider even more complex value types.
Complex Data Values in JavaScript
By combining arrays and maps, one can construct expressive data structures in JavaScript. For example, we can encode values such as:
let person1 = {name:"avi", age:25},
person2 = {name:"beate", age:23},
csDept = { name:"CS", faculty:"Natural Sciences"},
course = { name:"PPL", dept:csDept, students:[person1, person2]}
course
// ==>
// { name: 'PPL',
// dept: { name: 'CS', faculty: 'Natural Sciences' },
// students: [ { name: 'avi', age: 25 }, { name: 'beate', age: 23 } ] }
We can also write this value directly, without building it “piece by piece” as a compound literal value expression.
let ppl = {
name: 'PPL',
dept: { name: 'CS', faculty: 'Natural Sciences' },
students: [ { name: 'avi', age: 25 }, { name: 'beate', age: 23 } ] };
ppl.students[0].name;
// => 'avi'
In contrast, in Java, we would have to construct this complex value by invoking new
on each sub-component, then consruct the value bottom up until the toplevel expression. We would also need to define classes to describe the type of each subexpression - Person
, Dept
, Course
.
class Person { private String name; private int age; ...}
class Dept { private String name; private String faculty; ...}
class Course { private String name; private Dept dept; private ArrayList<Person> students; ...
public void addStudent(Persont p) { students.add(p); }
public static void main(String[] args) {
Person person1 = new Person("avi", 25);
Person person2 = new Person("beate", 23);
Dept csDept = new Dept("cs", "Natural Sciences");
Course ppl = new Course("ppl", csDept);
ppl.addStudent(person1);
ppl.addStudent(person2);
}
}
The JavaScript handling of complex literal values is more concise. Conciseness is important because it encourages the programmer to use the facility.
But the main difference is that complex data structures can be defined without having to define each unit as a new type - they come with less “baggage”. The key benefit of the easy manipulation of complex values is that different processes can exchange complex values without sharing code. In JavaScript, this is enabled by the JSON interface.
The JSON Interface
JSON stands for Java Script Object Notation. It is a standard way to serialize JavaScript compound values into strings and vice-versa, parse JSON strings into compound JavaScript values.
The key difference between the way we described the encoding of compound literal expressions above and JSON is that:
- Compound literal expressions are part of the syntax of the JavaScript language. A compound literal expression is a sub-expression of a JavaScript program - which when it is evaluated yields a compound value. Compound literal expressions are read by the parser of the programming language when a program is loaded, interpreted or compiled.
- JSON is a runtime mechanism which permits a process to read and write compound values (from files or from sockets).
JSON plays a central role in all data exchange in client-server communication, and serialization of values in databases.
JSON is supported in a wide range of programming languages - not only in JavaScript. There are JSON libraries in Java, C++, Python etc.
JSON plays a role similar to XML in enabling data exchange across heterogeneous processes.
The key parts of JSON are the stringify()
and parse()
methods which form the JSON interface.
let person1 = { name : "Yosi", age : 31, city : "Beer Sheva" },
person1JSON = JSON.stringify(person1);
console.log(person1JSON)
console.log(`person1JSON is of type ${typeof person1JSON}`)
// =>
// {"name":"Yosi","age":31,"city":"Beer Sheva"}
// person1JSON is of type string
let person2 = JSON.parse(person1JSON)
console.log(person2)
console.log(`person2 is of type ${typeof person2}`)
// =>
// { name: 'Yosi', age: 31, city: 'Beer Sheva' }
// person2 is of type object
JSON Syntax
The JSON syntax (which means the way complex values are serialized into strings) is a bit more “strict” than the syntax in JavaScript maps: all keys must be written with double quotes so that you need to write { "a":1 }
and not { a:1 }
.
The JSON notation supports values of the following data types:
- string
- number
- boolean
- null
- maps whose key values are JSON objects (recursively)
- array of JSON values (recursively)
In JavaScript, values can be all of the above, plus any other valid JavaScript expression, including:
- functions
- undefined
The JSON interface defines the following two methods:
JSON.stringify(o)
maps a value to a string.JSON.parse(s)
maps a string (written according to the JSON syntax) to a value.
JSON.stringify()
and JSON.parse()
also work on atomic values:
console.log(typeof(JSON.parse('2')))
console.log(typeof(JSON.parse('true')))
console.log(typeof(JSON.parse('"abc"')))
// =>
// number
// boolean
// string
Complex Compound Data
We have described compound data as data combined using two container types - arrays and maps.
These two container types can be combined in a recursive manner - so that arrays can contain other arrays or maps, and maps can contain other maps or arrays recursively.
For example:
let dept = {"deptName":"accounting",
"employees": [
{ "firstName":"John", "lastName":"Doe" },
{ "firstName":"Anna", "lastName":"Smith" },
{ "firstName":"Peter", "lastName":"Jones" }
]};
The dept
value is a map which has a key employees
with a value which is an embedded array whose items are nested maps.
To take apart the components of this value, the getters can be combined:
dept.employees[0].firstName
// => 'John'
dept["employees"][0]["lastName"]
// => 'Doe'
Similarly, arrays can contain nested arrays or maps.
JavaScript Arrays are heterogeneous - they can contain items of different types.
let heteroArr = [1, true, "a"],
nestedArr = [1, [2, 3], {a:1, b:2}]
console.log("heteroArr contains items of the following types: ")
heteroArr.forEach(x => console.log(typeof x))
console.log("nestedArr contains items of the following types: ")
nestedArr.forEach(x => console.log(typeof x))
// =>
// heteroArr contains items of the following types:
// number
// boolean
// string
// nestedArr contains items of the following types:
// number
// object
// object
Map and Array Mutators
In JavaScript, compound values are mutable. This means that, in addition to accessing parts of a compound value, JavaScript allows us:
- to change the value of components of a larger compound value and
- to remove components of larger compound values.
The fact that variables and the basic data types in JavaScript are mutable makes real functional programming difficult - since FP requires variables and compound data structures to be immutable. There are libraries, though, that enable the definition of immutable data structures and their efficient manipulation - such as https://facebook.github.io/immutable-js/.
dept
// =>
// { deptName: 'accounting',
// employees:
// [ { firstName: 'John', lastName: 'Doe' },
// { firstName: 'Anna', lastName: 'Smith' },
// { firstName: 'Peter', lastName: 'Jones' } ] }
dept.deptName = 'programming'
delete(dept.employees[0].lastName)
dept
// =>
// { deptName: 'programming',
// employees:
// [ { firstName: 'John' },
// { firstName: 'Anna', lastName: 'Smith' },
// { firstName: 'Peter', lastName: 'Jones' } ] }
let mutableArr = [0, 1, 2]
mutableArr[0] = 1
mutableArr
// => [ 1, 1, 2 ]
Array Primitive Mutators
JavaScript Arrays have many built-in primitives that allow mutation or copy of array values. It is sometimes difficult to keep track of which operations modify the array (mutators) and which operations create new array values.
For example, sort()
and reverse()
are mutators:
let a1 = [1,5,3,2,4]
console.log(a1.sort())
console.log(a1)
// =>
// [ 1, 2, 3, 4, 5 ]
// [ 1, 2, 3, 4, 5 ]
As you see above, after the sort()
method is invoked on an array, the array is modified in place.
sort()
is a mutator. Similarly, reverse()
is a mutator:
let a2 = [1,2,3,4]
console.log(a2.reverse())
console.log(a2)
// =>
// [ 4, 3, 2, 1 ]
// [ 4, 3, 2, 1 ]
The methods push()
and pop()
are also mutators:
a1.pop()
// => 5
a1
// => [ 1, 2, 3, 4 ]
a1.push(5)
// => 5
a1
// => [ 1, 2, 3, 4, 5 ]
The methods shift()
and unshift()
are similar to pop()
and push()
but operate on the beginning of the array (while push and pop operate on the end of the array). They are also mutators:
a1.shift()
// => 1
a1
// => [ 2, 3, 4, 5 ]
a1.unshift(1)
// => 5
a1
// => [ 1, 2, 3, 4, 5 ]
In contrast, the concat()
method returns a copy of the original arrays and does not modifiy its arguments:
a1.concat(a2)
// => [ 1, 2, 3, 4, 5, 4, 3, 2, 1 ]
a1
// => [ 1, 2, 3, 4, 5 ]
a2
// => [ 4, 3, 2, 1 ]
Useful Array Methods
The following methods are included in the Array interface:
length
: returns the number of elements in the arrayincludes(x)
: returns true if x is an element in the array, false otherwiseindexOf(x)
: returns the index of x within the array, -1 if x is not in the arrayjoin(delimiter)
: returns a string with all the items of the array serialized with delimiter between them.slice(start, end)
: returns a shallow copy of a sub-sequence from the array.splice(index)
: splits the array at the index position, and returns the suffix of the array after index (mutator).
let am = [1,2,3,4,5,6]
console.log(`am has ${am.length} items`)
console.log(`am.includes(2) is ${am.includes(2)}`)
console.log(`am.indexOf(2) is ${am.indexOf(2)}`)
console.log(`am.join() is ${am.join()}`)
console.log(`am.slice(2,4) is ${am.slice(2,4)}`)
console.log(`am.splice(2) is ${am.splice(2)}`)
console.log(`am is [${am}] after splice(2)`)
am has 6 items
am.includes(2) is true
am.indexOf(2) is 1
am.join() is 1,2,3,4,5,6
am.slice(2,4) is 3,4
am.splice(2) is 3,4,5,6
am is [1,2] after splice(2)
Higher Order Methods on Arrays and JSON
The Array interface includes higher order methods - that is, methods which receive a function as an argument. These methods are extremely versatile:
a.sort((a,b) => a > b)
: sort the elements with the comparator passed as a parameter (mutator).a.map(x => x*x)
: return a new array of the same length asa
[f(x) for x in a]a.filter(pred)
: return a new array with all elementsx
ina
that satisfypred(x)
a.find(pred)
: return the first elementx
ofa
which satisfiespred(x)
(similar tofilter
but returns only one item)a.findIndex(pred)
: same asfind
, but returns the position of the first itemx
ina
which satisfiespred(x)
a.every(pred)
: returntrue
if all elements ina
satisfypred
a.some(pred)
: returntrue
if at least one element ina
satisfiespred
a.forEach(f)
: applies the functionf
on all elements ofa
- useful only whenf
has side effects.a.reduce((acc, item) => transformer, init)
: transforms the array by applying the same transformer function on each item ina
and accumulating the successive results of the transformer.
Of these many methods, the most important are:
map
filter
reduce
The importance of these functions is that they allow us to manipulate complex values as a whole, instead of writing code that iterates (loops) over each item. This gives a declarative flavor to the code that manipulates JSON values - similar to SQL over relational data.
Consider the following example JSON value:
const people = [
{
name: 'Avi',
age: 23,
gender: 'm',
cs: true
},
{
name: 'Ben',
age: 25,
gender: 'm',
cs: false,
},
{
name: 'Gila',
age: 24,
gender: 'f',
cs: true,
},
{
name: 'Dalia',
age: 27,
gender: 'f',
cs: false,
},
];
We can use filter()
with appropriate predicates to perform operations similar to a select
in SQL:
// Select the people who are CS students
people.filter(p => p.cs)
// =>
// [ { name: 'Avi', age: 23, gender: 'm', cs: true },
// { name: 'Gila', age: 24, gender: 'f', cs: true } ]
// Select people by gender
people.filter(p => p.gender == 'f')
// =>
// [ { name: 'Gila', age: 24, gender: 'f', cs: true },
// { name: 'Dalia', age: 27, gender: 'f', cs: false } ]
map
can be used to compute transformations of the input items.
For example, assume we want to compute the year of birth of each person, we can compute:
people.map(p => 2021 - p.age)
// => [ 1998, 1996, 1997, 1994 ]
If we want to modify the maps of each person, we can use mutation:
// When a fat-array function body contains more than one expression - we must put the body in {} and use "return"
// When the body has only one expression, we can skip both (x => x*x)
people.map(p => {p.birthYear = 2021-p.age; return p})
// =>
// [ { name: 'Avi', age: 23, gender: 'm', cs: true, birthYear: 1998 },
// { name: 'Ben', age: 25, gender: 'm', cs: false, birthYear: 1996 },
// { name: 'Gila', age: 24, gender: 'f', cs: true, birthYear: 1997 },
// { name: 'Dalia', age: 27, gender: 'f', cs: false, birthYear: 1994 } ]
As we can see, the application of a function with mutation through map
modified the original JSON objec
people
// =>
// [ { name: 'Avi', age: 23, gender: 'm', cs: true, birthYear: 1998 },
// { name: 'Ben', age: 25, gender: 'm', cs: false, birthYear: 1996 },
// { name: 'Gila', age: 24, gender: 'f', cs: true, birthYear: 1997 },
// { name: 'Dalia', age: 27, gender: 'f', cs: false, birthYear: 1994 } ]
map()
and filter()
return an array based on the original array.
Reduce
The reduce()
method transforms the array into a single value through iterative application through each item.
It applies a function against an accumulator and each value of the array (from left-to-right) to reduce it to a single value.
reduce()
is often called fold()
or accumulate()
in other FP languages.
It iterates over a list, applying a function to an accumulated value and the next item in the list, until the iteration is complete and the accumulated value gets returned.
reduce()
takes a reducer function and an initial value, and returns the accumulated value.
The reducer function receives two arguments: the current accumulator and the current item.
array.reduce(
reducer: (accumulator: any, current: any) => any,
initialValue: any
) => accumulator: any
This is useful for any type of aggregation which would be similar to a group by
statement in SQL over the array.
// Sum of all elements in the array
[2,4,6].reduce((acc, n) => acc + n, 0);
// => 12
For each element in the array, the reducer is called and passed the accumulator and the current value.
The reducer’s job is to “fold” the current value into the accumulated value.
The reducer returns the new accumulated value, and reduce() moves on to the next value in the array.
The reducer needs an initial value to start with, so the initial value is passed as a parameter to reduce()
.
NOTE: In general, it makes sense that the initial value should be a neutral element for the accumulator function.
In the case of our sum, the first time the reducer is called, acc starts with 0 - the value we passed to reduce()
as the second parameter.
The reducer returns 0 + 2 (2 was the first element in the array), which is 2.
For the next call, acc = 2, n = 4 and the reducer returns the result of 2 + 4 (6).
In the last iteration, acc = 6, n = 6, and the reducer returns 12.
Since the iteration is finished, .reduce() returns the final accumulated value, 12.
In this case, we passed in an anonymous reducing function, but we can abstract it and give it a name:
const sumReducer = (acc, n) => acc + n;
[2,4,6].reduce(sumReducer, 0);
// => 12
If we want to trace the behavior of the reducer, we can add calls to log:
const tracingSumReducer = (acc, n) => { console.log(`Reduce ${acc} + ${n}`); return acc+n; }
[2, 4, 6].reduce(tracingSumReducer, 0);
// =>
// Reduce 0 + 2
// Reduce 2 + 4
// Reduce 6 + 6
// 12
Reduce can map to values of different types than those present in the array on which it iterates:
// Sum age of all persons in the people array
people.reduce((acc, p) => acc + p.age, 0)
// => 99
And reduce can return compound values as well:
// Count number of men and women
people.reduce(
(counter, p) => {
p.gender == 'm' ? counter.men++ : counter.women++;
return counter;
}, {men:0, women:0})
// => { men: 2, women: 2 }
Summary
Literal Expressions and Complex Values
- Programming languages define as part of their syntax ways to describe literal expressions which are evaluated to values.
- JavaScript (and most dynamic programming languages which do not require type declarations for all variables) allow programmers to write literal compound expressions which encode compound values.
- Compound values can be embedded recursively to yield complex data structures.
- JavaScript variables and data structures are mutable.
JSON
- JSON is a string serialization of JavaScript values (atomic and compound).
- The JSON interface includes 2 functions:
JSON.stringify()
andJSON.parse()
- JSON values can include embedded arrays and maps in a recursive manner - down to atomic values in the leaves.
- JSON compound values can be taken apart using composed accessors - for example
dept.employees[0].firstName
. - JSON compound values can be mutated in place using composed mutators - for example,
dept.employees[0].firstName = 'avi'
- Components within a JSON compound value can be removed using the
delete()
operator. - Arrays include primitive methods to access and mutate them:
length, includes(x), indexOf(x), join(delimiter), slice(start, end), splice(index)
. - Arrays include higher-order methods which receive functions as parameters. In particular,
map(), filter()
andreduce()
are extremely useful.
Exercises
Map, Filter, Reduce
- Define a function
product(arrayNumbers)
which returns the product of all the numbers in an array of numbers. Example:product([1,2,3]) -> 6
What should be the value of
product([])
? Justify. - Using the Ramda function range (http://ramdajs.com/docs/#range) - define a version of the
factorial()
function usingreduce()
.import * as R from 'ramda'; R.range(1,5) [ 1,2,3,4 ]
-
Define
map()
usingreduce()
. - Define
filter()
usingreduce()
.
JSON
-
Run the exercises in http://reactivex.io/learnrx/
-
Consider the following JSON value (obtained from a call to a Netflix API):
let movieLists = [
{
name: "New Releases",
videos: [
{
"id": 70111470,
"title": "Die Hard",
"boxart": "http://cdn-0.nflximg.com/images/2891/DieHard.jpg",
"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
"rating": 4.0,
"bookmark": []
},
{
"id": 654356453,
"title": "Bad Boys",
"boxart": "http://cdn-0.nflximg.com/images/2891/BadBoys.jpg",
"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
"rating": 5.0,
"bookmark": [{ id: 432534, time: 65876586 }]
}
]
},
{
name: "Dramas",
videos: [
{
"id": 65432445,
"title": "The Chamber",
"boxart": "http://cdn-0.nflximg.com/images/2891/TheChamber.jpg",
"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
"rating": 4.0,
"bookmark": []
},
{
"id": 675465,
"title": "Fracture",
"boxart": "http://cdn-0.nflximg.com/images/2891/Fracture.jpg",
"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
"rating": 5.0,
"bookmark": [{ id: 432534, time: 65876586 }]
}
]
}
]
This JSON value is a sort of tree with two levels - with a list of video objects at the 2nd level.
Using only map(), filter()
and reduce()
- compute the following values:
- The number of movies in the “Dramas” category
- The number of movies (in all categories)
- The list of the names and urls of movies with a rating of 5.0