TypeScript: Complex Data Types, JSON, FP processing of JSON

PPL 2023

On the basis of the distinction atomic types / compound types, we can define more complex data types - that embed recursively maps and arrays. We introduce Javascript Object Notation (JSON) which has become a common language to exchange complex data values across systems (for example, between a client and a server or to store values in databases).

We present the different types of JSON values and finally, we describe how higher-order functional patterns make processing of JSON values convenient and expressive.

Literal Expressions in Programming Languages

In general, programming languages provide ways to specify values as part of expressions. Literal expressions are the syntactic representation of values that appear as part of expressions.

For example, consider the expression:

let v = 42;
    (v > 0) ? "ok" : "nok";

In this compound expression, the sub-expressions 42, 0, "ok" and "nok" are all syntactic representations of primitive values. These expressions are string encodings of values each according to the rules of the syntax of the language and to the type of the value (number, string, boolean).

Distinguish between literal expressions and the values they represent:

All programming languages provide syntax to write literal expressions for atomic values.

For compound values, languages like Java and C++ provide no or very limited ways to specify literal expressions. For example, in Java to bind a list to its value, the following syntax must be used:

ArrayList<String> names = new ArrayList<String>(Arrays.asList("avi", "bob", "chloe"));

In contrast, in JavaScript, one would write directly a compound array values as:

let names = ["avi", "bob", "chloe"];

The key difference is that the Java expression is an invocation of constructor methods, while in JavaScript compound values can be written directly. This difference becomes more significant as we consider even more complex value types.

Complex Data Values in JavaScript

By combining arrays and maps, one can construct expressive data structures in JavaScript. For example, we can encode values such as:

let person1 = {name:"avi", age:25},
    person2 = {name:"beate", age:23},
    csDept = { name:"CS", faculty:"Natural Sciences"},
    course = { name:"PPL", dept:csDept, students:[person1, person2]}
  course
// ==>
//      { name: 'PPL',
//        dept: { name: 'CS', faculty: 'Natural Sciences' },
//        students: [ { name: 'avi', age: 25 }, { name: 'beate', age: 23 } ] }

We can also write this value directly, without building it “piece by piece” as a compound literal value expression.

let ppl = { 
  name: 'PPL',
  dept: { name: 'CS', faculty: 'Natural Sciences' },
  students: [ { name: 'avi', age: 25 }, { name: 'beate', age: 23 } ] };
    ppl.students[0].name;
// =>     'avi'

In contrast, in Java, we would have to construct this complex value by invoking new on each sub-component, then consruct the value bottom up until the toplevel expression. We would also need to define classes to describe the type of each subexpression - Person, Dept, Course.

class Person { private String name; private int age; ...}
class Dept { private String name; private String faculty; ...}
class Course { private String name; private Dept dept; private ArrayList<Person> students; ...
    public void addStudent(Persont p) { students.add(p); }

    public static void main(String[] args) {
        Person person1 = new Person("avi", 25);
        Person person2 = new Person("beate", 23);
        Dept csDept = new Dept("cs", "Natural Sciences");
        Course ppl = new Course("ppl", csDept);
        ppl.addStudent(person1);
        ppl.addStudent(person2);
    }
}

The JavaScript handling of complex literal values is more concise. Conciseness is important because it encourages the programmer to use the facility.

But the main difference is that complex data structures can be defined without having to define each unit as a new type - they come with less “baggage”. The key benefit of the easy manipulation of complex values is that different processes can exchange complex values without sharing code. In JavaScript, this is enabled by the JSON interface.

The JSON Interface

JSON stands for Java Script Object Notation. It is a standard way to serialize JavaScript compound values into strings and vice-versa, parse JSON strings into compound JavaScript values.

The key difference between the way we described the encoding of compound literal expressions above and JSON is that:

JSON plays a central role in all data exchange in client-server communication, and serialization of values in databases. JSON is supported in a wide range of programming languages - not only in JavaScript. There are JSON libraries in Java, C++, Python etc.
JSON plays a role similar to XML in enabling data exchange across heterogeneous processes.

The key parts of JSON are the stringify() and parse() methods which form the JSON interface.

let person1 = { name : "Yosi", age : 31, city : "Beer Sheva" },
    person1JSON = JSON.stringify(person1);
  console.log(person1JSON)
  console.log(`person1JSON is of type ${typeof person1JSON}`)
// =>
//     {"name":"Yosi","age":31,"city":"Beer Sheva"}
//     person1JSON is of type string
let person2 = JSON.parse(person1JSON)
  console.log(person2)
  console.log(`person2 is of type ${typeof person2}`)
// =>
//     { name: 'Yosi', age: 31, city: 'Beer Sheva' }
//     person2 is of type object

JSON Syntax

The JSON syntax (which means the way complex values are serialized into strings) is a bit more “strict” than the syntax in JavaScript maps: all keys must be written with double quotes so that you need to write { "a":1 } and not { a:1 }.

The JSON notation supports values of the following data types:

In JavaScript, values can be all of the above, plus any other valid JavaScript expression, including:

The JSON interface defines the following two methods:

JSON.stringify() and JSON.parse() also work on atomic values:

console.log(typeof(JSON.parse('2')))
console.log(typeof(JSON.parse('true')))
console.log(typeof(JSON.parse('"abc"')))
// =>
// number
// boolean
// string

Complex Compound Data

We have described compound data as data combined using two container types - arrays and maps.

These two container types can be combined in a recursive manner - so that arrays can contain other arrays or maps, and maps can contain other maps or arrays recursively.

For example:

let dept = {"deptName":"accounting",
            "employees": [
                    { "firstName":"John", "lastName":"Doe" },
                    { "firstName":"Anna", "lastName":"Smith" },
                    { "firstName":"Peter", "lastName":"Jones" }
                ]};

The dept value is a map which has a key employees with a value which is an embedded array whose items are nested maps.

To take apart the components of this value, the getters can be combined:

dept.employees[0].firstName
// =>     'John'
dept["employees"][0]["lastName"]
// =>     'Doe'

Similarly, arrays can contain nested arrays or maps.

JavaScript Arrays are heterogeneous - they can contain items of different types.

let heteroArr = [1, true, "a"],
    nestedArr = [1, [2, 3], {a:1, b:2}]
  console.log("heteroArr contains items of the following types: ")
  heteroArr.forEach(x => console.log(typeof x))
  console.log("nestedArr contains items of the following types: ")
  nestedArr.forEach(x => console.log(typeof x))
// =>
//    heteroArr contains items of the following types: 
//    number
//    boolean
//    string
//    nestedArr contains items of the following types: 
//    number
//    object
//    object

Map and Array Mutators

In JavaScript, compound values are mutable. This means that, in addition to accessing parts of a compound value, JavaScript allows us:

The fact that variables and the basic data types in JavaScript are mutable makes real functional programming difficult - since FP requires variables and compound data structures to be immutable. There are libraries, though, that enable the definition of immutable data structures and their efficient manipulation - such as https://facebook.github.io/immutable-js/.

dept
// =>
//    { deptName: 'accounting',
//      employees: 
//       [ { firstName: 'John', lastName: 'Doe' },
//         { firstName: 'Anna', lastName: 'Smith' },
//         { firstName: 'Peter', lastName: 'Jones' } ] }
dept.deptName = 'programming'
delete(dept.employees[0].lastName)
dept
// =>
//    { deptName: 'programming',
//      employees: 
//       [ { firstName: 'John' },
//         { firstName: 'Anna', lastName: 'Smith' },
//         { firstName: 'Peter', lastName: 'Jones' } ] }
let mutableArr = [0, 1, 2]
  mutableArr[0] = 1
  mutableArr
// =>     [ 1, 1, 2 ]

Array Primitive Mutators

JavaScript Arrays have many built-in primitives that allow mutation or copy of array values. It is sometimes difficult to keep track of which operations modify the array (mutators) and which operations create new array values.

For example, sort() and reverse() are mutators:

let a1 = [1,5,3,2,4]
    console.log(a1.sort())
    console.log(a1)
// =>
//    [ 1, 2, 3, 4, 5 ]
//    [ 1, 2, 3, 4, 5 ]

As you see above, after the sort() method is invoked on an array, the array is modified in place. sort() is a mutator. Similarly, reverse() is a mutator:

let a2 = [1,2,3,4]
    console.log(a2.reverse())
    console.log(a2)
// =>
//  [ 4, 3, 2, 1 ]
//  [ 4, 3, 2, 1 ]

The methods push() and pop() are also mutators:

a1.pop()
// =>    5
a1
// =>   [ 1, 2, 3, 4 ]
a1.push(5)
// => 5
a1
// =>     [ 1, 2, 3, 4, 5 ]

The methods shift() and unshift() are similar to pop() and push() but operate on the beginning of the array (while push and pop operate on the end of the array). They are also mutators:

a1.shift()
// => 1
a1
// =>  [ 2, 3, 4, 5 ]
a1.unshift(1)
// =>  5
a1
// =>    [ 1, 2, 3, 4, 5 ]

In contrast, the concat() method returns a copy of the original arrays and does not modifiy its arguments:

a1.concat(a2)
// =>     [ 1, 2, 3, 4, 5, 4, 3, 2, 1 ]
a1
// =>    [ 1, 2, 3, 4, 5 ]
a2
// =>    [ 4, 3, 2, 1 ]

Useful Array Methods

The following methods are included in the Array interface:

let am = [1,2,3,4,5,6]
    console.log(`am has ${am.length} items`)
    console.log(`am.includes(2) is ${am.includes(2)}`)
    console.log(`am.indexOf(2) is ${am.indexOf(2)}`)
    console.log(`am.join() is ${am.join()}`)
    console.log(`am.slice(2,4) is ${am.slice(2,4)}`)
    console.log(`am.splice(2) is ${am.splice(2)}`)
    console.log(`am is [${am}] after splice(2)`)
    am has 6 items
    am.includes(2) is true
    am.indexOf(2) is 1
    am.join() is 1,2,3,4,5,6
    am.slice(2,4) is 3,4
    am.splice(2) is 3,4,5,6
    am is [1,2] after splice(2)

Higher Order Methods on Arrays and JSON

The Array interface includes higher order methods - that is, methods which receive a function as an argument. These methods are extremely versatile:

Of these many methods, the most important are:

The importance of these functions is that they allow us to manipulate complex values as a whole, instead of writing code that iterates (loops) over each item. This gives a declarative flavor to the code that manipulates JSON values - similar to SQL over relational data.

Consider the following example JSON value:

const people = [
  {
    name: 'Avi',
    age: 23,
    gender: 'm',
    cs: true
  },
  {
    name: 'Ben',
    age: 25,
    gender: 'm',
    cs: false,
  },
   {
    name: 'Gila',
    age: 24,
    gender: 'f',
    cs: true,
  },
   {
    name: 'Dalia',
    age: 27,
    gender: 'f',
    cs: false,
  },
];

We can use filter() with appropriate predicates to perform operations similar to a select in SQL:

// Select the people who are CS students
people.filter(p => p.cs)
// =>
//    [ { name: 'Avi', age: 23, gender: 'm', cs: true },
//      { name: 'Gila', age: 24, gender: 'f', cs: true } ]

// Select people by gender
people.filter(p => p.gender == 'f')
// =>
//    [ { name: 'Gila', age: 24, gender: 'f', cs: true },
//      { name: 'Dalia', age: 27, gender: 'f', cs: false } ]

map can be used to compute transformations of the input items. For example, assume we want to compute the year of birth of each person, we can compute:

people.map(p => 2021 - p.age)
// =>    [ 1998, 1996, 1997, 1994 ]

If we want to modify the maps of each person, we can use mutation:

// When a fat-array function body contains more than one expression - we must put the body in {} and use "return"
// When the body has only one expression, we can skip both (x => x*x)
people.map(p => {p.birthYear = 2021-p.age; return p})
// =>
//  [ { name: 'Avi', age: 23, gender: 'm', cs: true, birthYear: 1998 },
//    { name: 'Ben', age: 25, gender: 'm', cs: false, birthYear: 1996 },
//    { name: 'Gila', age: 24, gender: 'f', cs: true, birthYear: 1997 },
//    { name: 'Dalia', age: 27, gender: 'f', cs: false, birthYear: 1994 } ]

As we can see, the application of a function with mutation through map modified the original JSON objec

people
// =>
//    [ { name: 'Avi', age: 23, gender: 'm', cs: true, birthYear: 1998 },
//      { name: 'Ben', age: 25, gender: 'm', cs: false, birthYear: 1996 },
//      { name: 'Gila', age: 24, gender: 'f', cs: true, birthYear: 1997 },
//      { name: 'Dalia', age: 27, gender: 'f', cs: false, birthYear: 1994 } ]

map() and filter() return an array based on the original array.

Reduce

The reduce() method transforms the array into a single value through iterative application through each item. It applies a function against an accumulator and each value of the array (from left-to-right) to reduce it to a single value.

reduce() is often called fold() or accumulate() in other FP languages.

It iterates over a list, applying a function to an accumulated value and the next item in the list, until the iteration is complete and the accumulated value gets returned.

reduce() takes a reducer function and an initial value, and returns the accumulated value. The reducer function receives two arguments: the current accumulator and the current item.

array.reduce(
  reducer: (accumulator: any, current: any) => any,
  initialValue: any
) => accumulator: any

This is useful for any type of aggregation which would be similar to a group by statement in SQL over the array.

// Sum of all elements in the array
[2,4,6].reduce((acc, n) => acc + n, 0);
// => 12

For each element in the array, the reducer is called and passed the accumulator and the current value. The reducer’s job is to “fold” the current value into the accumulated value. The reducer returns the new accumulated value, and reduce() moves on to the next value in the array. The reducer needs an initial value to start with, so the initial value is passed as a parameter to reduce().

NOTE: In general, it makes sense that the initial value should be a neutral element for the accumulator function.

In the case of our sum, the first time the reducer is called, acc starts with 0 - the value we passed to reduce() as the second parameter.

The reducer returns 0 + 2 (2 was the first element in the array), which is 2.

For the next call, acc = 2, n = 4 and the reducer returns the result of 2 + 4 (6).

In the last iteration, acc = 6, n = 6, and the reducer returns 12.

Since the iteration is finished, .reduce() returns the final accumulated value, 12.

In this case, we passed in an anonymous reducing function, but we can abstract it and give it a name:

const sumReducer = (acc, n) => acc + n;
[2,4,6].reduce(sumReducer, 0);
// => 12

If we want to trace the behavior of the reducer, we can add calls to log:

const tracingSumReducer = (acc, n) => { console.log(`Reduce ${acc} + ${n}`); return acc+n; }
[2, 4, 6].reduce(tracingSumReducer, 0);
// =>
//  Reduce 0 + 2
//  Reduce 2 + 4
//  Reduce 6 + 6
//  12

Reduce can map to values of different types than those present in the array on which it iterates:

// Sum age of all persons in the people array
people.reduce((acc, p) => acc + p.age, 0)
// =>    99

And reduce can return compound values as well:

// Count number of men and women
people.reduce(
    (counter, p) => {
        p.gender == 'm' ? counter.men++ : counter.women++;
        return counter;
    }, {men:0, women:0})
// =>     { men: 2, women: 2 }

Summary

Literal Expressions and Complex Values

JSON

Exercises

Map, Filter, Reduce

  1. Define a function product(arrayNumbers) which returns the product of all the numbers in an array of numbers. Example:
    product([1,2,3]) -> 6
    

    What should be the value of product([])? Justify.

  2. Using the Ramda function range (http://ramdajs.com/docs/#range) - define a version of the factorial() function using reduce().
    import * as R from 'ramda'; 
    R.range(1,5)
    [ 1,2,3,4 ]
    
  3. Define map() using reduce().

  4. Define filter() using reduce().

JSON

  1. Run the exercises in http://reactivex.io/learnrx/

  2. Consider the following JSON value (obtained from a call to a Netflix API):

let movieLists = [
		{
			name: "New Releases",
			videos: [
				{
					"id": 70111470,
					"title": "Die Hard",
					"boxart": "http://cdn-0.nflximg.com/images/2891/DieHard.jpg",
					"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
					"rating": 4.0,
					"bookmark": []
				},
				{
					"id": 654356453,
					"title": "Bad Boys",
					"boxart": "http://cdn-0.nflximg.com/images/2891/BadBoys.jpg",
					"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
					"rating": 5.0,
					"bookmark": [{ id: 432534, time: 65876586 }]
				}
			]
		},
		{
			name: "Dramas",
			videos: [
				{
					"id": 65432445,
					"title": "The Chamber",
					"boxart": "http://cdn-0.nflximg.com/images/2891/TheChamber.jpg",
					"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
					"rating": 4.0,
					"bookmark": []
				},
				{
					"id": 675465,
					"title": "Fracture",
					"boxart": "http://cdn-0.nflximg.com/images/2891/Fracture.jpg",
					"uri": "http://api.netflix.com/catalog/titles/movies/70111470",
					"rating": 5.0,
					"bookmark": [{ id: 432534, time: 65876586 }]
				}
			]
		}
	]

This JSON value is a sort of tree with two levels - with a list of video objects at the 2nd level.

Using only map(), filter() and reduce() - compute the following values: