101.05 - Lists

Functions on Lists

  • xs.length
  • xs.head, xs.tail, xs.init, xs.last
  • xs take n - first n elements of the list
  • xs drop n - the rest of the list after xs take n
  • xs(n)
  • xs ++ ys - concatenation
  • xs.reverse
  • xs updated (n, x)
  • xs indexOf x
  • xs contains x

Implementation examples:

def last[T](xs: List[T]): T = xs match {
  case List() => throw new Error("last of empty list")
  case List(x) => x
  case y :: ys => last(ys)
}

Complexity is proportional to the length of xs.

def init[T](xs: List[T]): List[T] = xs match {
  case List() => throw new Error("init of empty list")
  case List(x) => List()
  case y :: ys => y :: init(ys)
}

Complexity is proportional to the length of xs.

def concat[T](xs: List[T], ys: List[T]): List[T] = xs match {
  case List() => ys
  case z :: zs => z :: concat(zs, ys)
}

Complexity is proportional to the length of xs.

def reverse[T](xs: List[T]): List[T] = xs match {
  def List() => xs
  def y :: ys => reverse(ys) ++ List(y)
}

Complexity is (length of xs) x (length of xs), because of reversal AND concatenation.

def removeAt[T](xs: List[T], n: Int): List[T] = (xs take n) ::: (xs drop n+1)
}

Pairs and tuples

Example: implementing merge sort. This algorithm sorts a list by: splitting the list in two sub-lists, then sorting each sub-list and finally merging the two sorted sub-lists together.

def msort(xs: List[Int]): List[Int] = {
  val n = xs.length/2
  // if 0 or 1 elements in the list, the list is already sorted
  if (n == 0) xs
  else {
    def merge(xs: List[Int], ys: List[Int]): List[Int] = ???
    val (first, second) = xs splitAt n
    merge(msort(first), msort(second))
  }
}

A first implementation of merge():

def merge(xs: List[Int], ys: List[Int]) = 
    xs match {
      case Nil => ys
      case x :: xs1 => ys match {
        case Nil => xs
        case y :: ys1 =>
          if (x < y) x :: merge(xs1, ys)
          else y :: merge(xs, ys1)
      }
    }

In the following statement:

val (first, second) = xs splitAt n

(first, second) is a pair. Other example of a pair:

val pair = ("answer", 42)

Also works for tuples (T1, ..., Tn).

Tuples are classes modelled after the following pattern:

case class Tuple2[T1, T2](_1: + T1, _2: + T2) {
  override def toString = "(" + _1 + "," + _2 + ")"
}

Rewriting the merge() function with pattern matching over pairs:

def merge(xs: List[Int], ys: List[Int]): List[Int] = 
  (xs, ys) match {
    case (Nil, ys) => ys
    case (xs, Nil) => xs
    case (x :: xs1, y :: ys1) =>
      if (x < y) x :: merge(xs1, ys) else y :: merge(xs, ys1)
  }

Implicit parameters

The merge() function above works for List[Int] because we don't have a generic < operator that works for any type. How can this function be parameterized with a comparison function?

def msort[T](xs: List[T])(lt: (T, T) => Boolean): List[T] = {
  val n = xs.length/2
  // if 0 or 1 elements in the list, the list is already sorted
  if (n == 0) xs
  else {
    def merge(xs: List[Int], ys: List[Int]): List[Int] = 
      (xs, ys) match {
        case (Nil, ys) => ys
        case (xs, Nil) => xs
        case (x :: xs1, y :: ys1) =>
          if lt(x, y) x :: merge(xs1, ys) else y :: merge(xs, ys1)
      }
    val (first, second) = xs splitAt n
    merge(msort(first)(lt), msort(second)(lt))
  }
}

val nums = List(1, -4, 5, 2, 3)
msort(nums)((x:Int, y: Int) => x < y)

val fruits = List("apple", "pineapple", "orange", "banana")
msort(fruits)((x:String, y: String) => x.compareTo(y) < 0)

There already is an Ordering parametric class in the Scala library, so we can rewrite this as follows:

def msort[T](xs: List[T])(ord: Ordering[T]): List[T] = {
  val n = xs.length/2
  // if 0 or 1 elements in the list, the list is already sorted
  if (n == 0) xs
  else {
    def merge(xs: List[Int], ys: List[Int]): List[Int] = 
      (xs, ys) match {
        case (Nil, ys) => ys
        case (xs, Nil) => xs
        case (x :: xs1, y :: ys1) =>
          if ord.lt(x, y) x :: merge(xs1, ys) else y :: merge(xs, ys1)
      }
    val (first, second) = xs splitAt n
    merge(msort(first)(ord.lt), msort(second)(ord.lt))
  }
}

val nums = List(1, -4, 5, 2, 3)
msort(nums)(Ordering.Int)

val fruits = List("apple", "pineapple", "orange", "banana")
msort(fruits)(Ordering.String)

We don't need to pass any of the ordering functions around, if we choose to use implicit parameters:

def msort[T](xs: List[T])(implicit ord: Ordering[T]): List[T] = {
  val n = xs.length/2
  // if 0 or 1 elements in the list, the list is already sorted
  if (n == 0) xs
  else {
    def merge(xs: List[Int], ys: List[Int]): List[Int] = 
      (xs, ys) match {
        case (Nil, ys) => ys
        case (xs, Nil) => xs
        case (x :: xs1, y :: ys1) =>
          if ord.lt(x, y) x :: merge(xs1, ys) else y :: merge(xs, ys1)
      }
    val (first, second) = xs splitAt n
    merge(msort(first), msort(second))
  }
}

val nums = List(1, -4, 5, 2, 3)
msort(nums)

val fruits = List("apple", "pineapple", "orange", "banana")
msort(fruits)

Higher-order List functions

Usual patterns when working with lists:

  • transforming each element of the list in a certain way;
  • retrieving a sub-list of elements satisfying a criterion;
  • combining list elements using an operator;

Applying a function to elements of a List:

def scaleList(xs: List[Double], factor: Double): List[Double] = xs match {
  case Nil => xs
  case y :: ys => y * factor :: scaleList(ys, factor)
}

This can be generalised by a method map in the List class as follows:

abstract class List[T] { ...
  def map[U](f: T => U): List[U] = this match {
    case Nil => this
    case x : xs => f(x) :: xs.map(f)
  }
}

def scaleList(xs: List[Double], factor: Double) = 
    xs map (x => x * factor)

Another example, first using pattern matching, then using map:

def squareList(xs: List[Int]): List[Int] = xs match {
  case Nil => xs
  case y :: ys => (y * y) :: squareList(ys) 
}

def squareList(xs: List[Int]): List[Int] = xs map (x => x * x)

Extracting a sub-list containing all list elements satisfying a condition:

def posElements(xs: List[Int]): List[Int] = xs match {
  case Nil => xs
  case y :: ys => if (y > 0) y :: posElements(ys) else posElements(ys)
}

This can be generalised by a method filter in the List class:

abstract class List[T] {
  ...
  def filter(p: T => Boolean): List[T] = this match {
    case Nil => Nil
    case x :: xs => if (p(x)) x :: xs.filter(p) else xs.filter(p)
  }
}

def posElems(xs: List[Int]): List[Int] = xs filter (x => x > 0)

More methods to extract sub-lists based on predicates:

  • xs.filterNot(p)
  • xs.partion(p) - same as (xs.filter(p), xs.filterNot(p)) calculated in a singke traversal
  • xs.takeWhile(p) - the longest prefix of the list that matches the predicate
  • xs.dropWhile(p) - the remainder of the list after dropping the prefix matching the predicate
  • xs.span(p) - same as (xs.takeWhile(p), xs.dropWhile(p)) calculated in a single traversal

Example: having a list of characters List("a", "a", "a", "b", "c", "c", "a") we want to encode this list using the count for each sequence in the list: List(("a", 3), ("b", 1), ("c", 2), ("a", 1)). We do this in two steps, first building the intermediary list List(List("a", "a", "a"), List("b"), List("c", "c"), List("a")):

val data = List("a", "a", "a", "b", "c", "c", "a")

def pack[T](xs: List[T]): List(List[T]) = xs match {
  case Nil => Nil
  case x :: xs1 => 
    val (first, rest) = xs span (y => y == x)
    first :: pack(rest)
}

def encode[T](xs: List[T]): List[(T, Int)] = 
  pack(xs) map(ys => (ys.head, ys.length))

encode(data)

Reduction of Lists

"Fold" or "Reduce" combinators. They combine the adjacent elements of a list using a given operator.

Example:

sum(List(x1, ..., xn)) = 0 + x1 + ... + xn
product(List(x1, ..., xn)) = 1 * x1 * ... * xn
def sum(xs: List[Int]): Int = xs match {
  case List() => 0
  case y :: ys => y + sum(ys)
}

Can be generalised using the method reduceLeft:

List(x1, ..., xn) reduceLeft op = ( ... (x1 op x2) op ...) op xn
def sum(xs: List[Int]): Int = (0 :: xs) reduceLeft ((x, y) => x + y)
def product(xs: List[Int]): Int = (1 :: xs) reduceLeft ((x, y) => x * y)

or using underscores:

def sum(xs: List[Int]): Int = (0 :: xs) reduceLeft (_ + _)
def product(xs: List[Int]): Int = (1 :: xs) reduceLeft (_ * _)

A more general function, foldLeft takes an additional accumulator parameter:

(List(x1, ..., xn) foldLeft z) op = ( ... (z op x1) op ...) op xn
def sum(xs: List[Int]): Int = (xs foldLeft 0) (_ + _)
def product(xs: List[Int]): Int = (xs foldLeft 1) (_ * _)

Possible implementations of foldLeft and reduceLeft:

abstract class List[T] { ...

  def reduceLeft(op: (T, T) => T): T = this match {
    case Nil => throw new Error("Nil.reduceLeft")
    case x :: xs => (xs foldLeft x)(op)
  }

  def foldLeft[U](z: U)(op: (U, T) => U): U = this match {
    case Nil => z
    case x :: xs => (xs foldLeft op(z, x))(op)
  }
}

foldLeft and reduceLeft unfold on trees that reduce to the left. They have two complimentary operations: foldRight and leanRight that lean to the right.

List(x1, ... , x{n-1}, xn) reduceRight op 
  = x1 op ( ... (x{n-1} op xn) ... )

(List(x1, ... , x{n-1}, xn) foldRight acc) op 
  = x1 op ( ... (xn op acc) ... )

Possible implementations of foldRight and reduceRight:

abstract class List[T] { ...

  def reduceRight(op: (T, T) => T): T = this match {
    case Nil => throw new Error("Nil.reduceLeft")
    case x:: Nil => x
    case x :: xs => op(x, xs reduceRight(op))
  }

  def foldRight[U](z: U)(op: (T, U) => U): U = this match {
    case Nil => z
    case x :: xs => op(x, (xs foldRight z)(op))
  }
}

For operators that are associative and commutative, foldLeft and foldRight are equivalent.

Reasoning about concat

Considering the concat operation on lists ++, this operation is governed by two rules:

  • concatenation it is associative
  • Nil is a neutral element, at the left or at the right
(xs ++ ys) ++ zs = xs ++ (ys ++ zs)
xs ++ Nil = xs
Nil + xs = xs

This can be proved by structural induction on lists. Structural induction is derived from the natural induction that goes like this:

In order to prove a property P(n) for all integers n >=b, we will:

  • prove the base case P(b)
  • prove the induction step: if P(n) then P(n + 1) for n >= b

For example, given:

def factorial(n: Int): Int = {
  if (n == 0) 1               // 1st clause
  else n * factorial(n - 1)   // 2nd clause
}

show that for each n >= 4 we have:

factorial(n) >= power(2, n)
  • base case: factorial(4) = 24 > 16 = power(2, 4)
  • induction step: assuming that factorial(n) >= power(2, n) show that factorial(n + 1) >= power (2, n + 1) factorial (n + 1) = (n + 1) factorial(n) > 2 power(2, n) because:
    • factorial(n) > power(2, n) as assumed above
    • (n + 1) >= 5 for n >= 4 and 5 > 2

Structural induction is analogous to the natural induction. In order to show that a property P(xs) holds for all lists:

  • prove the base case: show that P(Nil) holds
  • prove the induction step: assuming that P(xs) holds show that P(x :: xs) holds for any x and any xs

Using structural induction we have to show that, for any lists xs, ys, zs:

(xs ++ ys) ++ zs = xs ++ (ys ++ zs)

Take the following definition of concat:

def concat[T](xs: List[T], ys: List[T]) = xs match {
  case Nil => ys
  case x :: xs1 => x :: concat(xs1, ys)
}

This can be distilled to the following two clauses:

Nil ++ ys = ys                          // 1st clause
(x :: xs1) ++ ys = x :: (xs1 ++ ys)     // 2nd clause

Structural induction:

  • base case: (Nil ++ ys) ++ zs = ys ++ zs Nil ++ (ys ++ zs) = ys ++ zs
  • induction step:
    assuming that (xs ++ ys) ++ zs = xs ++ (ys ++ zs) left: ((x :: xs) ++ ys) ++ zs = (x :: (xs ++ ys)) ++ zs = x :: ((xs ++ ys) ++ zs) = x :: (xs ++ (ys ++ zs)) right: (x :: xs) ++ (ys ++ zs) = x :: (xs ++ (ys ++ zs))

A larger equational proof on Lists

Example: using the reverse function to discuss about equational proofs:

Nil.reverse = Nil                               // 1st clause
(x :: xs).reverse = xs.reverse ++ List(x)       // 2nd clause

We will try to prove that:

xs.reverse.reverse = xs
  • base case: Nil.reverse.reverse = Nil.reverse = Nil

  • induction step:
    assuming that xs.reverse.reverse = xs left: (x :: xs).reverse.reverse = (xs.reverse ++ List(x)).reverse right: x :: xs = x :: xs.reverse.reverse

    We cannot simplify more, but still need to show that: (xs.reverse ++ List(x)).reverse = x :: xs.reverse.reverse

    We replace xs.reverse by ys at the left and right. ys is an arbitrary list. We now need to prove that: (ys ++ List(x)).reverse = x :: ys.reverse

  • base case: (Nil ++ List(x)).reverse = List(x).reverse = (x :: Nil).reverse = Nil.reverse ++ List(x) = Nil ++ (x :: Nil) = x :: Nil = x :: Nil.reverse

  • induction step:
    assuming that (ys ++ List(x)).reverse = x :: ys.reverse want to show that: ((y :: ys) ++ List(x)).reverse = x :: (y :: ys).reverse left: ((y :: ys) ++ List(x)).reverse = (y :: (ys ++ List(x)). reverse = (ys ++ List(x)).reverse ++ List(y) = (x :: ys.reverse) ++ List(y) = x :: (ys.reverse ++ List(y)) = x :: (y :: ys).reverse right: x :: (y :: ys).reverse