Scala Saturday – The Stream.takeWhile Method

When you want a set number of items from the beginning of a stream, you use Stream.take. When you don’t care so much about the number, but rather the nature of the items, use Stream.takeWhile.

Stream.takeWhile starts at the beginning of the stream and applies a predicate to each item, one by one. It returns each item in a new stream until it reaches an item that does not meet the predicate:

[Figure: Taking Items from a Stream While a Predicate Holds. Stream.takeWhile takes items from the beginning of a stream until it reaches an item that does not meet the given predicate.]

Note that Stream.takeWhile does not return the item that fails to meet the predicate. It stops with the last item that meets the predicate.

Perhaps you have a sensor feeding you temperature readings once per minute:

import java.time.{LocalDateTime, Month}

case class Reading(
  temperature: Double, 
  timestamp: LocalDateTime)

val readings = Stream(
  Reading(89.5, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 0)),
  Reading(90.1, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 1)),
  Reading(89.9, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 2)),
  Reading(90.0, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 3)),
  Reading(90.1, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 4)),
  Reading(-1000.0, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 5)),
  Reading(-1000.0, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 6)),
  Reading(90.2, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 7))
)

Now to get all readings up until the thermometer returns the data-not-valid indicator, employ Stream.takeWhile:

val valid = readings takeWhile { _.temperature != -1000.0 }

// valid: scala.collection.immutable.Stream[Reading] = 
//   Stream(
//     Reading(89.5,2015-07-19T10:00), 
//     Reading(90.1,2015-07-19T10:01), 
//     Reading(89.9,2015-07-19T10:02), 
//     Reading(90.0,2015-07-19T10:03), 
//     Reading(90.1,2015-07-19T10:04)
//   )
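Note that the last reading (90.2) is valid, yet it is missing from the result, because takeWhile stops for good at the first failure. Here is a minimal sketch contrasting takeWhile with filter on a plain stream of numbers. The values are made up for illustration. (In Scala 2.13 and later, Stream is deprecated in favor of LazyList, which offers the same takeWhile method.)

```scala
// Hypothetical readings; -1000.0 stands in for the data-not-valid indicator.
val temps = Stream(89.5, 90.1, -1000.0, -1000.0, 90.2)

// takeWhile stops at the first invalid reading and never looks further.
val validPrefix = temps.takeWhile(_ != -1000.0).toList

// filter inspects every element and keeps all valid readings.
val allValid = temps.filter(_ != -1000.0).toList

println(validPrefix) // List(89.5, 90.1)
println(allValid)    // List(89.5, 90.1, 90.2)
```

Reach for takeWhile when everything after the first failure is suspect, and for filter when each item stands on its own.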

Finally, Stream.takeWhile doesn’t balk if it never reaches an element that fails to meet the predicate:

val all = Stream(1,3,4,7) takeWhile { _ < 10 }
// all: scala.collection.immutable.Stream[Int] = 
//   Stream(1, 3, 4, 7)

As with Stream.take, the other collection types (Array, List, Seq, and Vector among them) also define takeWhile.
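A quick sketch of takeWhile on a few strict collections (the sample values are arbitrary):

```scala
val lowNumbers  = List(1, 3, 4, 7, 2).takeWhile(_ < 5)
val shortWords  = Vector("a", "bb", "ccc", "d").takeWhile(_.length < 3)
val leadingEven = Array(2, 4, 6, 7, 8).takeWhile(_ % 2 == 0)

println(lowNumbers)         // List(1, 3, 4)
println(shortWords)         // Vector(a, bb)
println(leadingEven.toList) // List(2, 4, 6)
```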

Scala Saturday – The Stream.take Method

Sometimes you have a collection of items, and you only want the first so many items. You may not know or even care what the first n items are; you just know that you want the first n items. Stream.take can do that for you.

To Infinity … and Beyond!

OK, it’s unusual that you just don’t care what items you get from a sequence, but there are occasions: for instance, random numbers. You really don’t care what the number is. You just don’t want it to be the same number every time, and perhaps you want it to fall within a certain range. Other than that, you don’t care: you just want a number!

val seed = new java.util.Date().hashCode
val rand = new scala.util.Random(seed)
val someNum = rand.nextInt
// someNum: Int = 1717783198  (well, this time anyway)

Now, assume that you have an application that needs a steady stream of random numbers. (Maybe even the number of random numbers you want is itself random!) Wouldn’t it be great to turn your random number into a stream?

Scala streams are lazy data structures. That means that as you iterate over the stream, it produces the next item on demand. It may not even have calculated what the next item is until you ask for it. In fact, a stream can be infinite! How do you have an infinite stream? Well, back to the random number generator.

You can call rand.nextInt until the cows come home and just keep getting values. Well, the Stream companion object gives us a way to turn rand (or any function that calculates a value from an input) into an infinite stream—Stream.iterate:

val rands = Stream.iterate(rand.nextInt()) {
  _ => rand.nextInt()
}

Stream.iterate takes a starting value of type A and a function that takes a value of type A and returns another value of type A—an integer in this case. Now you can call Stream.take on rands, and get as many or as few values as you may want:

val taken = rands take 5
// taken: scala.collection.immutable.Stream[Int] = 
//   Stream(-2112282457,
//     -1552737819,
//     -745613730,
//     795338080,
//     -219200225)
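To watch the on-demand behavior in action, here is a small sketch that swaps the random generator for a deterministic step function and counts how often that function runs. The counter and the doubling function are assumptions made purely for illustration:

```scala
// Count how often the step function runs (the counter is only for illustration).
var stepCalls = 0
val powersOfTwo = Stream.iterate(1) { n =>
  stepCalls += 1
  n * 2
}

val callsBefore = stepCalls // nothing beyond the starting value exists yet
val firstThree = powersOfTwo.take(3).toList
val callsAfter = stepCalls  // only the elements actually consumed were computed

println(firstThree) // List(1, 2, 4)
println(s"step calls before: $callsBefore, after: $callsAfter")
```

Even though the stream is conceptually infinite, the step function runs only for the elements you actually pull out of it.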

Additionally, Scala’s #:: operator allows you to create the infinite stream of random numbers in another way:

def rands: Stream[Int] = rand.nextInt() #:: rands
val taken = rands take 5
// taken: scala.collection.immutable.Stream[Int] = 
//   Stream(-575330347,
//     631339635,
//     -911607409,
//     -614867864,
//     -1194280908)

In this case, you’re defining a function that calls itself recursively to provide the next element in the stream.

Use whichever approach you find the more natural.
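Because the right-hand side of #:: is evaluated lazily, a stream can even refer to itself. A classic sketch (not part of the random number example) is the Fibonacci sequence. Here lazy val keeps the elements already computed, whereas a def, like rands above, builds a fresh stream on every call:

```scala
// Each element is the sum of the two before it; the tail is computed on demand.
lazy val fibs: Stream[BigInt] =
  BigInt(0) #:: BigInt(1) #:: fibs.zip(fibs.tail).map { case (a, b) => a + b }

val firstTenFibs = fibs.take(10).toList
println(firstTenFibs) // List(0, 1, 1, 2, 3, 5, 8, 13, 21, 34)
```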

Going for the Gold

There are, of course, times when your data set is not infinite. You happen not to need all of the items, but only the first few. I mentioned earlier that you may not know or care what the items are. A better way of saying it is that you (or someone) have likely already done the work to prepare the data set: processed it, sorted it, whatever. Now that all that preprocessing is done, you want the first few items.

Perhaps you are writing software to determine the gold, silver, and bronze medal winners in the Olympic Games. You have a type representing each competitor:

case class Competitor(name: String, score: Double)

You have the list of competitors for the 2012 men’s 110-meter hurdles, and they are already sorted according to ranking:

val competitors = Stream(
  Competitor("Aries MERRITT", 12.92),
  Competitor("Jason RICHARDSON", 13.04),
  Competitor("Hansle PARCHMENT", 13.12),
  Competitor("Lawrence CLARKE", 13.39),
  Competitor("Ryan BRATHWAITE", 13.40),
  Competitor("Orlando ORTEGA", 13.43),
  Competitor("Lehann FOURIE", 13.53)
)

Now to get the medalists, take the first three records:

val medalists = competitors take 3
// medalists: scala.collection.immutable.Stream[Competitor] =
//   Stream(
//     Competitor(Aries MERRITT,12.92),
//     Competitor(Jason RICHARDSON,13.04),
//     Competitor(Hansle PARCHMENT,13.12) )

Take It to the Limit

What happens, though, when you have a stream with fewer items than you attempt to take?

val numbersOfTheCounting = Stream(1,2,3) take 5
numbersOfTheCounting foreach println
// 1
// 2
// 3

Well, that’s nice. You ask for five, so the stream gives you all it has. If that happens to be fewer than you asked for, hey, no worries.

I should note that Stream is not unique in having a take method. The strict, sequential collection types—Array, List, Seq, and Vector—all have a take method that functions the same as Stream.take does. Even Map has a take method.
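A brief sketch of take on some of those other collection types. The sample values are arbitrary, and the Map holds a single entry because the iteration order of a Map is not something to rely on in general:

```scala
val firstTwo   = List(1, 2, 3, 4).take(2)
val allOfThem  = Vector("a", "b", "c").take(5) // asks for more than exists
val justOne    = Array(10, 20, 30).take(1)
val oneMapping = Map("one" -> 1).take(1)

println(firstTwo)       // List(1, 2)
println(allOfThem)      // Vector(a, b, c)
println(justOne.toList) // List(10)
println(oneMapping)     // Map(one -> 1)
```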

Scala Saturday – The List.++ Method

Occasionally, you need to combine two lists (or arrays or vectors or sequences) into one. List.++ to the rescue! (Along with Array.++, and Vector.++, and Seq.++.)

A widow Carol has three daughters: Marcia, Jan, and Cindy.

val ladies = List("Carol", "Marcia", "Jan", "Cindy")

A widower Mike has three sons: Greg, Peter, and Bobby.

val fellas = List("Mike", "Greg", "Peter", "Bobby")

That lovely lady meets that fellow, and they know it is much more than a hunch. They marry and form a family:

val bunch = ladies ++ fellas
// bunch: List[String] = 
//   List(Carol, Marcia, Jan, Cindy,
//        Mike, Greg, Peter, Bobby)

Of course, as you probably have guessed, order matters. Let’s reverse the arguments:

val bunch2 = fellas ++ ladies
// bunch2: List[String] = 
//   List(Mike, Greg, Peter, Bobby,
//        Carol, Marcia, Jan, Cindy)

You can also use ++ to chain a series of lists together:

val hobbits = List("Frodo", "Sam", "Pippin", "Merry")
val men = List("Aragorn", "Boromir")
val dwarves = List("Gimli")
val elves = List("Legolas")
val maiar = List("Gandalf")
val fellowship = hobbits ++ men ++ dwarves ++
                 elves ++ maiar
// fellowship: List[String] = 
//   List(Frodo, Sam, Pippin, Merry, Aragorn, 
//        Boromir, Gimli, Legolas, Gandalf)

And there’s nothing special about ++ in this regard. Because Scala allows infix notation, you can similarly chain other operations together:

val fellowslip = 
  hobbits ++ men ++ dwarves ++
  elves ++ maiar filter {
    !_.startsWith("G")
  } map {
    _.toUpperCase
  }
// fellowslip: List[String] =
//   List(FRODO, SAM, PIPPIN, MERRY,
//        ARAGORN, BOROMIR, LEGOLAS)
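The chain reads left to right because symbolic operators such as ++ bind more tightly than alphanumeric method names such as filter and map. As a sketch, here is the same kind of expression written both ways, using a shortened set of lists:

```scala
val hobbits = List("Frodo", "Sam", "Pippin", "Merry")
val men = List("Aragorn", "Boromir")
val dwarves = List("Gimli")

// Infix form: ++ binds tighter than the alphanumeric names filter and map.
val infix = hobbits ++ men ++ dwarves filter { !_.startsWith("G") } map { _.toUpperCase }

// The same expression with explicit dots and parentheses.
val dotted = (hobbits ++ men ++ dwarves)
  .filter(name => !name.startsWith("G"))
  .map(_.toUpperCase)

println(infix == dotted) // true
```

When a chain grows long, the dotted form makes the grouping explicit at the cost of a few extra characters.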

Scala Saturday – Pattern Matching, Part 4: Extractors

This week we take a look under the hood, as it were, of pattern matching in Scala. An extractor is any object that defines an unapply() method that Scala’s match expressions can use to evaluate whether the input value is a match or not.

One reason case classes are so convenient for pattern matching is that they define an unapply() method along with the other handy tools Scala gives you when you define a case class. You don’t have to define a case class to get an unapply() method, though. You can define one yourself and enjoy the benefits.

Boolean Extractors

Extractors allow you to give a readable name to a case that effectively communicates the nature of the match, but hide some of the clutter that can threaten the readability of your code.

A classic example is determining whether an integer is even or odd. Now you could do that this way:

n % 2 match {
  case 0 => n -> "even"
  case _ => n -> "odd"
}

Now n % 2 is a simple expression, and its use common enough in computer science that most of us recognize right away, whenever we see it, “Oh, right, even or odd.” But there is just the slightest context switch between thinking mathematically and thinking conceptually, i.e., determining intent. And that context switch slows us down. A more complex expression can slow us down even more.

Compare the above (admittedly simple) match expression to what you can do with a couple of extractors. First, define the pair of extractors this way:

object Even {
  def unapply(n: Int) = n % 2 == 0
}
object Odd {
  def unapply(n: Int) = n % 2 != 0 // != 0 also catches negative odd numbers
}

This illustrates one way you can create an extractor: Define unapply() so that it returns a Boolean. A true value indicates a match, false a failure.

Now after defining the extractors, you can use them in a match expression like this:

def oddOrEven(n: Int) = {
  n match {
    case Even() => n -> "even"
    case Odd() => n -> "odd"
  }
}

(1 to 10) map oddOrEven foreach println

// (1,odd)
// (2,even)
// (3,odd)
// (4,even)
// (5,odd)
// (6,even)
// (7,odd)
// (8,even)
// (9,odd)
// (10,even)

Look at the difference in readability. There is no context switch. It reads much more smoothly than the original match expression. You can see that oddOrEven takes a value, n, and returns a result based on whether n is even or odd. You don’t have to leave the realm of the conceptual to think mathematically and then turn right back around to think conceptually again.

A point of emphasis: When building a match expression, you need to cover all the bases and define a return condition for every case. That is easy for the even/odd test: There are only two cases.

But what if there are several cases? Take the wavelengths of colors in the spectrum of visible light. Each color corresponds to a range of wavelengths:

Color Wavelength Ranges
Color Wavelength
Red 620–750 nm
Orange 590–620 nm
Yellow 570–590 nm
Green 495–570 nm
Blue 450–495 nm
Violet 380–450 nm

No problem, right? Just define a quick little set of extractors:

object Red {
  def unapply(λ: Int) = 620 <= λ && λ <= 750 // 750 nm itself still counts as red
}
object Orange {
  def unapply(λ: Int) = 590 <= λ && λ < 620
}
object Yellow {
  def unapply(λ: Int) = 570 <= λ && λ < 590
}
object Green {
  def unapply(λ: Int) = 495 <= λ && λ < 570
}
object Blue {
  def unapply(λ: Int) = 450 <= λ && λ < 495
}
object Violet {
  def unapply(λ: Int) = 380 <= λ && λ < 450
}

Then put those extractors to use in a function:

def colorOfLight(λ: Int) = {
  λ match {
    case Red() => s"$λ nm" -> "red"
    case Orange() => s"$λ nm" -> "orange"
    case Yellow() => s"$λ nm" -> "yellow"
    case Green() => s"$λ nm" -> "green"
    case Blue() => s"$λ nm" -> "blue"
    case Violet() => s"$λ nm" -> "violet"
  }
}

Now this line should run like a charm, right?

List(800,700,600,580,500,475,400,350)
  .map(colorOfLight)
  .foreach(println)

// Exception in thread "main" scala.MatchError: 
//   800 (of class java.lang.Integer)
// ...

Whoa, what happened? There is radiation outside the visible spectrum; λ could be greater than 750 nm (as it is in this case of 800 nm) or less than 380 nm. You therefore need a catch-all case to cover the values that are outside the explicit cases:

def colorOfLight(λ: Int) = {
  λ match {
    case Red() => s"$λ nm" -> "red"
    case Orange() => s"$λ nm" -> "orange"
    case Yellow() => s"$λ nm" -> "yellow"
    case Green() => s"$λ nm" -> "green"
    case Blue() => s"$λ nm" -> "blue"
    case Violet() => s"$λ nm" -> "violet"
    case _ => s"$λ nm" -> "invisible"
  }
}

Now your little three-liner really does run like a charm:

List(800,700,600,580,500,475,400,350)
  .map(colorOfLight)
  .foreach(println)

// (800 nm,invisible)
// (700 nm,red)
// (600 nm,orange)
// (580 nm,yellow)
// (500 nm,green)
// (475 nm,blue)
// (400 nm,violet)
// (350 nm,invisible)

Option Extractors

Another way to indicate a match with an extractor is to return an Option. If the input meets the criteria sought, you return the case name in a Some. If not, you return a None. Furthermore, you can capture the matching value or a collection of values based on calculations the extractor performs on the input value.

If you just need to capture one value, return an Option[A]. You could modify the even/odd case above so that you capture the input variable in another variable name:

object Even {
  def unapply(n: Int) = if (n % 2 == 0) Option(n) else None
}
object Odd {
  def unapply(n: Int) = if (n % 2 != 0) Option(n) else None
}

def oddOrEven(n: Int) = {
  n match {
    case Even(m) => m -> "even"
    case Odd(m) => m -> "odd"
  }
}

(1 to 10) map oddOrEven foreach println

// (1,odd)
// (2,even)
// (3,odd)
// (4,even)
// (5,odd)
// (6,even)
// (7,odd)
// (8,even)
// (9,odd)
// (10,even)
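The even/odd extractors hand the input back unchanged. An Option extractor can just as well return a value it computes from the input. A minimal sketch, assuming a hypothetical IntString extractor that matches strings which parse as integers:

```scala
import scala.util.Try

// Hypothetical extractor: matches strings that parse as an Int
// and yields the parsed value.
object IntString {
  def unapply(s: String): Option[Int] = Try(s.trim.toInt).toOption
}

def classify(input: String): String = input match {
  case IntString(n) if n < 0 => s"negative number $n"
  case IntString(n)          => s"number $n"
  case _                     => s"not a number: $input"
}

println(classify("42"))    // number 42
println(classify(" -7 "))  // negative number -7
println(classify("seven")) // not a number: seven
```

The match expression never sees the parsing logic; it only sees a name that says what kind of input it is dealing with.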

To capture multiple values, return an Option that contains a tuple of the number of values you want to capture.

Perhaps you are a clerk in a department store. You recommend that customers who are at least 6′ (72 inches) tall and have at least a 40-inch waist go to the Big & Tall section. Others you greet according to their proportions.

First, define a Measurements type with height and waist properties:

class Measurements(val height: Int, val waist: Int)

Then define three extractors:

  1. one to detect whether a customer meets the “big” criterion,
  2. a second to detect whether he meets the “tall” criterion, and
  3. a third to detect whether he meets both criteria.

object Big {
  def unapply(m: Measurements) =
    if (m.waist >= 40) Some(m.waist) else None
}
object Tall {
  def unapply(m: Measurements) =
    if (m.height >= 72) Some(m.height) else None
}
object BigAndTall {
  def unapply(m: Measurements) =
    (Big.unapply(m), Tall.unapply(m)) match {
      case (Some(w), Some(h)) => Some((w, h))
      case _ => None
    }
}

Now that you have the extractors defined, you can use them in a match expression that extracts the waist and height on a successful match:

def sizeUp(m: Measurements) = {
  m match {
    case BigAndTall(w,h) =>
      s"$w-inch waist and $h inches tall: " +
        "Let me show you to our big & tall section"
    case Tall(h) => s"$h inches tall: How's the weather up there?"
    case Big(w) => s"$w-inch waist: Big fella, ain'tcha?"
    case _ => "How may I help you?"
  }
}

Now you run some customers through the sizeUp function:

  val me = new Measurements(76, 36)
  val shrimp = new Measurements(58, 28)
  val hoss = new Measurements(80, 46)
  val tubby = new Measurements(63, 42)

  List(me, shrimp, hoss, tubby)
    .map(sizeUp)
    .foreach(println)
// 76 inches tall: How's the weather up there?
// How may I help you?
// 46-inch waist and 80 inches tall: 
//   Let me show you to our big & tall section
// 42-inch waist: Big fella, ain'tcha?

Sequence Extractors

Finally, you can extract a variable number of values and match only on elements that meet your conditions while ignoring the rest. Sequence extractors define an unapplySeq() method rather than unapply(). The unapplySeq() method must return an Option[Seq[A]].

You could build an extractor that gets the factors of an integer:

import scala.annotation.tailrec

object Factors {
  def unapplySeq(n: Int): Option[Seq[Int]] = {
    @tailrec
    def go(factors: List[Int], candidates: Seq[Int]): List[Int] = {
      if (candidates.isEmpty) {
        factors
      } else {
        val head = candidates.head
        val tail = candidates.tail
        if (n % head == 0) {
          go(head :: factors, tail)
        } else {
          go(factors, tail.filter(_ % head != 0))
        }
      }
    }

    val factors = n :: go(List(1), (2 to (n/2)).toSeq)
    if (factors.isEmpty) None else Some(factors.reverse)
  }
}

I won’t explain the (rather brute force) factorization method above. Suffice it to say that it returns all the factors (divisors) of n in ascending order, not just the prime ones. For example, for 15, it returns {1,3,5,15}. Now let’s say that we want to match on numbers that are divisible by three, but not two. That means we’re looking for factor sets that follow this pattern: {1,3,…}. Here is how you use Factors to match that pattern:

def divBy3Not2(n: Int) = n match {
  case Factors(1,3,_*) =>
    s"$n: Divisible by three, but not two"
  case _ => n.toString
}

The _* wildcard tells Scala that you don’t really care what follows. If the first two elements match, then it’s a match. Now you can put divBy3Not2 to use:

List(2,6,9,10,15,54)
  .map(divBy3Not2)
  .foreach(println)
// 2
// 6
// 9: Divisible by three, but not two
// 10
// 15: Divisible by three, but not two
// 54
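Sequence extractors are not limited to numbers. As another sketch, assume a hypothetical Words extractor that splits a line into whitespace-separated words, so that a pattern can pin down the first word or two and ignore the rest:

```scala
// Hypothetical extractor: splits a line into whitespace-separated words.
object Words {
  def unapplySeq(line: String): Option[Seq[String]] =
    Some(line.trim.split("\\s+").toList)
}

def interpret(line: String): String = line match {
  case Words("add", item, _*) => s"adding $item"
  case Words(single)          => s"one word: $single"
  case Words(first, _*)       => s"starts with $first"
  case _                      => "nothing to interpret"
}

println(interpret("add milk now")) // adding milk
println(interpret("hello"))        // one word: hello
println(interpret("remove milk"))  // starts with remove
```

A pattern without _* must match the number of extracted elements exactly, which is why Words(single) matches only one-word lines.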

Scala Saturday – Pattern Matching, Part 3: More Case Classes

Another tool in the pattern matching utility belt is disjoint unions. A disjoint union is a type that contains a small number of union cases. Each union case is treated as the same type as the disjoint union, but each case may comprise a different set of properties. You can build disjoint unions with Scala traits and case classes.

Rockin’ Disjoint (Unions)

You could use a disjoint union to represent the different cast type options when opening a socket to send UDP packets: broadcast, unicast, and multicast. If you are opening a socket to broadcast, you only need a port number. If you are opening a socket to unicast, you also need the IP address of the machine you are unicasting to. Finally, if you are opening a socket to multicast, you need an IP address (the multicast group), a port number, and optionally the ID of a network interface card if you want to target a specific network interface.

import java.net.InetAddress

sealed trait SocketMeta
case class Broadcast(port: Int) extends SocketMeta
case class Unicast(addr: InetAddress, port: Int)
  extends SocketMeta
case class Multicast(addr: InetAddress, port: Int,
                     nicId: Option[String])
  extends SocketMeta

The SocketMeta trait serves as your disjoint union type. Define each union case as a case class that mixes in the SocketMeta trait. As you can see, each union case can have any number of properties, according to the familiar case class constructor syntax. Limit the number of union cases by marking SocketMeta as sealed. That way, all SocketMeta union cases must reside in this file.

Now it’s easy to write a function to build a socket based on the meta information:

def buildSocket(meta: SocketMeta) = {
  meta match {
    case Broadcast(port) =>
      // configure for broadcast
    case Unicast(ip, port) =>
      // configure for unicast
    case Multicast(ip, port, nic) =>
      // configure for multicast
  }
}

val s = buildSocket(Broadcast(5150))
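The case bodies above are only comments. For something you can run, here is a sketch that follows the same shape but merely describes the configuration instead of opening a socket. The describeSocket function, the addresses, and the interface name eth0 are all assumptions made for illustration:

```scala
import java.net.InetAddress

sealed trait SocketMeta
case class Broadcast(port: Int) extends SocketMeta
case class Unicast(addr: InetAddress, port: Int) extends SocketMeta
case class Multicast(addr: InetAddress, port: Int, nicId: Option[String])
  extends SocketMeta

// Hypothetical stand-in for buildSocket: describes the configuration
// instead of opening a real socket.
def describeSocket(meta: SocketMeta): String = meta match {
  case Broadcast(port) =>
    s"broadcast on port $port"
  case Unicast(ip, port) =>
    s"unicast to ${ip.getHostAddress}:$port"
  case Multicast(ip, port, Some(nic)) =>
    s"multicast to ${ip.getHostAddress}:$port via $nic"
  case Multicast(ip, port, None) =>
    s"multicast to ${ip.getHostAddress}:$port on the default interface"
}

// Literal IP addresses are parsed directly; no name lookup is needed.
val group = InetAddress.getByName("239.1.2.3")
println(describeSocket(Broadcast(5150)))
// broadcast on port 5150
println(describeSocket(Multicast(group, 5150, Some("eth0"))))
// multicast to 239.1.2.3:5150 via eth0
```

Note how the nested Some(nic) and None patterns split the multicast case in two without any extra if-else logic.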

Disjoint unions are good for times when the number of cases is small, and the number of properties in each case is small. For instance, you have likely worked with a particular disjoint union type many times. Option is essentially this:

sealed trait Option[+A]
case object None extends Option[Nothing]
case class Some[+A](get: A) extends Option[A]

(I know I have not covered case objects explicitly, but if you get the concept of a case class, case objects are pretty straightforward, too. They are singleton objects that you can use in pattern matching.)

Option is either some value or nothing at all; that’s it. There is no third alternative. It is very easy to cover all the bases in a fairly short block of code.

Option also illustrates one more thing about disjoint unions: union cases need not have any properties at all. None stands alone; it requires no properties.

Typing Lesson

It turns out that disjoint unions are a specific example of a more general kind of type: an algebraic data type. An algebraic data type is a composite type; that is, it is a combination of other types. An algebraic data type may comprise an infinite set of types or a finite set.

SocketMeta and Option both clearly consist of a finite number of types—a small finite number at that. As mentioned already, that’s what makes them so well suited to match expressions.
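Union cases may also refer back to the union type itself, which lets a small, finite set of cases describe values of any size. A minimal sketch, using a hypothetical arithmetic expression tree:

```scala
// A small expression tree (hypothetical example): each case is one shape of Expr.
sealed trait Expr
case class Num(value: Double) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
case class Mul(left: Expr, right: Expr) extends Expr

// One match case per union case covers every possible Expr.
def eval(e: Expr): Double = e match {
  case Num(v)    => v
  case Add(l, r) => eval(l) + eval(r)
  case Mul(l, r) => eval(l) * eval(r)
}

val expr = Add(Num(2), Mul(Num(3), Num(4))) // 2 + 3 * 4
println(eval(expr)) // 14.0
```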

Scala Saturday – Pattern Matching, Part 2: Case Classes

In addition to regular classes, Scala also provides case classes for the purpose of pattern matching. They offer a few benefits that standard classes do not.

Getting Personal

Defining a case class is stupid easy:

case class Person(first: String, last: String)

Person has two fields, first and last, both of type String. Notice that the parameters don’t need a val keyword.

Defining a Person variable is even easier than a standard class because you can omit the new keyword:

val me = Person("Brad", "Collins")

Pattern matching on case class instances is pretty straightforward, too:

def greet(p : Person) = p match {
  case Person("Brad", "Collins") => "It's me!"
  case Person("Brad", _) => "Nice name"
  case Person(_, "Collins") => "Greetings, kinfolk"
  case _ => "Hello, stranger"
}

val me = Person("Brad", "Collins")
val kin = Person("Shad", "Collins")
val namesake = Person("Brad", "Rollins")
val stranger = Person("Ezra", "Shemiah")

val meGreeting = greet(me)
// meGreeting: String = It's me!

val kinGreeting = greet(kin)
// kinGreeting: String = Greetings, kinfolk

val namesakeGreeting = greet(namesake)
// namesakeGreeting: String = Nice name

val strangerGreeting = greet(stranger)
// strangerGreeting: String = Hello, stranger

To match on a Person object with specific first and last fields, specify both of them, as in the first match case above. To match on just one field, either first or last, specify the field to match, and use the wildcard pattern, the underscore (_), for the field you don’t care about, as in the second and third match cases above. If Person had more than two fields, you could match any subset of them by specifying the field values you need to be exact and using the wildcard pattern for the fields you don’t.

One other thing to note here: Order matters. What if you had defined greet this way?

def greet(p : Person) = p match {
  case Person("Brad", _) => "Nice name"
  case Person(_, "Collins") => "Greetings, kinfolk"
  case Person("Brad", "Collins") => "It's me!" // OOPS!
  case _ => "Hello, stranger"
}

The third case (line 4 above) would never match because Person("Brad", "Collins") would always match the first case. So, pay attention out there.

Personal Improvement

As with regular classes, we can add member functions and properties to the case class itself and also put some things in a companion object:

case class Person(first: String, last: String) {
  def swap = Person(last, first)
}
object Person {
  val anonymous = Person("", "")
}

val me = Person("Brad", "Collins")
// me: Person = Person(Brad,Collins)

val swapped = me.swap
// swapped: Person = Person(Collins,Brad)

val johnDoe = Person.anonymous
// johnDoe: Person = Person(,)

This example also demonstrates one of the extras you get with case classes: A toString() implementation. With a regular class, if you want toString() to show you something more descriptive than a type name, you have to override toString() yourself.

Case classes also throw in implementations of equals(o) and hashCode() for no charge:

val me = Person("Brad", "Collins")
val swapped = me.swap
val myself = Person("Brad", "Collins")

val iAm = me == myself
// iAm: Boolean = true

val iAint = me == swapped
// iAint: Boolean = false

val meHash = me.hashCode()
// meHash: Int = 777586888

val myselfHash = myself.hashCode()
// myselfHash: Int = 777586888

val swappedHash = swapped.hashCode()
// swappedHash: Int = 1042444723

Notice how me and myself are equal though they are different object instances, and they also have the same hash code.
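That value-based equality is what makes case classes behave well in sets and as map keys. A short sketch (the age stored in the map is made up):

```scala
case class Person(first: String, last: String)

// Two separately constructed but equal instances collapse into one Set element.
val people = Set(Person("Brad", "Collins"), Person("Brad", "Collins"))

// A fresh, equal instance finds the entry stored under the original key.
val ages = Map(Person("Brad", "Collins") -> 42) // the age is made up
val found = ages.get(Person("Brad", "Collins"))

println(people.size) // 1
println(found)       // Some(42)
```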

Finally, if you have a case class instance and need a new instance with, say, only one or two field values that are different, case classes throw in a copy() method to do just that, so that you don’t have to set every field value explicitly:

val me = Person("Brad", "Collins")
val kin = me.copy(first = "Shad")
// kin: Person = Person(Shad,Collins)

val namesake = me.copy(last = "Rollins")
// namesake: Person = Person(Brad,Rollins)

Personable Companions

The way case classes perform some of their magic is that Scala defines a companion object for each case class behind the scenes.

First, why don’t you need the new keyword when instantiating case classes? Because the companion object has an apply() method that takes the parameters defined in the case class constructor:

case class Person(first: String, last: String)
// Notional representation of what the
// compiler provides:
//
// object Person {
//   def apply(first: String, last: String) =
//     new Person(first, last)
// 
//   def unapply(p: Person): Option[(String, String)] =
//     Some((p.first, p.last))
// }

val me = Person("Brad", "Collins")
// Actually calls ...
// Person.apply("Brad", "Collins")

That unapply() method is how Scala destructures case classes when pattern matching. When you write this:

p match {
  case Person(f,l) => ...
}

… Scala uses Person.unapply(p) to populate the f and l variables.
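To see that there is nothing magical going on, here is a sketch of a regular (non-case) class with a hand-written companion that supplies apply() and unapply() itself. The Point class and the quadrant function are illustrative names, not anything the compiler generates:

```scala
// A regular (non-case) class with a hand-written companion.
class Point(val x: Int, val y: Int)

object Point {
  // Lets you write Point(3, 4) without the new keyword.
  def apply(x: Int, y: Int): Point = new Point(x, y)

  // Lets match expressions destructure a Point into its two coordinates.
  def unapply(p: Point): Option[(Int, Int)] = Some((p.x, p.y))
}

def quadrant(p: Point): String = p match {
  case Point(0, 0)                   => "origin"
  case Point(x, y) if x > 0 && y > 0 => s"first quadrant ($x, $y)"
  case Point(x, y)                   => s"elsewhere ($x, $y)"
}

println(quadrant(Point(0, 0)))  // origin
println(quadrant(Point(3, 4)))  // first quadrant (3, 4)
println(quadrant(Point(-1, 2))) // elsewhere (-1, 2)
```

Unlike a case class, this Point gets no toString(), equals(), hashCode(), or copy() for free; only the construction and matching conveniences are reproduced.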

One final note: When we added the anonymous field to the Person companion object above, the compiler was nice enough to merge that with the one it generated for us instead of overwriting the generated one with ours.

Watch out, though: if you attempt to type that out into the REPL, you do overwrite the compiler-generated Person companion object. (Hint: Use the REPL’s :paste command to get around that.)

Scala Saturday – Pattern Matching, Part 1

A terribly useful technique in functional programming is pattern matching. Pattern matching is simply a form of conditional logic, like an if-else expression or (in other languages) a switch statement, but quite a bit more powerful and flexible. With pattern matching, you can perform matches and take action based on simple values, such as integers, but also on complex types and the values of their members. Here are some examples using the simpler types; future posts will illustrate how to match on more complex types.

On the Whole (Numbers)

Pattern matching on integers is very straightforward. Just use literals:

def getResponse(talents: Int) = talents match {
  case 5 => "Here are your five talents plus five more"
  case 2 => "Here are your two talents plus two more"
  case 1 => "Here is your one talent, which I hid"
  case _ => "Uh, wrong story"
}

val response = getResponse(5)
// response: String = 
//   Here are your five talents plus five more

val apocryphal = getResponse(42)
// apocryphal: String = Uh, wrong story

Of course, the set of all possible integers is very large. Our example here only covers three specific values: 1, 2, and 5. To cover every other case that you don’t specify explicitly, use the wildcard pattern _, the underscore. That’s how we could handle 42. Or 103. Or −7. Or any other integer.

Getting to the (Floating) Point

Floating point numbers are difficult to pin down because of the margin of error, so you don’t typically match on a specific number. Usually you match on ranges. Pattern matching allows for that, too:

def getState(temp: Double) = temp match {
  case x if x <= 32.0 => "solid"
  case x if x >= 212.0 => "gas"
  case _ => "liquid"
}

val atRoomTemp = getState(70.0)
// atRoomTemp: String = liquid

val atSouthPole = getState(-70.6)
// atSouthPole: String = solid

Putting a condition on a match with the if keyword like that is called a guard. Guards are simply Boolean expressions. You could alternatively write the “liquid” step in getState with a compound Boolean expression:

def getState2(temp: Double) = temp match {
  case x if 32.0 < x && x < 212.0 => "liquid"
  case x if x >= 212.0 => "gas"
  case _ => "solid"
}

val onHotSummerDay = getState2(98.5)
// onHotSummerDay: String = liquid

In one more rewrite of getState, note that it is possible to use variables, not just literals, in guards:

def getState3(temp: Double) = {
  val freezingPoint = 32.0
  val boilingPoint = 212.0
  temp match {
    case x if x <= freezingPoint => "solid"
    case x if x >= boilingPoint => "gas"
    case _ => "liquid"
  }
}

val inDeathValley = getState3(134.0)
// inDeathValley: String = liquid

Stringly Typed Interfaces

Perhaps your application executes commands, and those commands are specified by names, that is, strings. (Some folks jokingly call them “stringly typed interfaces.”) Pattern matching is perfect for this task:

def execute(command: String, id: Int, value: String = "") =
  command match {
    case "add" => s"Added ${id}: ${value}"
    case "remove" => s"Removed ${id}"
    case "update" => s"Updated ${id}: ${value}"
    case _ => s"Illegal command: ${command}"
  }

val added = execute("add", 42, "foo")
// added: String = Added 42: foo

val updated = execute("update", 42, "bar")
// updated: String = Updated 42: bar

val removed = execute("remove", 42)
// removed: String = Removed 42

val wowbanged = execute("wowbang", 73)
// wowbanged: String = Illegal command: wowbang

Perhaps you want to allow some flexibility in your command names. For example, another name for “add” could be “create.” Pattern matching expressions allow you to stack multiple conditions for which you want to take the same action using a pipe (|):

def execute2(command: String, id: Int, value: String = "") =
  command match {
    case "add" | "create" => s"Added ${id}: ${value}"
    case "remove" | "delete" => s"Removed ${id}"
    case "update" | "change" => s"Updated ${id}: ${value}"
    case _ => s"Illegal command: ${command}"
  }

val created = execute2("create", 84, "baz")
// created: String = Added 84: baz

Here Are Your Options

Perhaps you have a user variable that is an Option[String]. If the user is logged in, user is a Some; otherwise the user is an unauthenticated guest. You’d like to generate a greeting based on whether the user is logged in or not:

def greet(user: Option[String]) = user match {
  case Some(name) => s"Welcome back, ${name}!"
  case None => "Hello, dear guest! Please sign in!"
}

val personal = greet(Some("bcollins"))
// personal: String = Welcome back, bcollins!

val generic = greet(None)
// generic: String = 
//   Hello, dear guest! Please sign in!

Notice how the match expression can, in the case of the Some, unpack the value for you and assign it to a variable, name in this case. You don’t have to do it yourself.

Now given that Option is a binary choice, you may be tempted to think that an if-else block is probably a better, um, option. In some simple cases it may be, but in this case, you want to generate a message that differs by more than just the username. Look at what it takes to get the same results with an if-else block as the match expression above:

def greet2(user: Option[String]) =
  if (user.isDefined) {
    val name = user.get
    s"Welcome back, ${name}!"
  } else {
    "Hello, dear guest! Please sign in!"
  }

Now that’s not too bad, but compare that to the original version that uses the match expression. The match expression is just cleaner: It is very easy to see what the conditions are, and you don’t have to clutter the code by unpacking the Some value yourself.
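
For completeness, a third way to get the same greeting is Option.fold, which takes the result for the None case first and a function for the Some case second. This is only a sketch; greet3 is a made-up name, and whether it reads better than the match expression is a matter of taste:

```scala
// greet3 is a hypothetical name used only for this illustration.
def greet3(user: Option[String]): String =
  user.fold("Hello, dear guest! Please sign in!") { name =>
    // Runs only when user is a Some; name is the unpacked value.
    s"Welcome back, ${name}!"
  }

val personal3 = greet3(Some("bcollins"))
// personal3: String = Welcome back, bcollins!

val generic3 = greet3(None)
// generic3: String = Hello, dear guest! Please sign in!
```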

Bearing with the Tuples of the Week

Pattern matching expressions can unpack tuples so that you can match on all values or individual values:

def getProducer(chars: (String, String)) =
  chars match {
    case ("Tom", "Jerry") => "Hanna-Barbera"
    case ("Bugs", "Daffy") => "Warner Brothers"
    case ("Mickey", _) => "Disney"
    case (x, "Buzz") => s"Pixar with ${x} and Buzz"
    case _ => "other"
  }

val prod = getProducer("Tom", "Jerry")
// prod: String = Hanna-Barbera

val prod2 = getProducer("Mickey", "Minnie")
// prod2: String = Disney

val prod3 = getProducer("Mickey", "Donald")
// prod3: String = Disney

val prod4 = getProducer("Woody", "Buzz")
// prod4: String = Pixar with Woody and Buzz

val prod5 = getProducer("Andy", "Buzz")
// prod5: String = Pixar with Andy and Buzz

val prod6 = getProducer("Ren", "Stimpy")
// prod6: String = other

As you can see in the “Disney” line, you don’t care what the second element of the tuple is, so you throw it away with the wildcard pattern. On the other hand, on the “Pixar …” line, you want to capture the first element in the matched tuple and use it in the result. So you just give it a variable name, and the compiler assigns the value to the variable for you, just like with Some(name) above.

What’s Your Type?

Finally, pattern matching expressions can match even on different types and take action based on the type of the input:

def report(guests: Any) = guests match {
  case guest: String =>
    s"Our guest: ${guest}"
  case all: Array[String] =>
    s"Our guests: ${all.mkString(", ")}"
  case count: Int =>
    s"We have ${count} guests"
  case _ => "Huh?"
}

val one = report("Brad")
// one: String = Our guest: Brad

val many = report(Array("Me", "Myself", "I"))
// many: String = Our guests: Me, Myself, I

val count = report(13)
// count: String = We have 13 guests

val stumped = report(3.14)
// stumped: String = Huh?

As you can see, you can choose from any number of types to match on and take action accordingly.
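
One caveat: the Array[String] case works because the JVM keeps an array’s element type at runtime. Generic collections such as List have their type parameter erased, so a pattern like "case all: List[String]" compiles with an unchecked warning and matches a List of anything. A minimal sketch of one way around that, assuming you are happy to check the elements yourself (report2 is a hypothetical variant of report):

```scala
// report2 is a hypothetical variant of report that accepts lists.
def report2(guests: Any): String = guests match {
  case guest: String =>
    s"Our guest: ${guest}"
  // List[_] matches any List; the guard checks the elements at runtime.
  case all: List[_] if all.forall(_.isInstanceOf[String]) =>
    s"Our guests: ${all.mkString(", ")}"
  case _ => "Huh?"
}

val listed = report2(List("Me", "Myself", "I"))
// listed: String = Our guests: Me, Myself, I

val mixed = report2(List(1, 2, 3))
// mixed: String = Huh?
```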

Scala Saturday – The Set.diff Method

Sometimes you have two sets of information, and you want to know what is in one set that the other set lacks. You want to perform a set difference operation.

What’s the Difference?

Given two sets, A and B, the difference between A and B—the items in A that are not in B—is written as follows:

A \ B

Likewise, the difference between B and A (the items in B that are not in A) is written the other way around:

B \ A

Set operations are frequently easiest to convey with a diagram:

An illustration of two sets, A and B, represented as two filled circles that are slightly overlapping. The difference (A \ B) is the part of set A not overlapping set B. The difference (B \ A) is the part of set B not overlapping set A.
The Difference in Two Sets A and B

It’s worth noting that a set difference operation is like arithmetic subtraction in that it matters which operand comes first. That is …

A \ B ≠ B \ A

White & nerdy mathematicians say that the set difference operation does not commute.
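
You can check this quickly in the REPL with two small sets and Set.diff, the method this post is about. The sets a and b below are made up for illustration:

```scala
val a = Set(1, 2, 3, 4)
val b = Set(3, 4, 5)

// Items in a that are not in b: 1 and 2
val aMinusB = a.diff(b)

// Items in b that are not in a: just 5
val bMinusA = b.diff(a)

// Swapping the operands gives a different set.
val same = aMinusB == bMinusA
// same: Boolean = false
```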

Always Never the Same

Continuing with my penchant for using band lineups as examples, put the original lineup of the band Kansas in one set and the current lineup in another:

val original = Set(
  "Walsh", "Livgren", "Williams",
  "Steinhardt", "Hope", "Ehart")
val current = Set(
  "Platt", "Manion", "Williams",
  "Ragsdale", "Greer", "Ehart")

To get the original members no longer with the band, use the Set.diff method to take the difference between the original lineup and the current:

val departed = original.diff(current)
// departed: scala.collection.immutable.Set[String] = 
//   Set(Walsh, Steinhardt, Livgren, Hope)

To get the current members who were not in the original lineup, do the opposite and take the difference between the current lineup and the original:

val noobs = current.diff(original)
// noobs: scala.collection.immutable.Set[String] = 
//   Set(Platt, Ragsdale, Manion, Greer)

How ’bout that! Simple enough, I suppose, but seeing as set difference is a whole lot like arithmetic subtraction, wouldn’t it be really nice if we could express our set difference operations just like we do arithmetic subtraction?

A − B

Good news: You can! Well, almost. Scala defines a -- operator so that we can write more mathematical-looking code:

val departed = original -- current
val noobs = current -- original
// departed: scala.collection.immutable.Set[String] = 
//   Set(Walsh, Steinhardt, Livgren, Hope)
// noobs: scala.collection.immutable.Set[String] = 
//   Set(Platt, Ragsdale, Manion, Greer)

Incidentally, Scala does also define a minus operator, but it is for removing one item at a time, not a set of items. However, with it you can write code like this:

val noRhythm = current - "Ehart" - "Greer"
// noRhythm: scala.collection.immutable.Set[String] = 
//   Set(Platt, Ragsdale, Manion, Williams)
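
One more detail that may come in handy: while Set.diff expects another set, the -- operator accepts other collections too, such as a List. A short sketch (rhythmSection and noRhythm2 are names invented for this example):

```scala
val current = Set(
  "Platt", "Manion", "Williams",
  "Ragsdale", "Greer", "Ehart")

// The right-hand side is a List here, not a Set.
val rhythmSection = List("Ehart", "Greer")
val noRhythm2 = current -- rhythmSection

// Same result as removing the two members one at a time.
val sameResult = noRhythm2 == (current - "Ehart" - "Greer")
// sameResult: Boolean = true
```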

Scala Saturday – The Option Type, Part 3

One more word on the Option type. While Option allows you a more type-safe alternative to nulls, it does not itself handle nulls. In other words, this …

val s = Some(null)

… gets you a Some containing a null. That doesn’t help so much if our aim is to get away from null, not kick it down the road.
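
To see it for yourself, compare Some(null) with Option(null). The Option companion’s apply method performs the null check that Some skips. A quick sketch (the value names are just for illustration):

```scala
val nothing: String = null

// Some wraps whatever it is given, null included.
val wrapped = Some(nothing)
// Option(...) checks for null and yields None.
val checked = Option(nothing)

val wrappedDefined = wrapped.isDefined
// wrappedDefined: Boolean = true

val checkedDefined = checked.isDefined
// checkedDefined: Boolean = false
```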

The Java Way

In version 8, Java introduced its own Optional type, no doubt inspired by the Option and Maybe types available in the functional languages. It does handle nulls. Take the following classes as examples:

public class User {
  private String username;
  private String name;
  
  public User(String username) { this(username, null); }
  public User(String username, String name) {
    this.username = username;
    this.name = name;
  }
  
  public String getUsername() { return username; }
  public String getName() { return name; }
}

public class Session {
  private User user;
  
  public Session() { this(null); }
  public Session(User u) { this.user = u; }
  
  public User getUser() { return user; }
}

public class Application {
  private Session session;

  public Application() { this(null); }
  public Application(Session s) { this.session = s; }
  
  public Session getSession() { return session; }
}

Because Optional handles null in its map operation, this works:

Application app = new Application(new Session());
String name = Optional.of(app)
                .map(Application::getSession)
                .map(Session::getUser)
                .map(User::getName)
                .orElse("n/a");
// name = "n/a"

In our example, the Session has a null User property. But that’s OK, because the map(Session::getUser) call on line 4 takes that null and converts it to a None, or more accurately, an empty Optional (Optional.empty()) in Java parlance.

Back in Scala Land

We cannot do quite the same thing in Scala. Now we, as enlightened Scala developers, would never code up something that uses nulls all over the place like the Java classes above. But the Java world is more comfortable with nulls, so let’s say that you are working with a library that contains those Java classes. You pull the JAR into your Scala project and write something like this:

val app = new Application(new Session())
val name = Option(app)
             .map(_.getSession)
             .map(_.getUser)
             .map(_.getName)
             .getOrElse("n/a")
// NullPointerException!!!

But that breaks down pretty quickly. On line 4, _.getUser returns null, and Option.map in Scala does not convert a null to a None; it wraps the null in a Some. The next map then calls getName on that null and throws the exception. So then, what do we do?

Well, you can instantiate an Option by applying the Option object. Option() does take an object reference and return a None if it is a null, or a Some if it is not. Also, remember that Option.flatMap takes an operation that returns an Option and flattens it so that it does not return a nested Option, but just an Option. Therefore, you can change your code to this:

val app = new Application(new Session())
val name = Option(app)
             .flatMap(a => Option(a.getSession))
             .flatMap(s => Option(s.getUser))
             .flatMap(u => Option(u.getName))
             .getOrElse("n/a")
// name: String = n/a

It’s a bit more headache than you’d like, but that’s what you get for using flaky Java code, right?
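
If the chain of flatMap calls feels noisy, a for-comprehension is another way to spell the same thing, since Scala rewrites it into flatMap and map calls. The sketch below is an illustration only: the three classes are minimal Scala stand-ins for the Java classes above (assumed to have the same nullable getters) so that the snippet runs without the JAR, and displayName is a hypothetical helper:

```scala
// Stand-ins for the Java classes, with the same nullable getters.
class User(username: String, name: String = null) {
  def getUsername: String = username
  def getName: String = name
}
class Session(user: User = null) {
  def getUser: User = user
}
class Application(session: Session = null) {
  def getSession: Session = session
}

// displayName is a hypothetical helper, not part of the library.
def displayName(app: Application): String =
  (for {
    a <- Option(app)            // None if app is null
    s <- Option(a.getSession)   // None if there is no session
    u <- Option(s.getUser)      // None if nobody is signed in
    n <- Option(u.getName)      // None if the user has no name
  } yield n).getOrElse("n/a")

val name = displayName(new Application(new Session()))
// name: String = n/a
```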

Scala Saturday – The Option Type, Part 2

Last week we introduced the Option type as a way of representing when a function may or may not return a value.

A Collection of One

Another way to think about an Option is as a collection that contains no more than one element. Consequently, Option provides some collection-like semantics, e.g., fold, map, exists, and filter. This allows us to chain operations together without having to check at each step whether the value is Some or None. We can put that off until the end of the sequence of operations and only do the check once. That way, the algorithm is more readable; it’s not cluttered with a bunch of if-else noise.
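
Here is a quick sketch of those four operations on a Some and on a None. The values are made up purely for illustration:

```scala
val some: Option[Int] = Some(42)
val none: Option[Int] = None

// map transforms the value if there is one.
val doubled = some.map(_ * 2)        // Some(84)
val doubledNone = none.map(_ * 2)    // None

// filter keeps the value only if it passes the predicate.
val kept = some.filter(_ > 40)       // Some(42)
val dropped = some.filter(_ > 50)    // None

// exists is false for None, whatever the predicate.
val hasEven = some.exists(_ % 2 == 0)      // true
val noneHasEven = none.exists(_ % 2 == 0)  // false

// fold takes the None result first, then a function for the Some case.
val described = some.fold("nothing")(n => s"got ${n}")      // got 42
val describedNone = none.fold("nothing")(n => s"got ${n}")  // nothing
```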

Say that you have a User type:

case class User(id: String, name: String)

Let’s say that your system is a website. In the top, right corner of the site, you want to display the name of the user who is currently logged in. You need a function that returns the user currently signed in:

val authenticatedUser: () => Option[User] = // ...

Now why would this function return an Option? Well, the user browsing your site may not have logged in at this point. If that’s the case, then the current user is None. So then, the code to get the name of the current user is this:

val nameOpt = authenticatedUser().map(_.name)
// nameOpt: Option[String]

Notice how the type of nameOpt is Option[String]. In other words, assuming that authenticatedUser() returns a Some[User], Scala knows that the User.name property is a String. Nevertheless it propagates the uncertainty, if you will, along through the map operation. Any operations you do on an Option only happen if the Option is a Some. Otherwise, Scala happily ignores the operation and continues to propagate the None.

OK, so now that we have a final result Option, what do we do with it? That’s where Option.getOrElse() comes in:

val name = nameOpt getOrElse "Guest"
// name: String

If nameOpt is a Some, name is the value in the Some. If nameOpt is a None, then name is "Guest". You can test both cases in the Scala REPL. To see it work in the case of a Some:

val authenticatedUser: () => Option[User]  =
    () => Some(User("bcollins", "Brad Collins"))
val name = authenticatedUser()
             .map(_.name)         // Some("Brad Collins")
             .getOrElse("Guest")  // "Brad Collins"
// name: String = Brad Collins

And then to see it with a None:

val authenticatedUser: () => Option[User]  =
    () => None
val name = authenticatedUser()
             .map(_.name)         // None
             .getOrElse("Guest")  // "Guest"
// name: String = Guest

Unraveling the Options

Remember how the collections API defines the flatMap method? For instance, this is the List.flatMap() method:

// signature slightly modified for readability
def flatMap[B](f: (A) ⇒ List[B]): List[B]

It’s like map, but flatMap takes a transformation function that takes an element and returns a list of values rather than just a single value. Then instead of returning a list of lists, flatMap flattens them into a single list.
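
A small side-by-side comparison makes the difference concrete. The word list is just an example:

```scala
val words = List("tom", "jerry")

// map: one List[Char] per word, so the result is a list of lists.
val nested = words.map(w => w.toList)
// nested: List[List[Char]] = List(List(t, o, m), List(j, e, r, r, y))

// flatMap: the same inner lists, flattened into a single list.
val flat = words.flatMap(w => w.toList)
// flat: List[Char] = List(t, o, m, j, e, r, r, y)
```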

Option also has a flatMap method. It takes a function that returns an Option and, instead of returning a nested Option (i.e., an Option[Option[B]]), it just returns an Option[B]:

def flatMap[B](f: (A) ⇒ Option[B]): Option[B]

Maybe instead of having an authenticatedUser() function, your app just has a session property. To get the name of the current user, you have to walk the property tree from the application to the session to the user and then finally the name. Application.session is an Option to indicate that there may be no valid session. Session.user is an Option to indicate that there may be no user signed in. Finally, maybe we don’t even force a user to have a name, just a user ID:

case class User(id: String, name: Option[String] = None)
case class Session(user: Option[User] = None)
case class Application(session: Option[Session] = None)

You walk the property tree like this:

val nameOpt = app.session
                .flatMap(_.user)
                .flatMap(_.name)
val name = nameOpt getOrElse "Guest"

Now if any property is None, from session to user to name, nameOpt is None, and name is "Guest".
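
To try every case in the REPL, here is a self-contained sketch that repeats the case classes and wraps the chain in a hypothetical displayName helper:

```scala
case class User(id: String, name: Option[String] = None)
case class Session(user: Option[User] = None)
case class Application(session: Option[Session] = None)

// displayName is a hypothetical helper wrapping the chain above.
def displayName(app: Application): String =
  app.session
    .flatMap(_.user)
    .flatMap(_.name)
    .getOrElse("Guest")

val noSession = displayName(Application())
// noSession: String = Guest

val noUser = displayName(Application(Some(Session())))
// noUser: String = Guest

val noName = displayName(
  Application(Some(Session(Some(User("bcollins"))))))
// noName: String = Guest

val named = displayName(
  Application(Some(Session(Some(
    User("bcollins", Some("Brad Collins")))))))
// named: String = Brad Collins
```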