Scala Saturday – The Stream.takeWhile Method

When you want a set number of items from the beginning of a stream, you use Stream.take. When you don’t care so much about the number, but rather the nature of the items, use Stream.takeWhile.

Stream.takeWhile starts at the beginning of the stream and applies a predicate to each item, one by one. It returns each item in a new stream until it reaches an item that does not meet the predicate:

Stream.takeWhile takes items from the beginning of a stream until it reaches an item that does not meet the given predicate.
Taking Items from a Stream While a Predicate Holds

Note that Stream.takeWhile does not return the item that fails to meet the predicate. It stops with the last item that meets the predicate.

Perhaps you have a sensor feeding you temperature readings once per minute:

import java.time.{LocalDateTime, Month}

case class Reading(
  temperature: Double, 
  timestamp: LocalDateTime)

val readings = Stream(
  Reading(89.5, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 0)),
  Reading(90.1, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 1)),
  Reading(89.9, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 2)),
  Reading(90.0, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 3)),
  Reading(90.1, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 4)),
  Reading(-1000.0, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 5)),
  Reading(-1000.0, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 6)),
  Reading(90.2, 
    LocalDateTime.of(2015, Month.JULY, 19, 10, 7))
)

Now to get all readings up until the thermometer returns the data-not-valid indicator, employ Stream.takeWhile:

val valid = readings takeWhile { _.temperature != -1000.0 }

// valid: scala.collection.immutable.Stream[Reading] = 
//   Stream(
//     Reading(89.5,2015-07-19T10:00), 
//     Reading(90.1,2015-07-19T10:01), 
//     Reading(89.9,2015-07-19T10:02), 
//     Reading(90.0,2015-07-19T10:03), 
//     Reading(90.1,2015-07-19T10:04)
//   )

Finally, Stream.takeWhile doesn’t balk if it never reaches an element that fails to meet the predicate:

val all = Stream(1,3,4,7) takeWhile { _ < 10 }
// all: scala.collection.immutable.Stream[Int] = 
//   Stream(1, 3, 4, 7)

As with Stream.take, the other collection types also define takeWhile:

F# Friday – The Seq.takeWhile Function

When you want a set number of items from the beginning of a sequence, you use Seq.take or Seq.truncate. When you don’t care so much about the number, but rather the nature of the items, use Seq.takeWhile.

Seq.takeWhile starts at the beginning of the sequence and applies a predicate to each item, one by one. It returns each item in a new sequence until it reaches an item that does not meet the predicate:

Seq.takeWhile takes items from the beginning of a sequence until it reaches an item that does not meet the given predicate.
Taking Items from a Sequence While a Predicate Holds

Note that Seq.takeWhile does not return the item that fails to meet the predicate. It stops with the last item that meets the predicate.

Perhaps you have a sensor feeding you temperature readings once per minute:

type Reading = {
    Temperature : float
    Timestamp : DateTime
}

let readings = 
    seq [
        { Temperature = 89.5
          Timestamp = DateTime(2015, 07, 19, 10, 0, 0) }
        { Temperature = 90.1
          Timestamp = DateTime(2015, 07, 19, 10, 1, 0) }
        { Temperature = 89.9
          Timestamp = DateTime(2015, 07, 19, 10, 2, 0) }
        { Temperature = 90.0
          Timestamp = DateTime(2015, 07, 19, 10, 3, 0) }
        { Temperature = 90.1
          Timestamp = DateTime(2015, 07, 19, 10, 4, 0) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 5, 0) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 6, 0) }
        { Temperature = 90.2
          Timestamp = DateTime(2015, 07, 19, 10, 7, 0) }
    ]

Now to get all readings up until the thermometer returns the data-not-valid indicator, employ Seq.takeWhile:

let valid = 
    readings
    |> Seq.takeWhile (fun r -> r.Temperature <> -1000.0)

// val valid: seq<Reading> =
//   [{Temperature = 89.5;
//     Timestamp = 7/19/2015 10:00:00 AM};
//    {Temperature = 90.1;
//     Timestamp = 7/19/2015 10:01:00 AM};
//    {Temperature = 89.9;
//     Timestamp = 7/19/2015 10:02:00 AM};
//    {Temperature = 90.0;
//     Timestamp = 7/19/2015 10:03:00 AM};
//    {Temperature = 90.1;
//     Timestamp = 7/19/2015 10:04:00 AM}]

Finally, even though Seq.take throws an exception if you ask it for more elements than it can provide, Seq.takeWhile does not balk if it never reaches an element that fails to meet the predicate:

let all = seq [1;3;4;7]
          |> Seq.takeWhile (fun n -> n < 10)
// val all : seq<int> [1; 3; 4; 7]

As of F# 4.0, there are versions of takeWhile in the Array and List modules, but as of this writing, the documentation on MSDN does not yet include them.

Scala Saturday – The Stream.take Method

Sometimes you have a collection of items, and you only want the first so many items. You may not know or even care what the first n items are, you just know that you want the first n items. Stream.take can do that for you.

To Infinity … and Beyond!

OK, it’s unusual that you just don’t care what items you get from a sequence, but there are occasions: for instance, random numbers. You really don’t care what the number is. You just don’t want it to be the same number every time, and perhaps you want it to fall within a certain range. Other than that, you don’t care: you just want a number!

val seed = new java.util.Date().hashCode
val rand = new scala.util.Random(seed)
val someNum = rand.nextInt
// someNum: Int = 1717783198  (well, this time anyway)

Now, assume that you have an application that needs a steady stream of random numbers. (Maybe even the number of random numbers you want is itself random!) Wouldn’t it be great to turn your random number into a stream?

Scala streams are lazy data structures. That means that as you iterate over the stream, it produces the next item on demand. It may not even have calculated what the next item is until you ask for it. In fact, stream can be infinite! How do you have an infinite stream? Well, back to the random number generator.

You can call rand.nextInt until the cows come home and just keep getting values. Well, the Stream companion object gives us a way to turn rand (or any function that calculates a value from an input) into an infinite stream—Stream.iterate:

let rands = Stream.iterate(rand.nextInt) {
  _ => rand.nextInt
}

Stream.iterate takes a starting value of type A and a function that takes a value of type A and returns another value of type A—an integer in this case. Now you can call Stream.take on rands, and get as many or as few values as you may want:

val taken = rands take 5
// taken: scala.collection.immutable.Stream[Int] = 
//   Stream(-2112282457,
//     -1552737819,
//     -745613730,
//     795338080,
//     -219200225)

Additionally, Scala’s #:: operator allows you to create the infinite stream of random numbers in another way:

def rands: Stream[Int] = rand.nextInt() #:: rands
val taken = rands take 5
// taken: scala.collection.immutable.Stream[Int] = 
//   Stream(-575330347,
//     631339635,
//     -911607409,
//     -614867864,
//     -1194280908)

In this case, you’re defining a function that calls itself recursively to provide the next element in the stream.

Use whichever approach you find the more natural.

Going for the Gold

There are, of course, times when your data set is not infinite. You happen not to need all of them, but only the first few. I mentioned early on how that you may not know or care what the items are. A better way of saying it is that you (or someone) have likely already done the work to prepare the data set—processed it, sorted it, whatever. Now that you have done all that preprocessing, you want the first few items.

Perhaps you are writing software to determine the gold, silver, and bronze medal winners in the Olympic Games. You have a type representing each competitor:

case class Competitor(name: String, score: Double)

You have the list of competitors for the 2012 men’s 110-meter hurdles, and they are already sorted according to ranking:

val competitors = Stream(
  Competitor("Aries MERRITT", 12.92),
  Competitor("Jason RICHARDSON", 13.04),
  Competitor("Hansle PARCHMENT", 13.12),
  Competitor("Lawrence CLARKE", 13.39),
  Competitor("Ryan BRATHWAITE", 13.40),
  Competitor("Orlando ORTEGA", 13.43),
  Competitor("Lehann FOURIE", 13.53)
)

Now to get the medalists, take the first three records:

val medalists = competitors take 3
// medalists: scala.collection.immutable.Stream[Competitor] =
//   Stream(
//     Competitor(Aries MERRITT,12.92),
//     Competitor(Jason RICHARDSON,13.04)
//     Competitor(Hansle PARCHMENT,13.12) )

Take It to the Limit

What happens, though, when you have a stream with fewer items than you attempt to take?

val numbersOfTheCounting = Stream(1,2,3) take 5
numbersOfTheCounting foreach println
// 1
// 2
// 3

Well, that’s nice. You ask for five, so the stream gives you all it has. If that happens to be fewer than you asked for, hey, no worries.

I should note that Stream is not unique in having a take method. The strict, sequential collection types—Array, List, Seq, and Vector—all have a take method that functions the same as Stream.take does. Even Map has a take method.

F# Friday – The Seq.take and Seq.truncate Functions

Sometimes you have a collection of items, and you only want the first so many items. You may not know or even care what the first n items are, you just know that you want the first n items. Seq.take can do that for you.

To Infinity … and Beyond!

OK, it’s unusual that you just don’t care what items you get from a sequence, but there are occasions: for instance, random numbers. You really don’t care what the number is. You just don’t want it to be the same number every time, and perhaps you want it to fall within a certain range. Other than that, you don’t care: you just want a number!

let seed = System.DateTime.UtcNow.GetHashCode()
let rand = System.Random(seed)
let someNum = rand.Next()
// val someNum : int = 689587617  (well, this time anyway)

Now, assume that you have an application that needs a steady stream of random numbers. (Maybe even the number of random numbers you want is itself random!) Wouldn’t it be great to turn your random number into a sequence?

F# sequences are lazy data structures. That means that as you iterate over the sequence, it produces the next item on demand. It may not even have calculated what the next item is until you ask for it. In fact, sequences can be infinite! How do you have an infinite sequence? Well, back to the random number generator.

You can call rand.Next() until the cows come home and just keep getting values. Well, the Seq module gives us a way to turn rand (or any function that calculates a value from an input) into an infinite sequence—Seq.initInfinite:

let rands = Seq.initInfinite (fun _ -> rand.Next())

Seq.initInfinite takes a function that takes an integer and returns a value of the type you are interested in—an integer in this case. Now you can call Seq.take on rands, and get as many or as few values as you may want:

let taken = rands |> Seq.take 5
// val taken : 
//   seq [
//     974631199
//     1778702212
//     1311642930
//     1549454908
//     1138487539
//   ]

Additionally, F#’s sequence expressions allow you to create the infinite sequence of random numbers in what may be a slightly more readable fashion:

let rands = seq { while true do yield rand.Next() }
let taken = rands |> Seq.take 5
// val taken : 
//   seq [
//     1882861480
//     469452280
//     466017706
//     417690089
//     166247112
//   ]

Use whichever approach you find the more natural.

Going for the Gold

There are, of course, times when your data set is not infinite. You happen not to need all of them, but only the first few. I mentioned early on how that you may not know or care what the items are. A better way of saying it is that you (or someone) have likely already done the work to prepare the data set—processed it, sorted it, whatever. Now that you have done all that preprocessing, you want the first few items.

Perhaps you are writing software to determine the gold, silver, and bronze medal winners in the Olympic Games. You have a type representing each competitor:

type Competitor = {
    Name : string
    Score : float
}

You have the list of competitors for the 2012 men’s 110-meter hurdles, and they are already sorted according to ranking:

let competitors = 
    seq [
        { Name = "Aries MERRITT"; Score = 12.92 }
        { Name = "Jason RICHARDSON"; Score = 13.04 }
        { Name = "Hansle PARCHMENT"; Score = 13.12 }
        { Name = "Lawrence CLARKE"; Score = 13.39 }
        { Name = "Ryan BRATHWAITE"; Score = 13.40 }
        { Name = "Orlando ORTEGA"; Score = 13.43 }
        { Name = "Lehann FOURIE"; Score = 13.53 }
    ]

Now to get the medalists, take the first three records:

let medalists = competitors |> Seq.take 3
// val medalists: seq<Competitor> =
//   [
//     {Name = "Aries MERRITT"; Score = 12.92;}
//     {Name = "Jason RICHARDSON"; Score = 13.04;}
//     {Name = "Hansle PARCHMENT"; Score = 13.12;}
//   ]

Take It to the Limit

What happens, though, when you have a sequence with fewer items than you attempt to take?

let numbersOfTheCounting = seq [1;2;3] |> Seq.take 5
numbersOfTheCounting |> Seq.iter (printfn "%A")
// 1
// 2
// 3
// System.InvalidOperationException: 
//   The input sequence has an insufficient 
//   number of elements.

Oops! That’s not good! If you ask for more items than the sequence contains, you get an exception. I have to admit that I find this annoying. To me, Seq.take ought to return as many items as it can. If it should happen to contain fewer than you’ve asked for, oh well, no big deal; here are the ones it’s got. As a point of comparison, the take functions in Scala, Clojure, and Haskell all behave that way.

The good news is that there is an F# function that does behave that way: Seq.truncate. So if you want to use a function that’s a little more forgiving, just use Seq.truncate:

let numbersOfTheCounting = seq [1;2;3] |> Seq.truncate 5
numbersOfTheCounting |> Seq.iter (printfn "%A")
// 1
// 2
// 3

As of F# 4.0, there are versions of take and truncate in the Array and List modules, but as of this writing, the documentation on MSDN does not yet include them.

Scala Saturday – The List.++ Method

Occasionally, you need to combine two lists (or arrays or vectors or sequences) into one. List.++ to the rescue! (Along with Array.++, and Vector.++, and Seq.++.)

A widow Carol has three daughters: Marcia, Jan, and Cindy.

val ladies = List("Carol", "Marcia", "Jan", "Cindy")

A widower Mike has three sons: Greg, Peter, and Bobby.

val fellas = List("Mike", "Greg", "Peter", "Bobby")

That lovely lady meets that fellow, and they know it is much more than a hunch. They marry and form a family:

val bunch = ladies ++ fellas
// bunch: List[String] = 
//   List(Carol, Marcia, Jan, Cindy,
//        Mike, Greg, Peter, Bobby)

Of course, as you probably have guessed, order matters. Let’s reverse the arguments:

val bunch2 = fellas ++ ladies
// bunch2: List[String] = 
//   List(Mike, Greg, Peter, Bobby,
//        Carol, Marcia, Jan, Cindy)

You can also use ++ to chain a series of lists together:

val hobbits = List("Frodo", "Sam", "Pippin", "Merry")
val men = List("Aragorn", "Boromir")
val dwarves = List("Gimli")
val elves = List("Legolas")
val maiar = List("Gandalf")
val fellowship = hobbits ++ men ++ dwarves ++
                 elves ++ maiar
// fellowship: List[String] = 
//   List(Frodo, Sam, Pippin, Merry, Aragorn, 
//        Boromir, Gimli, Legolas, Gandalf)

And there’s nothing special about ++ in this regard. Because Scala allows infix notation, you can similarly chain other operations together:

val fellowslip = 
  hobbits ++ men ++ dwarves ++
  elves ++ maiar filter {
    !_.startsWith("G")
  } map {
    _.toUpperCase
  }
// fellowslip: List[String] =
//   List(FRODO, SAM, PIPPIN, MERRY,
//        ARAGORN, BOROMIR, LEGOLAS)

F# Friday – The List.append Function

Occasionally, you need to combine two lists (or arrays or sequences) into one. List.append to the rescue! (Along with Array.append and Seq.append.)

A widow Carol has three daughters: Marcia, Jan, and Cindy.

let ladies = ["Carol"; "Marcia"; "Jan"; "Cindy"]

A widower Mike has three sons: Greg, Peter, and Bobby.

let fellas = ["Mike"; "Greg"; "Peter"; "Bobby"]

That lovely lady meets that fellow, and they know it is much more than a hunch. They marry and form a family:

let bunch = List.append ladies fellas
// val bunch : string list =
//   ["Carol"; "Marcia"; "Jan"; "Cindy";
//    "Mike"; "Greg"; "Peter"; "Bobby"]

Of course, as you probably have guessed, order matters. Let’s reverse the arguments:

let bunch2 = List.append fellas ladies
// val bunch2 : string list =
//   ["Mike"; "Greg"; "Peter"; "Bobby";
//    "Carol"; "Marcia"; "Jan"; "Cindy"]

A note of caution: Idiomatic F# takes heavy advantage of the forward pipeline operator (|>). It may be tempting to do this, but watch out:

let yikes = [1; 2]
            |> List.append [3; 4]
// val yikes : int list = [3; 4; 1; 2]

Probably not what you wanted. This is one of those situations when the backward pipeline operator (<|) can step in to help you preserve readability:

let inOrder = [1; 2]
              |> List.append <| [3; 4]
// val inOrder : int list = [1; 2; 3; 4]

And you can even continue fluidly applying additional operations:

let inOrderAndTrebled = [1; 2]
                        |> List.append <| [3; 4]
                        |> List.map ((*) 3)
// val inOrderAndTrebled : int list = [3; 6; 9; 12]

Update: Thanks to Isaac Abraham for reminding me that there is a list concatenation operator (@), so this works and is nicely readable, too:

let concatenated = [1; 2] @ [3; 4]
// val concatenated : int list = [1; 2; 3; 4]

Note that @ works for lists only, not sequences or arrays.

Scala Saturday – Pattern Matching, Part 4: Extractors

This week we take a look under the hood, as it were, of pattern matching in Scala. An extractor is any object that defines an unapply() method that Scala’s match expressions can use to evaluate whether the input value is a match or not.

One reason case classes are so convenient for pattern matching is because they define an unapply() method along with the other handy tools Scala gives you when you define a case class. You don’t have to define a case class to get an unapply() method, though. You can define one yourself and enjoy the benefits.

Boolean Extractors

Extractors allow you to give a readable name to a case that effectively communicates the nature of the match, but hide some of the clutter that can threaten the readability of your code.

A classic example is determining whether an integer is even or odd. Now you could do that this way:

n % 2 match {
  case 0 => n -> "even"
  case _ => n -> "odd"
}

Now n % 2 is a simple expression, and its use common enough in computer science that most of us recognize right away, whenever we see it, “Oh, right, even or odd.” But there is just the slightest context switch between thinking mathematically and thinking conceptually, i.e., determining intent. And that context switch slows us down. A more complex expression can slow us down even more.

Compare the above (admittedly simple) match expression to what you can do with a couple of extractors. First, define the pair of extractors this way:

object Even {
  def unapply(n: Int) = n % 2 == 0
}
object Odd {
  def unapply(n: Int) = n % 2 == 1
}

This illustrates one way you can create an extractor: Define unapply() so that it returns a Boolean. A true value indicates a match, false a failure.

Now after defining the extractors, you can use them in a match expression like this:

def oddOrEven(n: Int) = {
  n match {
    case Even() => n -> "even"
    case Odd() => n -> "odd"
  }
}

(1 to 10) map oddOrEven foreach println

// (1,odd)
// (2,even)
// (3,odd)
// (4,even)
// (5,odd)
// (6,even)
// (7,odd)
// (8,even)
// (9,odd)
// (10,even)

Look at the difference in readability. There is no context switch. It reads much smoother than the original match expression. You can see that oddOrEven takes a value, n, and returns a result based on whether n is even or odd. You don’t have to leave the realm of the conceptual to think mathematically and then turn right back around to think conceptually again.

A point of emphasis: When building a match expression, you need to cover all the bases and define a return condition for every case. That is easy for the even/odd test: There are only two cases.

But what if there are several cases? Take the wavelengths of colors in the spectrum of visible light. Each color corresponds to a range of wavelengths:

Color Wavelength Ranges
Color Wavelength
Red 620–750 nm
Orange 590–620 nm
Yellow 570–590 nm
Green 495–570 nm
Blue 450–495 nm
Violet 380–450 nm

No problem, right? Just define a quick little set of extractors:

object Red {
  def unapply(λ: Int) = 620 <= λ && λ < 750
}
object Orange {
  def unapply(λ: Int) = 590 <= λ && λ < 620
}
object Yellow {
  def unapply(λ: Int) = 570 <= λ && λ < 590
}
object Green {
  def unapply(λ: Int) = 495 <= λ && λ < 570
}
object Blue {
  def unapply(λ: Int) = 450 <= λ && λ < 495
}
object Violet {
  def unapply(λ: Int) = 380 <= λ && λ < 450
}

Then put those extractors to use in a function:

def colorOfLight(λ: Int) = {
  λ match {
    case Red() => s"$λ nm" -> "red"
    case Orange() => s"$λ nm" -> "orange"
    case Yellow() => s"$λ nm" -> "yellow"
    case Green() => s"$λ nm" -> "green"
    case Blue() => s"$λ nm" -> "blue"
    case Violet() => s"$λ nm" -> "violet"
  }
}

Now this line should run like a charm, right?

List(800,700,600,580,500,475,400,350)
  .map(colorOfLight)
  .foreach(println)

// Exception in thread "main" scala.MatchError: 
//   800 (of class java.lang.Integer)
// ...

Whoa, what happened? There is radiation outside the visible spectrum; λ could be greater than 750 nm (as it is in this case of 800 nm) or less than 380 nm. You therefore need a catch-all case to cover the values that are outside the explicit cases:

def colorOfLight(λ: Int) = {
  λ match {
    case Red() => s"$λ nm" -> "red"
    case Orange() => s"$λ nm" -> "orange"
    case Yellow() => s"$λ nm" -> "yellow"
    case Green() => s"$λ nm" -> "green"
    case Blue() => s"$λ nm" -> "blue"
    case Violet() => s"$λ nm" -> "violet"
    case _ => s"$λ nm" -> "invisible"
  }
}

Now your little three-liner really does run like a charm:

List(800,700,600,580,500,475,400,350)
  .map(colorOfLight)
  .foreach(println)

// (800 nm,invisible)
// (700 nm,red)
// (600 nm,orange)
// (580 nm,yellow)
// (500 nm,green)
// (475 nm,blue)
// (400 nm,violet)
// (350 nm,invisible)

Option Extractors

Another way to indicate a match with an extractor is to return an Option. If the input meets the criteria sought, you return the case name in a Some. If not, you return a None. Furthermore, you can capture the matching value or a collection of values based on calculations the extractor performs on the input value.

If you just need to capture one value, return an Option[A]. You could modify the even/odd case above so that you capture the input variable in another variable name:

object Even {
  def unapply(n: Int) = if (n % 2 == 0) Option(n) else None
}
object Odd {
  def unapply(n: Int) = if (n % 2 == 1) Option(n) else None
}

def oddOrEven(n: Int) = {
  n match {
    case Even(m) => m -> "even"
    case Odd(m) => m -> "odd"
  }
}

(1 to 10) map oddOrEven foreach println

// (1,odd)
// (2,even)
// (3,odd)
// (4,even)
// (5,odd)
// (6,even)
// (7,odd)
// (8,even)
// (9,odd)
// (10,even)

To capture multiple values, return an Option that contains a tuple of the number of values you want to capture.

Perhaps you are a clerk in a department store. You recommend that customers who are at least 6′ (72 inches) tall and have at least a 40-inch waist go to the Big & Tall section. Others you greet according to their proportions.

First, define a Measurements type with height and waist properties:

class Measurements(val height: Int, val waist: Int)

Then define three extractors:

  1. one to detect whether a customer meets the “big” criterion,
  2. a second to detect whether he meets the “tall” criterion, and
  3. a third to detect whether he meets both criteria.
object Big {
  def unapply(m: Measurements) =
    if (m.waist >= 40) Some(m.waist) else None
}
object Tall {
  def unapply(m: Measurements) =
    if (m.height >= 72) Some(m.height) else None
}
object BigAndTall {
  def unapply(m: Measurements) =
    (Big.unapply(m), Tall.unapply(m)) match {
      case (Some(w), Some(h)) => Some(w,h)
      case _ => None
    }
}

Now that you have the extractors defined, you can use them in a match expression that extracts the waist and height on a successful match:

def sizeUp(m: Measurements) = {
  m match {
    case BigAndTall(w,h) =>
      s"$w-inch waist and $h inches tall: " +
        "Let me show you to our big & tall section"
    case Tall(h) => s"$h inches tall: How's the weather up there?"
    case Big(w) => s"$w-inch waist: Big fella, ain'tcha?"
    case _ => "How may I help you?"
  }
}

Now you run some customers through the sizeUp function:

  val me = new Measurements(76, 36)
  val shrimp = new Measurements(58, 28)
  val hoss = new Measurements(80, 46)
  val tubby = new Measurements(63, 42)

  List(me, shrimp, hoss, tubby)
    .map(sizeUp)
    .foreach(println)
// 76 inches tall: How's the weather up there?
// How may I help you?
// 46-inch waist and 80 inches tall: 
//   Let me show you to our big & tall section
// 42-inch waist: Big fella, ain'tcha?

Sequence Extractors

Finally, you can extract a variable number of values and match only on elements that meet your conditions while ignoring the rest. Sequence extractors define an unapplySeq() method rather than unapply(). The unapplySeq() method must return an Option[Seq[A]].

You could build an extractor that gets the prime factors of an integer:

object Factors {
  def unapplySeq(n: Int): Option[Seq[Int]] = {
    @tailrec
    def go(factors: List[Int], candidates: Seq[Int]): List[Int] = {
      if (candidates.isEmpty) {
        factors
      } else {
        val head = candidates.head
        val tail = candidates.tail
        if (n % head == 0) {
          go(head :: factors, tail)
        } else {
          go(factors, tail.filter(_ % head != 0))
        }
      }
    }

    val factors = n :: go(List(1), (2 to (n/2)).toSeq)
    if (factors.isEmpty) None else Some(factors.reverse)
  }
}

I won’t explain the (rather brute force) factorization method above. Suffice it to say that it returns the prime factors in order. For example, for 15, it returns {1,3,5,15}. Now let’s say that we want to match on numbers that are divisible by three, but not two. That means we’re looking for factor sets that follow this pattern: {1,3,…}. Here is how you use Factors to match that pattern:

def divBy3Not2(n: Int) = n match {
  case Factors(1,3,_*) =>
    s"$n: Divisible by three, but not two"
  case _ => n.toString
}

The _* wildcard tells Scala that you don’t really care what follows. If the first two elements match, then it’s a match. Now you can put divBy3Not2 to use:

List(2,6,9,10,15,54)
  .map(divBy3Not2)
  .map(println)
// 2
// 6
// 9: Divisible by three, but not two
// 10
// 15: Divisible by three, but not two
// 54

F# Friday – Pattern Matching, Part 4: Active Patterns

Active patterns and partial active patterns round out the F# developer’s toolbox for pattern matching.

These Patterns Are Highly Active

An active pattern allows you to define a set of named cases that you can use in your pattern matching expressions. The active recognizer, the type of function that defines active patterns, encapsulates pattern matching logic and associates it with the case name(s). The advantage is that you can give a readable name to a case that effectively communicates the nature of the match, but hide some of the clutter that can threaten the readability of your code.

A classic example is determining whether an integer is even or odd. Now you could do that this way:

match n % 2 with
| 0 -> n, "even"
| _ -> n, "odd"

Now n % 2 is a simple expression, and its use common enough in computer science that most of us recognize right away, whenever we see it, “Oh, right, even or odd.” But there is just the slightest context switch between thinking mathematically and thinking conceptually, i.e., determining intent. And that context switch slows us down. A more complex expression can slow us down even more.

Compare the above (admittedly simple) match expression to what you can do with an active pattern. First, define the active pattern this way:

let (|Even|Odd|) n =
    match n % 2 with
    | 0 -> Even
    | _ -> Odd

Where you usually give a name to the function, you now have a set of pipe-delimited names between what are known as banana clips (| |). Those names are case names that you will be able to use in match expressions. Finally, in the body of the active recognizer, you define a return path for each case name.

Now after defining the active pattern, you can use it in a match expression like this:

let oddOrEven n =
    match n with
    | Even -> n, "even"
    | Odd -> n, "odd"

seq { 1..10 }
|> Seq.map oddOrEven
|> Seq.iter (printfn "%O")
// (1, odd)
// (2, even)
// (3, odd)
// (4, even)
// (5, odd)
// (6, even)
// (7, odd)
// (8, even)
// (9, odd)
// (10, even)

Look at the difference in readability. There is no context switch. It reads much smoother than the original match expression. You can see that oddOrEven takes a value, n, and returns a result based on whether n is even or odd. You don’t have to leave the realm of the conceptual to think mathematically and then turn right back around to think conceptually again.

A point of emphasis: When defining an active pattern, you need to cover all the bases and define a return condition for every case. (In that regard, an active pattern is something like an algebraic data type.) That is easy for the even/odd test: There are only two cases.

But what if there are several cases? Take the wavelengths of colors in the spectrum of visible light. Each color corresponds to a range of wavelengths:

Color Wavelength Ranges
Color Wavelength
Red 620–750 nm
Orange 590–620 nm
Yellow 570–590 nm
Green 495–570 nm
Blue 450–495 nm
Violet 380–450 nm

No problem, right? Just define a quick little active pattern:

let (|Red|Orange|Yellow|Green|Blue|Violet|) λ =
    match λ with
    | x when 620 <= x && x < 750 -> Red
    | x when 590 <= x && x < 620 -> Orange
    | x when 570 <= x && x < 590 -> Yellow
    | x when 495 <= x && x < 570 -> Green
    | x when 450 <= x && x < 495 -> Blue
    | x when 380 <= x && x < 450 -> Violet

But wait! The compiler tells you, “Incomplete pattern matches on this expression.” That is because there is radiation outside the visible spectrum; λ could be greater than 750 nm or less than 380 nm. You therefore need a catch-all case to cover the values that are outside the explicit cases:

let (|Red|Orange|Yellow|Green|Blue|Violet|Invisible|) λ =
    match λ with
    | x when 620 <= x && x < 750 -> Red
    | x when 590 <= x && x < 620 -> Orange
    | x when 570 <= x && x < 590 -> Yellow
    | x when 495 <= x && x < 570 -> Green
    | x when 450 <= x && x < 495 -> Blue
    | x when 380 <= x && x < 450 -> Violet
    | _ -> Invisible

Now you can use the active pattern like this:

let colorOfLight λ =
    match λ with
    | Red -> (sprintf "%d nm" λ), "red"
    | Orange -> (sprintf "%d nm" λ), "orange"
    | Yellow -> (sprintf "%d nm" λ), "yellow"
    | Green -> (sprintf "%d nm" λ), "green"
    | Blue -> (sprintf "%d nm" λ), "blue"
    | Violet -> (sprintf "%d nm" λ), "violet"
    | Invisible -> (sprintf "%d nm" λ), "invisible"

seq [800; 700; 600; 580; 500; 475; 400; 350]
|> Seq.map colorOfLight
|> Seq.iter (printfn "%O")
// (800 nm, invisible)
// (700 nm, red)
// (600 nm, orange)
// (580 nm, yellow)
// (500 nm, green)
// (475 nm, blue)
// (400 nm, violet)
// (350 nm, invisible)

One final word before moving on: There is a limit to the number of cases in an active pattern. It can contain no more than seven cases. Your spectrum-of-light active pattern just so happens to go right up to that limit. If, for instance, you had needed to account for an indigo band of wavelengths between blue and violet, you would have had to take a different approach.

Active In Part

A partial active pattern is similar to an active pattern, but it does not cover the entire input domain. If the input meets the criteria sought, you return the case name in a Some. If not, you return a None.

Perhaps you are a clerk in a department store. You recommend that customers who are at least 6′ (72 inches) tall and have at least a 40-inch waist go to the Big & Tall section. Others you greet according to their proportions.

First, define a Measurements type:

type Measurements = {
    Height : int
    Waist : int
}

Then define partial active patterns: one to detect whether a customer meets the “big” criterion and another to detect whether he meets the “tall” criterion.

let (|Big|_|) (m : Measurements) =
    if m.Waist >= 40 then Some Big else None
let (|Tall|_|) (m : Measurements) =
    if m.Height >= 72 then Some Tall else None

Now that you have the patterns defined, you can use them in a match expression:

let sizeUp m =
    match m with
    | Big & Tall -> 
        "Let me show you to the big & tall section"
    | Big -> 
        "Big fella, ain'tcha?"
    | Tall -> 
        "How's the weather up there?"
    | _ -> 
        "May I help you, sir?"

Notice how you can use partial active patterns together and by themselves. Now you run some customers through the sizeUp function:

let me = { Height = 76; Waist = 36 }
let shrimp = { Height = 58; Waist = 28 }
let hoss = { Height = 80; Waist = 46 }
let tubby = { Height = 63; Waist = 42 }

seq [me; shrimp; hoss; tubby]
|> Seq.map sizeUp
|> Seq.iter (printfn "%s")
// How's the weather up there?
// May I help you, sir?
// Let me show you to the big & tall section
// Big fella, ain'tcha?

Parameterizing on the Fly

Finally, active patterns may accept more than just the one parameter they test against. They can accept additional parameters in order to configure or modify their behavior.

You have a list of test scores, and you want to convert each to a letter grade. You can start by defining the partial active pattern AtLeast that takes a number, n, and returns a match only if evaluated against a score that is greater than or equal to n:

let (|AtLeast|_|) n score =
    if score >= n then Some AtLeast else None

Now build a grade function using AtLeast:

let grade =
    function
    | AtLeast 90 -> "A"
    | AtLeast 80 -> "B"
    | AtLeast 70 -> "C"
    | AtLeast 60 -> "D"
    | _ -> "F"

Just a reminder that order matters in match expressions. If you had put the AtLeast 60 match case before the AtLeast 90 match case, then AtLeast 90 would never match. AtLeast 60 would always match first.

Evaluating a list of scores is now straightforward:

seq [95; 31; 72; 67; 89]
|> Seq.map (fun n -> n, grade n)
|> Seq.iter (printfn "%O")
// (95, A)
// (31, F)
// (72, C)
// (67, D)
// (89, B)

Scala Saturday – Pattern Matching, Part 3: More Case Classes

Another tool in the pattern matching utility belt is disjoint unions. A disjoint union is a type that contains a small number of union cases. Each union case is treated as the same type as the disjoint union, but each case may comprise a different set of properties. You can build disjoint unions with Scala traits and case classes.

Rockin’ Disjoint (Unions)

You could use a disjoint union to represent the different cast type options when opening a socket to send UDP packets: broadcast, unicast, and multicast. If you are opening a socket to broadcast, you only need a port number. If you are opening a socket to unicast, you also need the IP address of the machine you are unicasting to. Finally, if you are opening a socket to multicast, you need an IP address (the multicast group), a port number, and optionally the ID of a network interface card if you want to target a specific network interface.

sealed trait SocketMeta
case class Broadcast(port: Int) extends SocketMeta
case class Unicast(addr: InetAddress, port: Int)
  extends SocketMeta
case class Multicast(addr: InetAddress, port: Int,
                     nicId: Option[String])
  extends SocketMeta

The SocketMeta trait serves as your disjoint union type. Define each union case as a case class that mixes in the SocketMeta trait. As you can see, each union case can have any number of properties, according to the familiar case class constructor syntax. Limit the number of union cases by marking SocketMeta as sealed. That way, all SocketMeta union cases must reside in this file.

Now it’s easy to write a function to build a socket based on the meta information:

def buildSocket(meta: SocketMeta) = {
  meta match {
    case Broadcast(port) =>
      // configure for broadcast
    case Unicast(ip, port) =>
      // configure for unicast
    case Multicast(ip, port, nic) =>
      // configure for multicast
  }
}

val s = buildSocket(Broadcast(5150))

Disjoint unions are good for times when the number of cases is small, and the number of properties in each case is small. For instance, you have likely worked with a particular disjoint union type many times. Option is essentially this:

sealed trait Option[+A]
case object None extends Option[Nothing]
case class Some[+A](get: A) extends Option[A]

(I know I have not covered case objects explicitly, but if you get the concept of a case class, case objects are pretty straightforward, too. They are singleton objects that you can use in pattern matching.)

Option is either some value or nothing at all; that’s it. There is no third alternative. It is very easy to cover all the bases in a fairly short block of code.

Option also illustrates one more thing about disjoint unions: union cases need not have any properties at all. None stands alone; it requires no properties.

Typing Lesson

It turns out that disjoint unions are a specific example of a more general kind of type: an algebraic data type. An algebraic data type is a composite type; that is, it is a combination of other types. An algebraic data type may comprise an infinite set of types or a finite set.

SocketMeta and Option both clearly consist of a finite number of types—a small finite number at that. As mentioned already, that’s what makes them so well suited to match expressions.

F# Friday – Pattern Matching, Part 3: Discriminated Unions

Another tool in the pattern matching utility belt is discriminated unions. A discriminated union is a type that contains a small number of union cases. Each union case is treated as the same type as the discriminated union, but each case may comprise a different set of properties.

Discriminating Tastes

You could use a discriminated union to represent the different cast type options when opening a socket to send UDP packets: broadcast, unicast, and multicast. If you are opening a socket to broadcast, you only need a port number. If you are opening a socket to unicast, you also need the IP address of the machine you are unicasting to. Finally, if you are opening a socket to multicast, you need an IP address (the multicast group), a port number, and optionally the ID of a network interface card if you want to target a specific network interface.

type SocketMeta =
| Broadcast of Port : int
| Unicast of Addr : IPAddress * Port : int
| Multicast of Addr : IPAddress * Port : int * 
               NicId : string option

Each case can have any number of properties. If there is more than one property, chain them together with tuple syntax using the asterisk (*).

Now it’s easy to write a function to build a socket based on the meta information:

let buildSocket meta =
    match meta with
    | Broadcast port ->
        // configure for broadcast
    | Unicast(ip, port) ->
        // configure for unicast
    | Multicast(ip, port, nic) ->
        // configure for multicast

let addr = IPAddress.Parse("10.9.01.25")
let s = buildSocket (Unicast (addr, 5150))

Discriminated unions are good for times when the number of cases is small, and the number of properties in each case is small. For instance, you have likely worked with a particular discriminated union type many times. Option is essentially this:

type Option<'T> =
| None
| Some of 'T

Option is either some value or nothing at all; that’s it. There is no third alternative. It is very easy to cover all the bases in a fairly short block of code.

Option also illustrates a couple of additional things about discriminated unions. First, union cases need not have any properties at all. None stands alone; it requires no properties. Second, naming the properties is optional. Some is so semantically clear, there is no need to give its single property a name. A type (which happens to be generic in the case of Option) is always required, but not a name.

Typing Lesson

It turns out that discriminated unions are a specific example of a more general kind of type: an algebraic data type. An algebraic data type is a composite type; that is, it is a combination of other types. An algebraic data type may comprise an infinite set of types or a finite set.

SocketMeta and Option both clearly consist of a finite number of types—a small finite number at that. As mentioned already, that’s what makes them so well suited to match expressions.