Scala Saturday – Stream.collect

Filtering over a sequence of values omits values that do not meet certain criteria. Mapping over a sequence of values transforms each value into another value. What if you could do both at the same time—filter out unwanted values, but transform the ones that are left? You can with Stream.collect. But first, you need to know about partial functions.

Partial Functions

A partial function is a function that has a limited domain, i.e., is not defined for every possible value of its input type, but only a subset.

The classic example is division. Division is undefined for a divisor of zero. In other words, m ÷ n is valid unless n = 0. So then, division is not defined for every number n. In this particular example, that’s not a big limitation on the domain, but it is nevertheless a limitation that prevents us from saying that division is defined for every possible n.

Scala has a PartialFunction type that allows you to represent a function that is only valid for a limited domain. Here is how you could represent integer division:

val divide = new PartialFunction[(Int,Int), Int] {
  override def isDefinedAt(x: (Int, Int)) = x._2 != 0
  override def apply(x: (Int, Int)) = x._1 / x._2
}

val quotient = divide(12, 4)
// quotient: Int = 3

Partial functions have the apply method that other functions have so that you can execute them with parentheses: divide(12, 4). They also have an isDefinedAt method so that you can ask the partial function, “Hey, can you handle this input?” That way, you can use an if-else expression to return a default or some other value:

val fine = if (divide.isDefinedAt(12, 4)) {
  divide(12, 4)
} else Int.MaxValue
// fine: Int = 3

val meh = if (divide.isDefinedAt(12, 0)) {
  divide(12, 0)
} else Int.MaxValue
// meh: Int = 2147483647

In fact, this is such a common pattern that PartialFunction has an applyOrElse method. It takes an input and a default function that is executed if the partial function is not defined for the given input:

val default = Function.const(Int.MinValue) _  // lifted
val fine = divide.applyOrElse((12, 4), default)
// fine: Int = 3
val meh = divide.applyOrElse((12, 0), default)
// meh: Int = -2147483648

Now just because a partial function has a limited domain doesn’t mean that Scala prevents you from calling it on inputs that are outside its domain:

val quotient = divide(12, 0)
// java.lang.ArithmeticException: / by zero

Therefore, remember to check the domain of a partial function before applying it to a given input. A responsibly crafted API that accepts partial functions from you will verify that an input is in the partial function’s domain before applying it.
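One way to make that check hard to forget is PartialFunction's lift method, which turns the partial function into an ordinary function that returns an Option. The sketch below repeats the divide definition from above so that it runs on its own; safeDivide, defined, and undefined are names made up for this illustration.

```scala
// Same divide as above, repeated so the snippet is self-contained.
val divide = new PartialFunction[(Int, Int), Int] {
  override def isDefinedAt(x: (Int, Int)) = x._2 != 0
  override def apply(x: (Int, Int)) = x._1 / x._2
}

// lift consults the domain for us and wraps the result in an Option.
// safeDivide is a hypothetical name used only for this example.
val safeDivide: ((Int, Int)) => Option[Int] = divide.lift

val defined = safeDivide((12, 4))
// defined: Option[Int] = Some(3)

val undefined = safeDivide((12, 0))
// undefined: Option[Int] = None
```

Whether an Option or an applyOrElse default reads better probably depends on what the caller wants to do with the missing case.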

You may be thinking, “That’s great, but it’s got a lot of boilerplate.” That’s true. Scala is nice enough to let you use pattern matching syntax to define a partial function in a terser fashion:

val divide: PartialFunction[(Int,Int), Int] = {
  case (num, den) if den != 0 => num / den 
}

val quotient = divide(12, 4)
// quotient: Int = 3
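
With the pattern matching syntax, Scala derives isDefinedAt from the case patterns and their guards, so the domain check still works without any extra code. One difference worth knowing: applying this version outside its domain fails with a scala.MatchError rather than the ArithmeticException shown earlier. A small sketch, repeating divide so that it stands alone (ok, notOk, and failure are illustrative names):

```scala
import scala.util.Try

val divide: PartialFunction[(Int, Int), Int] = {
  case (num, den) if den != 0 => num / den
}

// isDefinedAt is generated from the pattern and its guard.
val ok = divide.isDefinedAt((12, 4))
// ok: Boolean = true
val notOk = divide.isDefinedAt((12, 0))
// notOk: Boolean = false

// Outside the domain no case matches, so apply throws a MatchError.
// Try captures the exception so that the snippet keeps running.
val failure = Try(divide((12, 0)))
// failure is a Failure wrapping a scala.MatchError
```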

Finally, perhaps a single partial function is not defined for the entire set of possible inputs, but you can use multiple partial functions that together cover the entire input range. It’s a contrived example, but you can take one partial function that is defined for even integers and another one that is defined for odds and then compose them together with the orElse method to get a partial function that does cover the entire set of possible inputs:

val square: PartialFunction[Int,Int] = {
  case x if x % 2 == 0 => x * x
}
val cube: PartialFunction[Int,Int] = {
  case x if x % 2 != 0 => x * x * x  // != 0 also covers negative odds (-3 % 2 == -1)
}
val transform = square orElse cube

val squared = transform(4)
// squared: Int = 16

val cubed = transform(3)
// cubed: Int = 27
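
When the domains overlap, orElse tries the left-hand partial function first and only falls back to the right-hand one if the left is not defined for the input. The composition is itself still partial: it is defined wherever either side is. Here is a minimal sketch with two made-up partial functions, small and even, that are not part of the example above:

```scala
// Hypothetical partial functions with overlapping domains.
val small: PartialFunction[Int, String] = {
  case x if x < 10 => "small"
}
val even: PartialFunction[Int, String] = {
  case x if x % 2 == 0 => "even"
}

val smallFirst = small orElse even
val evenFirst = even orElse small

// 4 is in both domains, so the left-hand side wins.
val a = smallFirst(4)
// a: String = small
val b = evenFirst(4)
// b: String = even

// 12 is only in even's domain, so smallFirst falls through to it.
val c = smallFirst(12)
// c: String = even

// 11 is in neither domain, so the composition is not defined there.
val d = smallFirst.isDefinedAt(11)
// d: Boolean = false
```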

Collect: Filter and Map in One

Whereas Stream.filter takes a predicate—a function that takes a value and returns a Boolean—Stream.collect takes—you guessed it—a partial function. Stream.collect checks each element of the stream to see whether it is in the partial function’s domain. If the partial function is not defined for the input element, then Stream.collect discards it. If the input is within the partial function’s domain, then Stream.collect applies the partial function to the input element and returns the result as the next element in the output sequence.

val squaredEvens = (4 to 7).toStream.collect {
  case n if n % 2 == 0 => n * n
}
// squaredEvens: Stream[Int] = Stream(16, ?)
// squaredEvens.toList: List[Int] = List(16, 36)

The following graphic illustrates what is going on in the code above:

[Figure: Collecting Items from a Stream. Stream.collect takes a partial function and transforms the inputs for which it is defined. The resulting stream retains only the transformed values; Stream.collect filters out any values not in the partial function's domain.]
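
To make the "filter and map in one" reading concrete, the sketch below spells out roughly what collect does in terms of isDefinedAt and apply. It is an illustration of the behavior, not how the library implements it; squareEvens, viaCollect, and viaFilterMap are names introduced here.

```scala
// The same partial function as above, given a name so it can be reused.
val squareEvens: PartialFunction[Int, Int] = {
  case n if n % 2 == 0 => n * n
}

val viaCollect = (4 to 7).toStream.collect(squareEvens)

// Roughly the same thing as a filter on the domain followed by a map.
val viaFilterMap = (4 to 7).toStream
  .filter(squareEvens.isDefinedAt)
  .map(squareEvens)

val collected = viaCollect.toList
// collected: List[Int] = List(16, 36)
val filteredThenMapped = viaFilterMap.toList
// filteredThenMapped: List[Int] = List(16, 36)
```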

OK, so Stream.collect performs a filter and a map all in one. Why not just call Stream.filter and then Stream.map? One example I’ve seen is when you’re pattern matching and destructuring but using only one or a few of the potential match cases. Perhaps you have a trait and some case classes representing orders that were either fulfilled or cancelled before fulfillment:

trait Order
case class Fulfilled(id: String, total: BigDecimal) extends Order
case class Cancelled(id: String, total: BigDecimal) extends Order

You’d like to know how many dollars you “lost” in cancelled orders. Use Stream.collect to extract the dollar value of each cancelled order, and then sum them:

val orders = Stream(
  Fulfilled("fef3356074b4", BigDecimal("28.50")),
  Fulfilled("2605c9988f1d", BigDecimal("88.25")),
  Cancelled("94edac47971f", BigDecimal("22.01")),
  Fulfilled("2a1ff57b8f46", BigDecimal("39.30")),
  Fulfilled("9ee0a3e3da3a", BigDecimal("27.97")),
  Cancelled("db5dc439ad93", BigDecimal("99.49")),
  Fulfilled("08d58811ed36", BigDecimal("53.72")),
  Cancelled("63ebd07475ca", BigDecimal("93.66")),
  Cancelled("12d16ae9c112", BigDecimal( "7.79")),
  Fulfilled("c5ecedaedb0e", BigDecimal("87.21")) )

val cancelledDollars = orders.collect {
  case Cancelled(_, dollars) => dollars
}.sum
// cancelledDollars: BigDecimal = 222.95
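
For comparison, here is a sketch of what the filter-then-map version might look like on a shortened copy of the list. The types are redeclared (with sealed added as an extra assumption) so that the snippet runs on its own, and viaFilterMap and viaCollect are illustrative names.

```scala
sealed trait Order
case class Fulfilled(id: String, total: BigDecimal) extends Order
case class Cancelled(id: String, total: BigDecimal) extends Order

// A shortened copy of the orders above.
val orders: Stream[Order] = Stream(
  Fulfilled("fef3356074b4", BigDecimal("28.50")),
  Cancelled("94edac47971f", BigDecimal("22.01")),
  Cancelled("12d16ae9c112", BigDecimal("7.79")))

// filter + map: the type test and the cast state the same fact twice.
val viaFilterMap = orders
  .filter(_.isInstanceOf[Cancelled])
  .map(_.asInstanceOf[Cancelled].total)
  .sum

// collect: a single pattern both selects and destructures.
val viaCollect = orders.collect {
  case Cancelled(_, dollars) => dollars
}.sum
// viaFilterMap and viaCollect: BigDecimal = 29.80
```

With filter and map, the filter's type test is lost by the time map runs, so the cast is needed to get at total. With collect, the pattern carries that knowledge through.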
