
## Scala Saturday – Stream.collect

Filtering over a sequence of values omits values that do not meet certain criteria. Mapping over a sequence of values transforms each value into another value. What if you could do both at the same time—filter out unwanted values, but transform the ones that are left? You can with `Stream.collect`. But first, you need to know about partial functions.

### Partial Functions

A partial function is a function that has a limited domain, i.e., is not defined for every possible value of its input type, but only a subset.

The classic example is division. Division is undefined for a divisor of zero. In other words, m ÷ n is valid unless n = 0. So then, division is not defined for every number n. In this particular example, that’s not a big limitation on the domain, but it is nevertheless a limitation that prevents us from saying that division is defined for every possible n.

Scala has a `PartialFunction` type that allows you to represent a function that is only valid for a limited domain. Here is how you could represent integer division:

```scala
val divide = new PartialFunction[(Int, Int), Int] {
  override def isDefinedAt(x: (Int, Int)) = x._2 != 0
  override def apply(x: (Int, Int)) = x._1 / x._2
}

val quotient = divide(12, 4)
// quotient: Int = 3
```

Partial functions have the `apply` method that other functions have so that you can execute them with parentheses: `divide(12, 3)`. They also have an `isDefinedAt` method so that you can ask the partial function, “Hey, can you handle this input?” That way, you can use an `if-else` expression to return a default or some other value:

```scala
val fine = if (divide.isDefinedAt(12, 4)) {
  divide(12, 4)
} else Int.MaxValue
// fine: Int = 3

val meh = if (divide.isDefinedAt(12, 0)) {
  divide(12, 0)
} else Int.MaxValue
// meh: Int = 2147483647
```

In fact, this is such a common pattern that `PartialFunction` has an `applyOrElse` method, which takes an input and a default function that is executed if the partial function is not defined for the given input:

```scala
val default = Function.const(Int.MinValue) _  // lifted

val fine = divide.applyOrElse((12, 4), default)
// fine: Int = 3

val meh = divide.applyOrElse((12, 0), default)
// meh: Int = -2147483648
```

Now, just because a partial function has a limited domain doesn’t mean that Scala prevents you from calling it on inputs outside that domain:

```scala
val quotient = divide(12, 0)
// java.lang.ArithmeticException: / by zero
```

Therefore, remember to check the domain of a partial function before applying it to a given input. A responsibly crafted API that accepts partial functions from you will verify that an input is in the partial function’s domain before applying it.
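This check-then-apply dance is common enough that `PartialFunction` also provides `lift`, which converts a partial function into a total function returning an `Option`. A quick sketch, reusing the `divide` definition from above:

```scala
// `lift` turns a PartialFunction[A, B] into an A => Option[B]:
// out-of-domain inputs yield None instead of throwing.
val divide = new PartialFunction[(Int, Int), Int] {
  override def isDefinedAt(x: (Int, Int)) = x._2 != 0
  override def apply(x: (Int, Int)) = x._1 / x._2
}
val safeDivide = divide.lift

val some = safeDivide((12, 4))
// some: Option[Int] = Some(3)
val none = safeDivide((12, 0))
// none: Option[Int] = None
```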

You may be thinking, “That’s great, but it’s got a lot of boilerplate.” That’s true. Scala is nice enough to let you use pattern matching syntax to define a partial function in a terser fashion:

```scala
val divide: PartialFunction[(Int, Int), Int] = {
  case (num, den) if den != 0 => num / den
}

val quotient = divide(12, 4)
// quotient: Int = 3
```

Finally, perhaps no single partial function is defined for the entire set of possible inputs, but multiple partial functions together cover the entire input range. It’s a contrived example, but you can take one partial function that is defined for even integers and another that is defined for odds, and then compose them with the `orElse` method to get a partial function that does cover the entire set of possible inputs:

```scala
val square: PartialFunction[Int, Int] = {
  case x if x % 2 == 0 => x * x
}
val cube: PartialFunction[Int, Int] = {
  case x if x % 2 != 0 => x * x * x
}
val transform = square orElse cube

val squared = transform(4)
// squared: Int = 16

val cubed = transform(3)
// cubed: Int = 27
```

### Collect: Filter and Map in One

Whereas `Stream.filter` takes a predicate—a function that takes a value and returns a Boolean—`Stream.collect` takes—you guessed it—a partial function. `Stream.collect` checks each element of the stream to see whether it is in the partial function’s domain. If the partial function is not defined for the input element, then `Stream.collect` discards it. If the input is within the partial function’s domain, then `Stream.collect` applies the partial function to the input element and returns the result as the next element in the output sequence.

```scala
val squaredEvens = (4 to 7).toStream.collect {
  case n if n % 2 == 0 => n * n
}
// squaredEvens: Stream[Int] = Stream(16, 36)
```


OK, so `Stream.collect` performs a filter and a map all in one. Why not just call `Stream.filter` and then `Stream.map`? One example I’ve seen is when you’re pattern matching and destructuring, and you only care about some of the potential match cases. Perhaps you have a trait and some case classes representing orders that were either fulfilled or cancelled before fulfillment:

```scala
trait Order
case class Fulfilled(id: String, total: BigDecimal) extends Order
case class Cancelled(id: String, total: BigDecimal) extends Order
```

You’d like to know how many dollars you “lost” in cancelled orders. Use `Stream.collect` to extract the dollar value of each cancelled order, and then sum them:

```scala
val orders = Stream(
  Fulfilled("fef3356074b4", BigDecimal("28.50")),
  Fulfilled("2605c9988f1d", BigDecimal("88.25")),
  Cancelled("94edac47971f", BigDecimal("22.01")),
  Fulfilled("2a1ff57b8f46", BigDecimal("39.30")),
  Fulfilled("9ee0a3e3da3a", BigDecimal("27.97")),
  Fulfilled("08d58811ed36", BigDecimal("53.72")),
  Cancelled("63ebd07475ca", BigDecimal("93.66")),
  Cancelled("12d16ae9c112", BigDecimal( "7.79")),
  Fulfilled("c5ecedaedb0e", BigDecimal("87.21"))
)

val cancelledDollars = orders.collect {
  case Cancelled(_, dollars) => dollars
}.sum
// cancelledDollars: BigDecimal = 222.95
```

## F# Friday – Seq.choose

Filtering over a sequence of values omits values that do not meet certain criteria. Mapping over a sequence of values transforms each value into another value. What if you could do both at the same time—filter out unwanted values, but transform the ones that are left? You can with `Seq.choose`.

Whereas `Seq.filter` takes a predicate—a function that takes a value and returns a Boolean—`Seq.choose` takes a function that takes a value and returns an `Option`. If that `Option` is `None`, then `Seq.choose` discards the element. If it is `Some`, then `Seq.choose` extracts the value from the `Some` and returns it as the next element in the output sequence.

```fsharp
let f = fun n ->
    match n % 2 with
    | 0 -> Some (n * n)
    | _ -> None

let squaredEvens =
    seq [4..7]
    |> Seq.choose f
// val squaredEvens : seq<int> = seq [16; 36]
```


OK, so `Seq.choose` performs a filter and a map all in one. Why not just call `Seq.filter` and then `Seq.map`? One example I’ve seen is when you’re pattern matching and destructuring, and you only care about some of the potential match cases. Perhaps you have a discriminated union representing orders that were either fulfilled or cancelled before fulfillment:

```fsharp
type Order =
    | Fulfilled of id : string * total : decimal
    | Cancelled of id : string * total : decimal
```

You’d like to know how many dollars you “lost” in cancelled orders. Use `Seq.choose` to extract the dollar value of each cancelled order, and then sum them:

```fsharp
let orders = [
    Fulfilled ("fef3356074b4", 28.50m)
    Fulfilled ("2605c9988f1d", 88.25m)
    Cancelled ("94edac47971f", 22.01m)
    Fulfilled ("2a1ff57b8f46", 39.30m)
    Fulfilled ("9ee0a3e3da3a", 27.97m)
    Fulfilled ("08d58811ed36", 53.72m)
    Cancelled ("63ebd07475ca", 93.66m)
    Cancelled ("12d16ae9c112",  7.79m)
    Fulfilled ("c5ecedaedb0e", 87.21m)
]

let cancelledDollars =
    orders
    |> Seq.choose (function
        | Cancelled (_, dollars) -> Some dollars
        | _ -> None)
    |> Seq.sum
// val cancelledDollars : decimal = 222.95M
```

## Scala Saturday – The Stream.grouped Method

Another method `Stream` offers is `Stream.grouped`, which divides a stream’s elements into groups of a given size.

To take an example, if you have a stream of twelve elements and call `Stream.grouped` to turn it into groups of three, you’ll get an iterator over four sequences, each three elements in size:

```scala
val xs = (1 to 12).toStream
val grouped = xs.grouped(3)
// grouped: Iterator[Stream[Int]] =
//   Iterator(
//     Stream(1, 2, 3), Stream(4, 5, 6),
//     Stream(7, 8, 9), Stream(10, 11, 12))
```

What happens if you use a group size that does not divide evenly into the size of your input stream? No sweat! The last group just contains any remaining elements, however many they may be:

```scala
val xs = (1 to 10).toStream
val grouped = xs.grouped(3)
// grouped: Iterator[Stream[Int]] =
//   Iterator(
//     Stream(1, 2, 3), Stream(4, 5, 6),
//     Stream(7, 8, 9), Stream(10))
```

Where is this useful? Well, you can take my paging example from my Scala Saturday post on `Stream.drop` and make it slightly clearer without the `(page - 1) * perPage` arithmetic:

```scala
case class Book(title: String, author: String)

val books = Stream(
  Book("Wuthering Heights", "Emily Bronte"),
  Book("Jane Eyre", "Charlotte Bronte"),
  Book("Agnes Grey", "Anne Bronte"),
  Book("The Scarlet Letter", "Nathaniel Hawthorne"),
  Book("Silas Marner", "George Eliot"),
  Book("1984", "George Orwell"),
  Book("Billy Budd", "Herman Melville"),
  Book("Moby Dick", "Herman Melville"),
  Book("The Great Gatsby", "F. Scott Fitzgerald"),
  Book("Tom Sawyer", "Mark Twain")
)

val perPage = 3
val page = 3
val records = books.grouped(perPage)
  .drop(page - 1)
  .next
// records: scala.collection.immutable.Stream[Book] =
//   Stream(Book(Billy Budd,Herman Melville),
//     Book(Moby Dick,Herman Melville),
//     Book(The Great Gatsby,F. Scott Fitzgerald))
```

This time, instead of having to calculate the number of elements to skip in order to skip n pages, you first use `Stream.grouped` to turn the stream into a paged recordset; each “page” is n records long. Then drop `page - 1` pages in order to get to the page of records you want. Finally, calling `Iterator.next` is necessary because, remember, `Stream.grouped` turns a flat stream into an iterator of streams.

I will admit that I find it irritating that `Stream.grouped` returns something that does not have a `head` method. Calling `Iterator.next`, while just as easy, is inconsistent with collection semantics. It seems to me that `Stream.grouped` ought to return a collection rather than an iterator. Perhaps there was once a reason for returning an iterator instead of a collection, but it would be nice if we could fix that.
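One workaround, if you want collection semantics back, is simply to materialize the iterator, say with `toList`. This is a sketch of one option, not the only one:

```scala
// Materializing the iterator restores head and the rest of the
// collection API, at the cost of building every page up front.
val pages = (1 to 10).toStream.grouped(3).toList
// pages: List[Stream[Int]], four pages in all

val firstPage = pages.head
// firstPage holds the elements 1, 2, 3
```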


## F# Friday – The Seq.chunkBySize Function

Another module function new with F# 4.0 is `Seq.chunkBySize` (so new, in fact, that there is not even a hint of it on MSDN as of this writing, and hence the GitHub link). `Seq.chunkBySize` groups a sequence’s elements into arrays (chunks) of a given size.

To take an example, if you have a sequence of twelve elements and call `Seq.chunkBySize` to turn it into groups of three, you’ll get a sequence of four arrays, each three elements in size:

```fsharp
let xs = seq [1..12]
let chunked = xs |> Seq.chunkBySize 3
// val chunked : seq<int []> =
//   seq [[|1; 2; 3|]; [|4; 5; 6|];
//        [|7; 8; 9|]; [|10; 11; 12|]]
```

What happens if you use a chunk size that does not divide evenly into the size of your input sequence? No sweat! The last array just contains any remaining elements, however many they may be:

```fsharp
let xs = seq [1..10]
let chunked = xs |> Seq.chunkBySize 3
// val chunked : seq<int []> =
//   seq [[|1; 2; 3|]; [|4; 5; 6|];
//        [|7; 8; 9|]; [|10|]]
```

Where is this useful? Well, you can take my paging example from my F# Friday post on `Seq.skip` and make it slightly clearer without the `(page - 1) * perPage` arithmetic:

```fsharp
type Book =
    { Title : string
      Author : string }

let books =
    seq [
        { Title = "Wuthering Heights"
          Author = "Emily Bronte" }
        { Title = "Jane Eyre"
          Author = "Charlotte Bronte" }
        { Title = "Agnes Grey"
          Author = "Anne Bronte" }
        { Title = "The Scarlet Letter"
          Author = "Nathaniel Hawthorne" }
        { Title = "Silas Marner"
          Author = "George Eliot" }
        { Title = "1984"
          Author = "George Orwell" }
        { Title = "Billy Budd"
          Author = "Herman Melville" }
        { Title = "Moby Dick"
          Author = "Herman Melville" }
        { Title = "The Great Gatsby"
          Author = "F. Scott Fitzgerald" }
        { Title = "Tom Sawyer"
          Author = "Mark Twain" }
    ]

let perPage = 3
let page = 3
let records =
    books
    |> Seq.chunkBySize perPage
    |> Seq.skip (page - 1)
    |> Seq.head

// val records : Book [] =
//   [|{Title = "Billy Budd";
//      Author = "Herman Melville";};
//     {Title = "Moby Dick";
//      Author = "Herman Melville";};
//     {Title = "The Great Gatsby";
//      Author = "F. Scott Fitzgerald";}|]
```

This time, instead of having to calculate the number of elements to skip in order to skip n pages, you first use `Seq.chunkBySize` to turn the sequence into a paged recordset; each “page” is n records long. Then skip `page - 1` pages in order to get to the page of records you want. Finally, calling `Seq.head` is necessary because, remember, `Seq.chunkBySize` turns a flat sequence into a sequence of arrays.

One final note: the `Array` and `List` modules also contain a `chunkBySize` function in F# 4.0.


## Scala Saturday – The Stream.distinct Method

Scala Saturday today is short and sweet: `Stream.distinct`. `Stream.distinct` removes any duplicate members of a stream, leaving only unique values.

One way to remove duplicates is to turn your stream into a set with `Stream.toSet`:

```scala
val noDupes = Stream(3,5,6,3,3,7,1,1,7,3,2,7).toSet
// noDupes: scala.collection.immutable.Set[Int] =
//   Set(5, 1, 6, 2, 7, 3)
```

That’s fine if you don’t care about preserving the order of the items in the input stream.

But if you do want to preserve the order, `Stream.distinct` is the ticket:

```scala
val noDupesOrdered =
  Stream(3,5,6,3,3,7,1,1,7,3,2,7).distinct
// noDupesOrdered: scala.collection.immutable.Stream[Int] =
//   Stream(3, 5, 6, 7, 1, 2)
```

## F# Friday – The Seq.distinct Function

Today’s F# Friday is a simple one: `Seq.distinct`. `Seq.distinct` removes any duplicate members of a sequence, leaving only unique values.

One way to remove duplicates is to turn your sequence into a set with `Set.ofSeq`:

```fsharp
let noDupes =
    seq [3;5;6;3;3;7;1;1;7;3;2;7]
    |> Set.ofSeq
// val noDupes : Set<int> =
//   set [1; 2; 3; 5; 6; 7]
```

That’s fine if you don’t care about preserving the order of the items in the input sequence.

But if you do want to preserve the order, `Seq.distinct` is the ticket:

```fsharp
let noDupesOrdered =
    seq [3;5;6;3;3;7;1;1;7;3;2;7]
    |> Seq.distinct
// val noDupesOrdered : seq<int> =
//   seq [3; 5; 6; 7; 1; 2]
```

## Scala Saturday – The Stream.dropWhile Method

Just as the analog to `Stream.take` is `Stream.drop`, the analog to `Stream.takeWhile` is `Stream.dropWhile`. That is, use it when you don’t care so much about dropping a certain number of items, but rather a certain kind of items.

`Stream.dropWhile` starts at the beginning of the stream and applies a predicate to each item, one by one. It does not start returning items in a new stream until it reaches an item that does not meet the predicate. At that point it stops checking elements against the predicate and returns every item in the stream from there on.

Assume you have the same temperature sensor as the one in my post on `Stream.takeWhile`. This time, instead of once per minute, assume that it feeds you temperature readings once per second. Add to that the idea that the sensor has a few seconds of boot-up time in which it sends you -1000.0—the indication that the current reading is invalid—until it has fully booted and can start sending good temperature data.

```scala
import java.time.{LocalDateTime, Month}

case class Reading(temperature: Double, timestamp: LocalDateTime)

val readings = Stream(
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 0)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 1)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 2)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 3)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 4)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 5)),
  Reading(90.1, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 6)),
  Reading(90.2, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 7)),
  Reading(90.2, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 8)),
  Reading(90.3, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 9)),
  Reading(90.2, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 10))
)
```

To drop all readings until the thermometer starts returning valid data, use `Stream.dropWhile`:

```scala
val valid = readings dropWhile (_.temperature == -1000.0)
// valid now begins with the first good reading:
// Reading(90.1, 2015-07-19T10:00:06)
```

Finally, like `Stream.takeWhile`, `Stream.dropWhile` doesn’t balk if it never reaches an element that fails to meet the predicate. You just get an empty stream:

```scala
val none = Stream(1,3,4,7) dropWhile { _ < 10 }
// none: scala.collection.immutable.Stream[Int] = Stream()
```

## F# Friday – The Seq.skipWhile Function

Just as the analog to `Seq.take` is `Seq.skip`, the analog to `Seq.takeWhile` is `Seq.skipWhile`. That is, use it when you don’t care so much about skipping a certain number of items, but rather a certain kind of items.

`Seq.skipWhile` starts at the beginning of the sequence and applies a predicate to each item, one by one. It does not start returning items in a new sequence until it reaches an item that does not meet the predicate. At that point it stops checking elements against the predicate and returns every item in the sequence from there on.

Assume you have the same temperature sensor as the one in my post on `Seq.takeWhile`. This time, instead of once per minute, assume that it feeds you temperature readings once per second. Add to that the idea that the sensor has a few seconds of boot-up time in which it sends you -1000.0—the indication that the current reading is invalid—until it has fully booted and can start sending good temperature data.

```fsharp
open System

type Reading = {
    Temperature : float
    Timestamp : DateTime
}

let readings =
    seq [
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 0, 0) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 0, 1) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 0, 2) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 0, 3) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 0, 4) }
        { Temperature = -1000.0
          Timestamp = DateTime(2015, 07, 19, 10, 0, 5) }
        { Temperature = 90.1
          Timestamp = DateTime(2015, 07, 19, 10, 0, 6) }
        { Temperature = 90.2
          Timestamp = DateTime(2015, 07, 19, 10, 0, 7) }
        { Temperature = 90.2
          Timestamp = DateTime(2015, 07, 19, 10, 0, 8) }
        { Temperature = 90.1
          Timestamp = DateTime(2015, 07, 19, 10, 0, 9) }
        { Temperature = 90.3
          Timestamp = DateTime(2015, 07, 19, 10, 0, 10) }
    ]
```

To skip all readings until the thermometer starts returning valid data, use `Seq.skipWhile`:

```fsharp
let valid =
    readings
    |> Seq.skipWhile (fun r -> r.Temperature = -1000.0)

//  [{Temperature = 90.1;
//    Timestamp = 7/19/2015 10:00:06 AM};
//   {Temperature = 90.2;
//    Timestamp = 7/19/2015 10:00:07 AM};
//   {Temperature = 90.2;
//    Timestamp = 7/19/2015 10:00:08 AM};
//   {Temperature = 90.1;
//    Timestamp = 7/19/2015 10:00:09 AM};
//   {Temperature = 90.3;
//    Timestamp = 7/19/2015 10:00:10 AM}]
```

Finally, even though `Seq.skip` throws an exception if you ask it to skip more elements than the sequence contains, `Seq.skipWhile` does not balk if it never reaches an element that fails to meet the predicate. You just get an empty sequence:

```fsharp
let none =
    seq [1;3;4;7]
    |> Seq.skipWhile (fun n -> n < 10)
// val none : seq<int> = seq []
```

As of F# 4.0, there are versions of `skipWhile` in the `Array` and `List` modules, but as of this writing, the documentation at MSDN does not yet include them.


## Scala Saturday – The Stream.drop Method

The opposite of `Stream.take` is `Stream.drop`. `Stream.drop`, as the name suggests, skips the first n items of the sequence and returns a new sequence that starts with element n + 1:

```scala
val xs = (1 to 10).toStream
val dropped5 = xs drop 5
// dropped5: scala.collection.immutable.Stream[Int] =
//   Stream(6, 7, 8, 9, 10)
```

One of the most obvious applications of `Stream.drop` is to pair it with `Stream.take` to page through a set of records. Perhaps you have a list of books:

```scala
case class Book(title: String, author: String)

val books = Stream(
  Book("Wuthering Heights", "Emily Bronte"),
  Book("Jane Eyre", "Charlotte Bronte"),
  Book("Agnes Grey", "Anne Bronte"),
  Book("The Scarlet Letter", "Nathaniel Hawthorne"),
  Book("Silas Marner", "George Eliot"),
  Book("1984", "George Orwell"),
  Book("Billy Budd", "Herman Melville"),
  Book("Moby Dick", "Herman Melville"),
  Book("The Great Gatsby", "F. Scott Fitzgerald"),
  Book("Tom Sawyer", "Mark Twain")
)
```

If each page shows three books, and the user wants to see the records on page three, skip the first two pages’ worth of records, and take the next three records:

```scala
val perPage = 3
val page = 3
val records = books.drop((page - 1) * perPage)
  .take(perPage)
// records: scala.collection.immutable.Stream[Book] =
//   Stream(Book(Billy Budd,Herman Melville),
//     Book(Moby Dick,Herman Melville),
//     Book(The Great Gatsby,F. Scott Fitzgerald))
```

Fortunately, like `Stream.take`, `Stream.drop` doesn’t balk if you ask it to drop more elements than the stream contains; you simply get an empty stream:

```scala
val empty = (1 to 5).toStream drop 6
// empty: scala.collection.immutable.Stream[Int] =
//   Stream()
```

## F# Friday – The Seq.skip Function

The opposite of `Seq.take` is `Seq.skip`. `Seq.skip`, as the name suggests, skips the first n items of the sequence and returns a new sequence that starts with element n + 1:

```fsharp
let xs = seq [1..10]
let skipped5 = xs |> Seq.skip 5
// val skipped5 : seq<int> =
//   [6; 7; 8; 9; 10]
```

One of the most obvious applications of `Seq.skip` is to pair it with `Seq.take` or `Seq.truncate` to page through a set of records. Perhaps you have a list of books:

```fsharp
type Book =
    { Title : string
      Author : string }

let books =
    seq [
        { Title = "Wuthering Heights"
          Author = "Emily Bronte" }
        { Title = "Jane Eyre"
          Author = "Charlotte Bronte" }
        { Title = "Agnes Grey"
          Author = "Anne Bronte" }
        { Title = "The Scarlet Letter"
          Author = "Nathaniel Hawthorne" }
        { Title = "Silas Marner"
          Author = "George Eliot" }
        { Title = "1984"
          Author = "George Orwell" }
        { Title = "Billy Budd"
          Author = "Herman Melville" }
        { Title = "Moby Dick"
          Author = "Herman Melville" }
        { Title = "The Great Gatsby"
          Author = "F. Scott Fitzgerald" }
        { Title = "Tom Sawyer"
          Author = "Mark Twain" }
    ]
```

If each page shows three books, and the user wants to see the records on page three, skip the first two pages’ worth of records, and take the next three records:

```fsharp
let perPage = 3
let page = 3
let records =
    books
    |> Seq.skip ((page - 1) * perPage)
    |> Seq.take perPage

// val records : seq<Book> =
//   [{Title = "Billy Budd";
//     Author = "Herman Melville";};
//    {Title = "Moby Dick";
//     Author = "Herman Melville";};
//    {Title = "The Great Gatsby";
//     Author = "F. Scott Fitzgerald";}]
```

Unfortunately, like `Seq.take`, `Seq.skip` throws an exception if you ask it to skip more elements than the sequence contains:

```fsharp
let oops =
    seq [1..5]
    |> Seq.skip 6
    |> printfn "%A"
// System.InvalidOperationException:
//   The input sequence has an insufficient
//   number of elements.
```