
## Scala Saturday – Vector.splitAt

`Vector.splitAt` splits a vector into two parts at the index you specify. The first part ends just before the element at the given index; the second part starts with the element at the given index. Some examples will help:

### Some Examples

The most obvious use case is bisecting a vector, i.e., splitting it into two (nearly) equal parts. To do that, halve the length, and use the result as your splitting index:

```scala
val xs = Vector(1,2,3,4,5,6)
val mid = xs.length / 2 // 3
val (left, right) = xs splitAt mid
// left: Vector[Int] = Vector(1, 2, 3)
// right: Vector[Int] = Vector(4, 5, 6)
```

But what happens if your vector contains an odd number of elements? No sweat! Because of the way integer division works, the quotient length ÷ 2 is truncated. In other words, when the length is odd, the `left` vector will have one less element than the `right` vector:

```scala
val xs = Vector(1,2,3,4,5)
val mid = xs.length / 2 // 2
val (left, right) = xs splitAt mid
// left: Vector[Int] = Vector(1, 2)
// right: Vector[Int] = Vector(3, 4, 5)
```

Of course, you don’t have to cut the thing in half. You can split it anywhere:

```scala
val xs = Vector(1,2,3,4,5)
val (left, right) = xs splitAt 1
// left: Vector[Int] = Vector(1)
// right: Vector[Int] = Vector(2, 3, 4, 5)
```

Or …

```scala
val xs = Vector(1,2,3,4,5)
val (left, right) = xs splitAt 4
// left: Vector[Int] = Vector(1, 2, 3, 4)
// right: Vector[Int] = Vector(5)
```

What happens if you split at index 0?

```scala
val xs = Vector(1,2,3,4,5)
val (left, right) = xs splitAt 0
// left: Vector[Int] = Vector()
// right: Vector[Int] = Vector(1, 2, 3, 4, 5)
```

Ah, the `left` vector is empty! That’s because `Vector.splitAt` starts the `right` vector at the given index. There’s nothing before the given index, so the only thing left to return for `left` is an empty vector.

The reverse happens if you split at the length: The `right` vector is empty while the `left` vector contains the entire input vector:

```scala
val xs = Vector(1,2,3,4,5)
val (left, right) = xs splitAt 5
// left: Vector[Int] = Vector(1, 2, 3, 4, 5)
// right: Vector[Int] = Vector()
```

But what happens if you try to split at an index greater than the length?

```scala
val xs = Vector(1,2,3,4,5)
val (left, right) = xs splitAt 6
// left: Vector[Int] = Vector(1, 2, 3, 4, 5)
// right: Vector[Int] = Vector()
```

Well, that’s interesting. So then, if you split at any index greater than the length of the input vector, the `left` vector contains the entire input vector while the `right` vector is empty.

You get the reverse if you split on a negative number:

```scala
val xs = Vector(1,2,3,4,5)
val (left, right) = xs splitAt -1
// left: Vector[Int] = Vector()
// right: Vector[Int] = Vector(1, 2, 3, 4, 5)
```
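Incidentally, `splitAt n` produces the same pair as calling `take n` and `drop n` with the same index, in range or not. A quick sketch of that identity (variable names are mine):

```scala
val xs = Vector(1, 2, 3, 4, 5)

// splitAt(n) yields the same pair as (take(n), drop(n)),
// even for out-of-range or negative indices.
val viaSplit = xs.splitAt(2)
val viaTakeDrop = (xs.take(2), xs.drop(2))
// both: (Vector(1, 2), Vector(3, 4, 5))
```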

### Merge Sort with Vector.splitAt

You can use `Vector.splitAt` in performing a merge sort. `Vector.splitAt` is, admittedly, a pretty small piece of the puzzle. Merge sort is a divide-and-conquer algorithm, and `Vector.splitAt` just performs the divide part. Nevertheless it comes in handy for that part.

Speaking of which, start by using `Vector.splitAt` to define a `bisect` function that splits a vector in half:

```scala
def bisect[A](xs: Vector[A]) = {
  val mid = xs.length / 2
  xs splitAt mid
}
```

A merge sort breaks a vector down into single-element (or empty) vectors and then puts them back together, sorting the elements of each block as it combines them. Start with the `merge` function that puts the blocks back together after you’ve broken them down:

```scala
import scala.annotation.tailrec

def merge(left: Vector[Int], right: Vector[Int]) = {
  @tailrec
  def mergeWith(l: Vector[Int], r: Vector[Int], acc: Vector[Int]): Vector[Int] =
    (l.isEmpty, r.isEmpty) match {
      // If either side is empty, just add
      // the non-empty side to the accumulator.
      case (true, _) => acc ++ r
      case (_, true) => acc ++ l

      // Compare the head elements, and add
      // the lesser head value to the
      // accumulator. Then call recursively.
      case _ =>
        val lh = l.head
        val rh = r.head
        val (next, l2, r2) =
          if (lh < rh) (lh, l.tail, r)
          else (rh, l, r.tail)
        mergeWith(l2, r2, acc :+ next)
    }

  mergeWith(left, right, Vector[Int]())
}
```

Now `mergeSort` can use `bisect` and `merge` to break down the vector and then merge it back together:

```scala
def mergeSort(as: Vector[Int]): Vector[Int] =
  as match {
    case Vector()  => as
    case Vector(_) => as
    case _ =>
      val (l, r) = bisect(as)
      merge(mergeSort(l), mergeSort(r))
  }

val xs = Vector(43,48,3,23,28,6,25,43,16)
val sorted: Vector[Int] = mergeSort(xs)
// sorted: Vector[Int] =
//   Vector(3, 6, 16, 23, 25, 28, 43, 43, 48)
```

A quick announcement: I’ve got some instructional material to develop. Unfortunately it’s going to take up a fair amount of my time. This is probably the last Scala Saturday post for a little while, but I hope to pick back up in a month or two.


## Scala Saturday – Stream.groupBy

Sometimes you have a collection of items that you want to group according to some common property or key. `Stream.groupBy` can do that job for you. It takes that collection and returns a map keyed to that grouping key. The value for each key is the sequence of all the items that fall into that group.

That’s a little hard to follow. So what’s it useful for?

Well, maybe you need to group a list of names by initial. No sweat:

```scala
val names = Stream(
  "Rehoboam",
  "Abijah",
  "Asa",
  "Jehoshaphat",
  "Jehoram",
  "Ahaziah")

val groupedByInitial = names.groupBy(_.head)
// groupedByInitial: Map[Char,Stream[String]] =
//   Map(
//     J -> Stream(Jehoshaphat, Jehoram),
//     A -> Stream(Abijah, Asa, Ahaziah),
//     R -> Stream(Rehoboam) )
```

And of course, if you want to sort those groups, convert the resulting map to a stream of tuples, and throw in a call to `Stream.sortBy` to sort by the first element in the tuple:

```scala
val groupedByInitialAndSorted =
  groupedByInitial.toStream.sortBy(_._1)
// val groupedByInitialAndSorted: Stream[(Char, Stream[String])] =
//   Stream(
//     A -> Stream(Abijah, Asa, Ahaziah),
//     J -> Stream(Jehoshaphat, Jehoram),
//     R -> Stream(Rehoboam) )
```

Another example: Maybe you want to group some test scores according to grade. That is, all the students who scored in the 90s are grouped together, then all those scoring in the 80s, and so on:

```scala
case class TestScore(name: String, score: Int)

val grades = Stream(
  TestScore("Anna", 74),
  TestScore("Andy", 76),
  TestScore("Brenda", 70),
  TestScore("Bobby", 90),
  TestScore("Charlotte", 98),
  TestScore("Chuck", 83),
  TestScore("Deborah", 88),
  TestScore("Dan", 66),
  TestScore("Ellie", 80),
  TestScore("Ed", 61),
  TestScore("Frannie", 89),
  TestScore("Frank", 96) )

val grouped = grades.groupBy(_.score / 10 * 10)
// grouped: Map[Int,Stream[TestScore]] =
//   Map(
//     80 -> Stream(
//             TestScore(Chuck,83),
//             TestScore(Deborah,88),
//             TestScore(Ellie,80),
//             TestScore(Frannie,89) ),
//     70 -> Stream(
//             TestScore(Anna,74),
//             TestScore(Andy,76),
//             TestScore(Brenda,70) ),
//     60 -> Stream(
//             TestScore(Dan,66),
//             TestScore(Ed,61) ),
//     90 -> Stream(
//             TestScore(Bobby,90),
//             TestScore(Charlotte,98),
//             TestScore(Frank,96) )
//   )
```

You can take it another couple of steps to produce a histogram by counting the number of students in each group and sorting by the group key (i.e., grade level):

```scala
val histogram = grouped.map {
  case (grade, scores) => grade -> scores.length
}.toStream.sortBy(_._1).reverse
// histogram: Stream[(Int, Int)] =
//   Stream((90,3), (80,4), (70,3), (60,2))
```

One more example, and this one I borrowed from one of Steven Proctor’s Ruby Tuesday posts. You can find anagrams in a list of words by sorting the characters in each word and grouping on that:

```scala
val anagrams = Stream(
  "tar", "rat", "bar",
  "rob", "art", "orb"
).groupBy(_.sorted)
// anagrams: Map[String,Stream[String]] =
//   Map(
//     abr -> Stream(bar),
//     art -> Stream(tar, rat, art),
//     bor -> Stream(rob, orb) )
```
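If you only want the true anagram groups, i.e., those containing more than one word, you can filter the resulting map on group size (a small, assumed extension of the example above):

```scala
val words = Stream(
  "tar", "rat", "bar",
  "rob", "art", "orb")

// Keep only the keys whose group has two or more words;
// "bar" stands alone, so its group drops out.
val anagramGroups = words
  .groupBy(_.sorted)
  .filter { case (_, group) => group.length > 1 }
// remaining keys: "art" and "bor"
```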

## Scala Saturday – Code That Looks Like Math

Something that Scala and most other modern languages allow these days is variable names that contain what you might think of as non-traditional characters from the Unicode character set, e.g., Greek symbols such as π and τ. If you’re a C programmer, you have to settle for spelling out the name of the character:

```c
const double PI = 3.141592654;
double delta = x1 - x2;
```

But that’s OK, right? What’s the difference, really? The value π is one thing: it’s a universally recognized constant. But even with the example `delta` above, don’t you want to name it something more descriptive, like `marginOfError` anyway?

Well, yes, many times instead of using the characters verbatim from your physics textbook …

```scala
val f = m * a
```

… you spell it out so that the code is clearer:

```scala
val force = mass * acceleration
```

Likewise, even though Scala allows you to write the following:

```scala
val ω = 2 * math.Pi * f
```

… it’s probably better practice to write …

```scala
val angularVelocity = 2 * math.Pi * frequency
```

What’s the point of this post then? Sure, you can use “special” characters in variable names, but so far, I’ve discouraged you from doing it!

Nevertheless there are times when it is appropriate. If you are coding up an algorithm that consists of a series of well-known equations from a certain field of study, then the more your code looks like those equations, the easier it is to check it against the literature.

Consider the following—a series of values and equations for converting latitude and longitude to universal polar stereographic (UPS) coordinates, a way of representing coordinates at the earth’s poles:

UPS coordinates consist of a hemisphere—either northern or southern—and two distance components, easting and northing, both in meters:

```scala
object Hemisphere extends Enumeration {
  type Hemisphere = Value

  val Northern = Value('N')
  val Southern = Value('S')

  def fromLatitude(lat: Double): Hemisphere =
    if (lat < 0) Southern else Northern
}

import Hemisphere.Hemisphere

case class UniversalPolarStereographic(
  northing: Double,
  easting: Double,
  hemisphere: Hemisphere)
```

Now compare the code below to the equations from the literature above:

```scala
def latLonToUps(
  lat: Double,
  lon: Double): UniversalPolarStereographic = {

  val hemisphere = Hemisphere.fromLatitude(lat)

  val φ = lat.abs
  val λ = lon

  val π = math.Pi

  val FN = 2000000.0
  val FE = 2000000.0

  val a = 6378137.0
  val f = 1 / 298.257223563

  val e_2 = f * (2 - f)
  val e = math.sqrt(e_2)
  val eOver2 = e / 2
  val Cₒ = ((2 * a) / math.sqrt(1 - e_2)) *
    math.pow((1 - e) / (1 + e), eOver2)
  val kₒ = 0.994
  val πOver4 = π / 4

  val esinφ = e * math.sin(φ)
  val φOver2 = φ / 2

  val tanZOver2 =
    math.pow((1 + esinφ) / (1 - esinφ), eOver2) *
    math.tan(πOver4 - φOver2)
  val R = kₒ * Cₒ * tanZOver2
  val Rcosλ = R * math.cos(λ)
  val Rsinλ = R * math.sin(λ)

  val N = hemisphere match {
    case Hemisphere.Northern => FN - Rcosλ
    case Hemisphere.Southern => FN + Rcosλ
  }
  val E = FE + Rsinλ

  UniversalPolarStereographic(N, E, hemisphere)
}
```

It’s not perfect: you still cannot set numerators above denominators, for instance. But isn’t that easier to compare to the literature than if we had to write out `RsinLambda` or `eSinPhi`?

(Note: UPS coordinates are only valid for latitudes near the poles. For simplicity, the code above does not check to make sure that the input latitude falls within those bounds. I mean, it’s complex enough as it is for the sake of exemplifying the point of this post.)

(Note: I’m aware that some of the characters in the code don’t show up correctly on all browsers, e.g., the subscript “O” and perhaps the φ. I’m working to correct that. Nevertheless you should be able to use such symbols in your source code.)


## Scala Saturday – Array.last and Array.lastOption

Last week, we looked at `List.headOption`. If you don’t need the first item in a list, but rather the last item, the counterpart of `List.head` is `List.last`. Likewise, the counterpart of `List.headOption` is `List.lastOption`.

Recall the code example from last week:

```scala
case class Salesman(name: String, sales: BigDecimal)

def findTopSalesman(salesmen: List[Salesman]) = {
  salesmen.filter { _.sales >= 10000 }
    .sortBy { -_.sales } // descending
    .headOption
}

val sales = List(
  Salesman("Joe Bob", 9500),
  Salesman("Sally Jane", 18500),
  Salesman("Betty Lou", 11800),
  Salesman("Sammy Joe", 6500)
)

val top = findTopSalesman(sales)
// top: Option[Salesman] =
//   Some(Salesman(Sally Jane,18500))
```

The `List.sortBy` call in the listing above sorts the sales records in a descending fashion so that the first record is the top salesman. What if it makes more sense to you to sort the records in an ascending fashion and take the last record? With `List.lastOption`, you can:

```scala
case class Salesman(name: String, sales: BigDecimal)

def findTopSalesman(salesmen: List[Salesman]) = {
  salesmen.filter { _.sales >= 10000 }
    .sortBy { _.sales } // ascending
    .lastOption
}

val sales = List(
  Salesman("Joe Bob", 9500),
  Salesman("Sally Jane", 18500),
  Salesman("Betty Lou", 11800),
  Salesman("Sammy Joe", 6500)
)

val top = findTopSalesman(sales)
// top: Option[Salesman] =
//   Some(Salesman(Sally Jane,18500))
```

This is a trivial example: I mean, you can sort the records however you wish; they’re your records! But what if you receive the recordset from elsewhere—an API that is outside your control, for instance—and it arrives already sorted in ascending order? It is probably better to accept the recordset as is and just take the last item rather than to sort it again. Which brings me to a couple of other points …

This post claims to be about `Array.last` and `Array.lastOption`. Why all the talk about `List.lastOption`?

First, as you have probably noticed already, just about any method available on one sequential collection is available on them all. That is, if there’s a `List.last`, for example, then there’s also a `Seq.last`, an `Array.last`, and a `Stream.last`.
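To see that uniformity in action, here is `lastOption` behaving the same way across several collection types (a quick sketch; the values are arbitrary):

```scala
// The same method, with the same semantics, on different collections
val fromList   = List(1, 2, 3).lastOption   // Some(3)
val fromVector = Vector(1, 2, 3).lastOption // Some(3)
val fromArray  = Array(1, 2, 3).lastOption  // Some(3)
val fromEmpty  = List.empty[Int].lastOption // None
```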

Second, I want to point out a potential pitfall of using `last` and `lastOption`. Both the size and type of the collection can affect the performance of your program or even crash it.

Arrays give you O(1) access to their elements. (In case you’re not familiar with it, that’s called “Big O notation.” It’s a way of expressing how long an algorithm takes to execute.) That is, arrays give you nearly instant access to any element—first, last, somewhere in the middle—doesn’t matter.

Lists and sequences, on the other hand, give you O(n) access to their elements. That is, the more items in the list/sequence, the longer it takes to get to the one you want because the machine always has to start at the first element and iterate through every single one until it gets to the one you want. No big deal if there are only 100 elements, but if there are 10,000,000 elements, fetching the last element will take a while.

Furthermore, streams can be infinite. If you call `Stream.last` or `Stream.lastOption` on an infinite sequence, your program will crash:

```scala
// This sequence starts at one and just keeps going
val ns = Stream.from(1)
val notGood = ns.last
// java.lang.OutOfMemoryError: GC overhead limit exceeded
```

You don’t have to eschew `last` and `lastOption`. Just take into account what kind of collection you’re calling them on. `Array.last` and `Array.lastOption` are perfectly safe. (Well, do remember that `Array.last` throws an exception if the array is empty, but with regard to performance, it’s fine.) But before you call `last` or `lastOption` on a list or a stream, make sure you know how big it is, or you could, as they say, shoot yourself in the foot.
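One defensive pattern when a stream might be unbounded is to cap it with `take` before asking for the last element. You trade the true last element for the last element within a bound of your choosing, but the call is guaranteed to terminate (a sketch; the bound of 1000 is arbitrary):

```scala
// An infinite stream of natural numbers
val ns = Stream.from(1)

// Bounding the stream first makes last/lastOption safe to call
val bounded = ns.take(1000).last
// bounded: Int = 1000

val maybeLast = ns.take(1000).lastOption
// maybeLast: Option[Int] = Some(1000)
```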


## Scala Saturday – List.headOption

Sometimes you need to get the first element of a list. No problem: `List.head` to the rescue, right? But what happens when you call `List.head` on an empty list?

```scala
val xs = List[Int]()
val h = xs.head
// java.util.NoSuchElementException: head of empty list
```

Well that’s not good. You could get around that little wart with this:

```scala
val xs = List[Int]()
val h = xs match {
  case h :: _ => h
  case Nil => -1
}
```

That’s not great. Alternatively, there’s this:

```scala
val xs = List[Int]()
val h = if (xs.isEmpty) -1 else xs.head
// h: Int = -1
```

Yeah, I’m not wild about those options either.

Speaking of options, what if you had a method that returns a `None` if you ask for the head of an empty list? If the list is not empty, it could return a `Some` containing the value of the head element. `List.headOption` does just that.

```scala
val empty = List[Int]()
val nonempty = (9 to 17).toList

val nuthin = empty.headOption
// nuthin: Option[Int] = None

val sumthin = nonempty.headOption
// sumthin: Option[Int] = Some(9)
```

Now you can use `Option.getOrElse` on the result of a call to `List.headOption` in order to return a default value in the event of an empty list:

```scala
val empty = List[Int]()
val nonempty = (9 to 17).toList

val head = nonempty.headOption getOrElse -1
// head: Int = 9

val fallback = empty.headOption getOrElse -1
// fallback: Int = -1
```

Now when might you actually use something like this? Perhaps you want to determine the top salesman each day, but only if the salesman has reached a certain threshold, say, $10,000. You can filter out the salesmen who don’t reach the threshold, sort the list of salesmen according to end-of-day sales totals, and then try to take the head element. If no one makes the cut, then the filter operation returns an empty list, which ultimately yields a `None`.

```scala
case class Salesman(name: String, sales: BigDecimal)

def findTopSalesman(salesmen: List[Salesman]) = {
  salesmen.filter { _.sales >= 10000 }
    .sortBy { -_.sales } // descending
    .headOption
}
```

So then, if Monday’s sales are as follows, then no one gets the prize because no one has broken $10,000:

```scala
val monday = List(
  Salesman("Joe Bob", 9500),
  Salesman("Sally Jane", 8500),
  Salesman("Betty Lou", 9800),
  Salesman("Sammy Joe", 6500)
)

val mondayTop = findTopSalesman(monday)
// mondayTop: Option[Salesman] = None
```

On Tuesday, though, there are two contenders. Alas, there can be only one winner, and Sally Jane (who, ironically, is from British Columbia) takes the prize:

```scala
val tuesday = List(
  Salesman("Joe Bob", 9500),
  Salesman("Sally Jane", 18500),
  Salesman("Betty Lou", 11800),
  Salesman("Sammy Joe", 6500)
)

val tuesdayTop = findTopSalesman(tuesday)
// tuesdayTop: Option[Salesman] =
//   Some(Salesman(Sally Jane,18500))
```

## Scala Saturday – Stream.collect

Filtering over a sequence of values omits values that do not meet certain criteria. Mapping over a sequence of values transforms each value into another value. What if you could do both at the same time—filter out unwanted values, but transform the ones that are left? You can with `Stream.collect`. But first, you need to know about partial functions.

### Partial Functions

A partial function is a function that has a limited domain, i.e., is not defined for every possible value of its input type, but only a subset.

The classic example is division. Division is undefined for a divisor of zero. In other words, m ÷ n is valid unless n = 0. So then, division is not defined for every number n. In this particular example, that’s not a big limitation on the domain, but it is nevertheless a limitation that prevents us from saying that division is defined for every possible n.

Scala has a `PartialFunction` type that allows you to represent a function that is only valid for a limited domain. Here is how you could represent integer division:

```scala
val divide = new PartialFunction[(Int,Int), Int] {
  override def isDefinedAt(x: (Int, Int)) = x._2 != 0
  override def apply(x: (Int, Int)) = x._1 / x._2
}

val quotient = divide(12, 4)
// quotient: Int = 3
```

Partial functions have the `apply` method that other functions have so that you can execute them with parentheses: `divide(12, 3)`. They also have an `isDefinedAt` method so that you can ask the partial function, “Hey, can you handle this input?” That way, you can use an `if-else` expression to return a default or some other value:

```scala
val fine = if (divide.isDefinedAt(12, 4)) {
  divide(12, 4)
} else Int.MaxValue
// fine: Int = 3

val meh = if (divide.isDefinedAt(12, 0)) {
  divide(12, 0)
} else Int.MaxValue
// meh: Int = 2147483647
```

In fact, this is such a common pattern that `PartialFunction` has `applyOrElse`, which takes an input and a default function to execute if the partial function is not defined for the given input:

```scala
val default = Function.const(Int.MinValue) _  // lifted
val fine = divide.applyOrElse((12, 4), default)
// fine: Int = 3
val meh = divide.applyOrElse((12, 0), default)
// meh: Int = -2147483648
```

Now just because a partial function has a limited domain doesn’t mean that Scala prevents you from calling it on inputs that are outside its domain:

```scala
val quotient = divide(12, 0)
// java.lang.ArithmeticException: / by zero
```

Therefore, remember to check the domain of a partial function before applying it to a given input. A responsibly crafted API that accepts partial functions from you will verify that an input is in the partial function’s domain before applying it.

You may be thinking, “That’s great, but it’s got a lot of boilerplate.” That’s true. Scala is nice enough to let you use pattern matching syntax to define a partial function in a terser fashion:

```scala
val divide: PartialFunction[(Int,Int), Int] = {
  case (num, den) if den != 0 => num / den
}

val quotient = divide(12, 4)
// quotient: Int = 3
```
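The standard library also offers `PartialFunction.lift`, which turns a partial function into a total function returning an `Option`: a `Some` of the result inside the domain, `None` outside it. A sketch using the same `divide` definition as above:

```scala
// Same definition as above
val divide: PartialFunction[(Int, Int), Int] = {
  case (num, den) if den != 0 => num / den
}

// lift turns PartialFunction[A, B] into A => Option[B]
val safeDivide = divide.lift

val ok = safeDivide((12, 4))
// ok: Option[Int] = Some(3)
val no = safeDivide((12, 0))
// no: Option[Int] = None
```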

Finally, perhaps a single partial function is not defined for the entire set of possible inputs, but you can use multiple partial functions that together cover the entire input range. It’s a contrived example, but you can take one partial function that is defined for even integers and another one that is defined for odds and then compose them together with the `orElse` method to get a partial function that does cover the entire set of possible inputs:

```scala
val square: PartialFunction[Int,Int] = {
  case x if x % 2 == 0 => x * x
}
// Note: x % 2 != 0 (rather than == 1) also covers
// negative odd numbers, whose remainder is -1.
val cube: PartialFunction[Int,Int] = {
  case x if x % 2 != 0 => x * x * x
}
val transform = square orElse cube

val squared = transform(4)
// squared: Int = 16

val cubed = transform(3)
// cubed: Int = 27
```

### Collect: Filter and Map in One

Whereas `Stream.filter` takes a predicate—a function that takes a value and returns a Boolean—`Stream.collect` takes—you guessed it—a partial function. `Stream.collect` checks each element of the stream to see whether it is in the partial function’s domain. If the partial function is not defined for the input element, then `Stream.collect` discards it. If the input is within the partial function’s domain, then `Stream.collect` applies the partial function to the input element and returns the result as the next element in the output sequence.

```scala
val squaredEvens = (4 to 7).toStream.collect {
  case n if n % 2 == 0 => n * n
}
// squaredEvens: Stream[Int] = Stream(16, 36)
```

The following graphic illustrates what is going on in the code above:

OK, so `Stream.collect` performs a filter and a map all in one. Why not just call `Stream.filter` and then `Stream.map`? One example I’ve seen is when you’re pattern matching and destructuring and then only using one or a few of the potential match cases. Perhaps you have a trait and some case classes representing orders that were either fulfilled or cancelled before fulfillment:

```scala
trait Order
case class Fulfilled(id: String, total: BigDecimal) extends Order
case class Cancelled(id: String, total: BigDecimal) extends Order
```

You’d like to know how many dollars you “lost” in cancelled orders. Use `Stream.collect` to extract the dollar value of each cancelled order, and then sum them:

```scala
val orders = Stream(
  Fulfilled("fef3356074b4", BigDecimal("28.50")),
  Fulfilled("2605c9988f1d", BigDecimal("88.25")),
  Cancelled("94edac47971f", BigDecimal("22.01")),
  Fulfilled("2a1ff57b8f46", BigDecimal("39.30")),
  Fulfilled("9ee0a3e3da3a", BigDecimal("27.97")),
  Cancelled("db5dc439ad93", BigDecimal("99.49")),
  Fulfilled("08d58811ed36", BigDecimal("53.72")),
  Cancelled("63ebd07475ca", BigDecimal("93.66")),
  Cancelled("12d16ae9c112", BigDecimal( "7.79")),
  Fulfilled("c5ecedaedb0e", BigDecimal("87.21")) )

val cancelledDollars = orders.collect {
  case Cancelled(_, dollars) => dollars
}.sum
// cancelledDollars: BigDecimal = 222.95
```

## Scala Saturday – The Stream.grouped Method

Another method `Stream` offers is `Stream.grouped`, which divides a stream’s elements into groups of a given size.

To take an example, if you have a stream of twelve elements and call `Stream.grouped` to turn it into groups of three, you’ll get an iterator over four sequences, each three elements in size:

```scala
val xs = (1 to 12).toStream
val grouped = xs.grouped(3)
// grouped: Iterator[Stream[Int]] =
//   Iterator(
//     Stream(1, 2, 3), Stream(4, 5, 6),
//     Stream(7, 8, 9), Stream(10, 11, 12))
```

What happens if you use a group size that does not divide evenly into the size of your input stream? No sweat! The last group just contains any remaining elements, however many they may be:

```scala
val xs = (1 to 10).toStream
val grouped = xs.grouped(3)
// grouped: Iterator[Stream[Int]] =
//   Iterator(
//     Stream(1, 2, 3), Stream(4, 5, 6),
//     Stream(7, 8, 9), Stream(10))
```

Where is this useful? Well, you can take my paging example from my Scala Saturday post on `Stream.drop` and make it slightly clearer without the `(page - 1) * perPage` arithmetic:

```scala
case class Book(title: String, author: String)

val books = Stream(
  Book("Wuthering Heights", "Emily Bronte"),
  Book("Jane Eyre", "Charlotte Bronte"),
  Book("Agnes Grey", "Anne Bronte"),
  Book("The Scarlet Letter", "Nathaniel Hawthorne"),
  Book("Silas Marner", "George Eliot"),
  Book("1984", "George Orwell"),
  Book("Billy Budd", "Herman Melville"),
  Book("Moby Dick", "Herman Melville"),
  Book("The Great Gatsby", "F. Scott Fitzgerald"),
  Book("Tom Sawyer", "Mark Twain")
)

val perPage = 3
val page = 3
val records = books.grouped(perPage)
  .drop(page - 1)
  .next
// records: scala.collection.immutable.Stream[Book] =
//   Stream(Book(Billy Budd,Herman Melville),
//     Book(Moby Dick,Herman Melville),
//     Book(The Great Gatsby,F. Scott Fitzgerald))
```

This time, instead of having to calculate the number of elements to skip in order to skip n pages, you first use `Stream.grouped` to turn the stream into a paged recordset; each “page” is n records long. Then drop `page - 1` pages in order to get to the page of records you want. Finally, calling `Iterator.next` is necessary because, remember, `Stream.grouped` turns a flat stream into a stream of streams.

I will admit that I find it irritating that `Stream.grouped` returns something that does not have a `head` method. Calling `Iterator.next`, while just as easy, is inconsistent with collection semantics. It seems to me that `Stream.grouped` ought to return a collection rather than an iterator. Perhaps there was once a reason for returning an iterator instead of a collection, but it would be nice if we could fix that.
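One workaround, if the iterator bothers you too, is to convert it back into a stream right away, which restores `headOption` and the rest of the collection API. A sketch with an assumed list of ten records and a page size of three:

```scala
val records = (1 to 10).toList
val perPage = 3
val page = 3

// grouped returns an Iterator; toStream restores collection
// semantics, so a missing page yields None rather than a
// NoSuchElementException from Iterator.next.
val pageRecords = records.grouped(perPage)
  .drop(page - 1)
  .toStream
  .headOption
// pageRecords: Option[List[Int]] = Some(List(7, 8, 9))

val missingPage = records.grouped(perPage)
  .drop(99)
  .toStream
  .headOption
// missingPage: Option[List[Int]] = None
```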


## Scala Saturday – The Stream.distinct Method

Scala Saturday today is short and sweet: `Stream.distinct`. `Stream.distinct` removes any duplicate members of a stream, leaving only unique values.

One way to remove duplicates is to turn your stream into a set with `Stream.toSet`:

```scala
val noDupes = Stream(3,5,6,3,3,7,1,1,7,3,2,7).toSet
// noDupes: scala.collection.immutable.Set[Int] =
//   Set(5, 1, 6, 2, 7, 3)
```

That’s fine if you don’t care about preserving the order of the items in the input stream.

But if you do want to preserve the order, `Stream.distinct` is the ticket:

```scala
val noDupesOrdered =
  Stream(3,5,6,3,3,7,1,1,7,3,2,7).distinct
// noDupesOrdered: scala.collection.immutable.Stream[Int] =
//   Stream(3, 5, 6, 7, 1, 2)
```

## Scala Saturday – The Stream.dropWhile Method

Just as the analog to `Stream.take` is `Stream.drop`, the analog to `Stream.takeWhile` is `Stream.dropWhile`. Use it when you don’t care so much about dropping a certain number of items, but rather a certain kind of item.

`Stream.dropWhile` starts at the beginning of the stream and applies a predicate to each item, one by one. It does not start returning items in a new stream until it reaches an item that does not meet the predicate. Then it stops checking elements against the predicate and returns every item in the stream from that point on:

Assume you have the same temperature sensor as the one in my post on `Stream.takeWhile`. This time, instead of once per minute, assume that it feeds you temperature readings once per second. Add to that the idea that the sensor has a few seconds of boot-up time in which it sends you -1000.0—the indication that the current reading is invalid—until it has fully booted and can start sending good temperature data.

```scala
import java.time.{LocalDateTime, Month}

case class Reading(temperature: Double, timestamp: LocalDateTime)

val readings = Stream(
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 0)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 1)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 2)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 3)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 4)),
  Reading(-1000.0, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 5)),
  Reading(90.1, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 6)),
  Reading(90.2, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 7)),
  Reading(90.2, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 8)),
  Reading(90.3, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 9)),
  Reading(90.2, LocalDateTime.of(2015, Month.JULY, 19, 10, 0, 10))
)
```

To drop all readings until the thermometer starts returning valid data, use `Stream.dropWhile`:

```scala
val valid = readings dropWhile (_.temperature == -1000.0)

// valid: scala.collection.immutable.Stream[Reading] =
//   Stream(Reading(90.1,2015-07-19T10:00:06),
//     Reading(90.2,2015-07-19T10:00:07),
//     Reading(90.2,2015-07-19T10:00:08),
//     Reading(90.3,2015-07-19T10:00:09),
//     Reading(90.2,2015-07-19T10:00:10))
```

Finally, like `Stream.takeWhile`, `Stream.dropWhile` doesn’t balk if it never reaches an element that fails to meet the predicate. You just get an empty stream:

```scala
val none = Stream(1,3,4,7) dropWhile { _ < 10 }
// none: scala.collection.immutable.Stream[Int] = Stream()
```

## Scala Saturday – The Stream.drop Method

The opposite of `Stream.take` is `Stream.drop`. `Stream.drop`, as the name suggests, skips the first n items of the sequence and returns a new sequence that starts with element n + 1:

```scala
val xs = (1 to 10).toStream
val dropped5 = xs drop 5
// dropped5: scala.collection.immutable.Stream[Int] =
//   Stream(6, 7, 8, 9, 10)
```

One of the most obvious applications of `Stream.drop` is to pair it with `Stream.take` to page through a set of records. Perhaps you have a list of books:

```scala
case class Book(title: String, author: String)

val books = Stream(
  Book("Wuthering Heights", "Emily Bronte"),
  Book("Jane Eyre", "Charlotte Bronte"),
  Book("Agnes Grey", "Anne Bronte"),
  Book("The Scarlet Letter", "Nathaniel Hawthorne"),
  Book("Silas Marner", "George Eliot"),
  Book("1984", "George Orwell"),
  Book("Billy Budd", "Herman Melville"),
  Book("Moby Dick", "Herman Melville"),
  Book("The Great Gatsby", "F. Scott Fitzgerald"),
  Book("Tom Sawyer", "Mark Twain")
)
```

If each page shows three books, and the user wants to see the records on page three, skip the first two pages’ worth of records, and take the next three records:

```scala
val perPage = 3
val page = 3
val records = books.drop((page - 1) * perPage)
  .take(perPage)
// records: scala.collection.immutable.Stream[Book] =
//   Stream(Book(Billy Budd,Herman Melville),
//     Book(Moby Dick,Herman Melville),
//     Book(The Great Gatsby,F. Scott Fitzgerald))
```

Fortunately, like `Stream.take`, if you ask the sequence for more elements than it contains, you simply get an empty stream:

```scala
val empty = (1 to 5).toStream drop 6
// empty: scala.collection.immutable.Stream[Int] =
//   Stream()
```