Scala Saturday – The Stream.take Method

Sometimes you have a collection of items and want only the first few. You may not know or even care what the first n items are; you just know that you want the first n of them. Stream.take can do that for you.

To Infinity … and Beyond!

OK, it’s unusual that you just don’t care what items you get from a sequence, but there are occasions: for instance, random numbers. You really don’t care what the number is. You just don’t want it to be the same number every time, and perhaps you want it to fall within a certain range. Other than that, you don’t care: you just want a number!

val seed = new java.util.Date().hashCode
val rand = new scala.util.Random(seed)
val someNum = rand.nextInt
// someNum: Int = 1717783198  (well, this time anyway)

Now, assume that you have an application that needs a steady stream of random numbers. (Maybe even the number of random numbers you want is itself random!) Wouldn’t it be great to turn your random number generator into a stream?

Scala streams are lazy data structures. That means that as you iterate over a stream, it produces the next item on demand. It may not even have calculated what the next item is until you ask for it. In fact, a stream can be infinite! How do you have an infinite stream? Well, back to the random number generator.
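To make the on-demand behavior concrete, here is a minimal sketch. The counter evaluated and the helper noisySquare are names invented for this illustration; the stream itself is the infinite sequence 1, 2, 3, ... built with Stream.from.

```scala
// Hypothetical counter: records how many elements have actually been computed.
var evaluated = 0

// Hypothetical helper: squares its input and bumps the counter as a side effect.
def noisySquare(n: Int): Int = {
  evaluated += 1
  n * n
}

// An infinite stream of squares. Nothing beyond the head is computed yet.
val squares: Stream[Int] = Stream.from(1).map(noisySquare)

// Forcing the first three elements computes exactly those three.
val firstThree: List[Int] = squares.take(3).toList
// firstThree: List[Int] = List(1, 4, 9)
// evaluated is now 3
```

Even though squares has no end, only the elements you actually ask for are ever calculated.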

You can call rand.nextInt until the cows come home and just keep getting values. Well, the Stream companion object gives us a way to turn rand (or any function that calculates a value from an input) into an infinite stream—Stream.iterate:

// The function ignores its argument and simply asks rand for the next value.
val rands = Stream.iterate(rand.nextInt) {
  _ => rand.nextInt
}

Stream.iterate takes a starting value of type A and a function that takes a value of type A and returns another value of type A—an integer in this case. Now you can call Stream.take on rands, and get as many or as few values as you may want:

val taken = rands take 5
// taken: scala.collection.immutable.Stream[Int] = 
//   Stream(-2112282457,
//     -1552737819,
//     -745613730,
//     795338080,
//     -219200225)
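Because the function passed to Stream.iterate for the random numbers ignores its argument, the roles of the two parameters are easy to miss. Here is a small sketch with a deterministic function, where each element really is computed from the previous one. The names powersOfTwo, firstPowers, rng, and fiveRandoms are made up for this example, and the seed 42 is arbitrary.

```scala
// Start at 1 and double the previous element each time: 1, 2, 4, 8, ...
val powersOfTwo: Stream[Int] = Stream.iterate(1)(_ * 2)

val firstPowers: List[Int] = powersOfTwo.take(5).toList
// firstPowers: List[Int] = List(1, 2, 4, 8, 16)

// When the next value does not depend on the previous one, as with random
// numbers, Stream.continually expresses that directly.
val rng = new scala.util.Random(42L) // arbitrary seed, assumed for the example
val fiveRandoms: List[Int] = Stream.continually(rng.nextInt()).take(5).toList
```

Stream.continually re-evaluates its argument for every element, so it may read more naturally than an iterate call whose function throws its input away.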

Additionally, Scala’s #:: operator allows you to create the infinite stream of random numbers in another way:

def rands: Stream[Int] = rand.nextInt() #:: rands
val taken = rands take 5
// taken: scala.collection.immutable.Stream[Int] = 
//   Stream(-575330347,
//     631339635,
//     -911607409,
//     -614867864,
//     -1194280908)

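In this case, you’re defining a function that calls itself recursively to provide the next element in the stream. Because #:: evaluates its right-hand side lazily, the recursive call runs only when the next element is requested, so the definition does not loop forever.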

Use whichever approach you find the more natural.
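The recursive #:: pattern also works when each element depends on the one before it. A minimal sketch, using the hypothetical names countFrom and firstFiveCounts:

```scala
// Builds n, n + 1, n + 2, ... The recursive call sits to the right of #::,
// so it is evaluated only when the tail of the stream is requested.
def countFrom(n: Int): Stream[Int] = n #:: countFrom(n + 1)

val firstFiveCounts: List[Int] = countFrom(1).take(5).toList
// firstFiveCounts: List[Int] = List(1, 2, 3, 4, 5)
```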

Going for the Gold

There are, of course, times when your data set is not infinite. You happen not to need all of the items, only the first few. I mentioned earlier that you may not know or care what the items are. A better way of saying it is that you (or someone else) have likely already done the work to prepare the data set: processed it, sorted it, whatever. Now that all that preprocessing is done, you want the first few items.

Perhaps you are writing software to determine the gold, silver, and bronze medal winners in the Olympic Games. You have a type representing each competitor:

case class Competitor(name: String, score: Double)

You have the list of competitors for the 2012 men’s 110-meter hurdles, and they are already sorted according to ranking:

val competitors = Stream(
  Competitor("Aries MERRITT", 12.92),
  Competitor("Jason RICHARDSON", 13.04),
  Competitor("Hansle PARCHMENT", 13.12),
  Competitor("Lawrence CLARKE", 13.39),
  Competitor("Ryan BRATHWAITE", 13.40),
  Competitor("Orlando ORTEGA", 13.43),
  Competitor("Lehann FOURIE", 13.53)
)

Now to get the medalists, take the first three records:

val medalists = competitors take 3
// medalists: scala.collection.immutable.Stream[Competitor] =
//   Stream(
//     Competitor(Aries MERRITT,12.92),
//     Competitor(Jason RICHARDSON,13.04),
//     Competitor(Hansle PARCHMENT,13.12) )
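The medalists example relies on the list already being in ranking order. If that preprocessing had not been done, one way to sketch it is to sort before taking. The names unsorted and podium are invented for this illustration, and the data is a shuffled subset of the finalists listed earlier.

```scala
case class Competitor(name: String, score: Double)

// A few of the finalists, deliberately out of order.
val unsorted = Stream(
  Competitor("Hansle PARCHMENT", 13.12),
  Competitor("Lawrence CLARKE", 13.39),
  Competitor("Aries MERRITT", 12.92),
  Competitor("Jason RICHARDSON", 13.04)
)

// In a race the lowest time wins, so sort ascending by score before taking.
val podium: List[String] = unsorted.sortBy(_.score).take(3).map(_.name).toList
// podium: List[String] = List(Aries MERRITT, Jason RICHARDSON, Hansle PARCHMENT)
```

Note that sorting has to look at every element, so this only makes sense for a finite stream.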

Take It to the Limit

What happens, though, when you have a stream with fewer items than you attempt to take?

val numbersOfTheCounting = Stream(1,2,3) take 5
numbersOfTheCounting foreach println
// 1
// 2
// 3

Well, that’s nice. You ask for five, so the stream gives you all it has. If that happens to be fewer than you asked for, hey, no worries.
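The same forgiving behavior applies at the other end of the range. As a quick sketch (the val names here are just for illustration), asking for zero or a negative number of items yields an empty stream rather than an error:

```scala
val numbers = Stream(1, 2, 3)

val takeZero     = numbers.take(0).toList   // List()
val takeNegative = numbers.take(-2).toList  // List()
val takeTooMany  = numbers.take(100).toList // List(1, 2, 3)
```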

I should note that Stream is not unique in having a take method. The strict, sequential collection types (Array, List, and Vector, along with anything else that implements Seq) all have a take method that returns the first n elements just as Stream.take does. Even Map has a take method.
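A brief sketch of take on a few of those other collections follows. The val names are made up for the example.

```scala
val fromList   = List(1, 2, 3, 4).take(2)          // List(1, 2)
val fromVector = Vector("a", "b", "c").take(5)     // Vector(a, b, c)
val fromArray  = Array(10, 20, 30).take(1).toList  // List(10)

// Map.take returns a Map. Which entries count as "first" depends on the
// map's iteration order, which is not specified for every Map implementation.
val fromMap = Map("x" -> 1, "y" -> 2, "z" -> 3).take(2)
// fromMap.size == 2
```

For an unordered collection like Map, take is mostly useful when you want some n entries and do not care which ones.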
