Scala Saturday – Stream.groupBy

Sometimes you have a collection of items that you want to group according to some common property or key. Stream.groupBy can do that job for you. It takes a function that computes a key for each item and returns a map from each key to the sequence of all the items that fall into that group.

That’s a little hard to follow. So what’s it useful for?

Well, maybe you need to group a list of names by initial. No sweat:

val names = Stream(
  "Rehoboam",
  "Abijah",
  "Asa",
  "Jehoshaphat",
  "Jehoram",
  "Ahaziah")

val groupedByInitial = names.groupBy(_.head)
// groupedByInitial: Map[Char,Stream[String]] =
//   Map(
//     J -> Stream(Jehoshaphat, Jehoram),
//     A -> Stream(Abijah, Asa, Ahaziah),
//     R -> Stream(Rehoboam) )
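
Since the result is an ordinary Map, you can pull out a single group by its key. Here is a minimal sketch of that; the names byInitial, js, and zs are only for illustration, and the getOrElse fallback is just one way to handle a key that has no group:

```scala
val names = Stream(
  "Rehoboam", "Abijah", "Asa",
  "Jehoshaphat", "Jehoram", "Ahaziah")

val byInitial = names.groupBy(_.head)

// Look up one group by its key.
val js = byInitial('J').toList

// No names start with 'Z', so fall back to an empty Stream
// rather than risk a NoSuchElementException from byInitial('Z').
val zs = byInitial.getOrElse('Z', Stream.empty[String]).toList
```

Here js holds Jehoshaphat and Jehoram, and zs is empty.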

A Map makes no promise about the order of its keys. So if you want to sort those groups, convert the resulting map to a stream of tuples, and throw in a call to Stream.sortBy to sort by the first element in the tuple:

val groupedByInitialAndSorted =
  groupedByInitial.toStream.sortBy(_._1)
// groupedByInitialAndSorted: Stream[(Char, Stream[String])] =
//   Stream(
//     A -> Stream(Abijah, Asa, Ahaziah),
//     J -> Stream(Jehoshaphat, Jehoram),
//     R -> Stream(Rehoboam) )
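
If you would rather end up with a map than a stream of tuples, one option is to copy the groups into a SortedMap, which keeps its entries ordered by key. This is only a sketch of that alternative; sortedGroups is a name made up for the example:

```scala
import scala.collection.immutable.SortedMap

val names = Stream(
  "Rehoboam", "Abijah", "Asa",
  "Jehoshaphat", "Jehoram", "Ahaziah")

// Group by initial, then copy the (key, group) pairs into a SortedMap,
// which orders its entries by key (the default Char ordering here).
val sortedGroups = SortedMap(names.groupBy(_.head).toSeq: _*)

// Iterating a SortedMap visits the keys in ascending order: A, J, R.
sortedGroups.keys.foreach(println)
```

The trade-off is that you keep key lookup while still getting a predictable iteration order.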

Another example: Maybe you want to group some test scores according to grade. That is, all the students who scored in the 90s are grouped together, then all those scoring in the 80s, and so on:

case class TestScore(name: String, score: Int)

val grades = Stream(
  TestScore("Anna", 74),
  TestScore("Andy", 76),
  TestScore("Brenda", 70),
  TestScore("Bobby", 90),
  TestScore("Charlotte", 98),
  TestScore("Chuck", 83),
  TestScore("Deborah", 88),
  TestScore("Dan", 66),
  TestScore("Ellie", 80),
  TestScore("Ed", 61),
  TestScore("Frannie", 89),
  TestScore("Frank", 96) )

val grouped = grades.groupBy(_.score / 10 * 10)
// grouped: Map[Int,Stream[TestScore]] =
//   Map(
//     80 -> Stream(
//             TestScore(Chuck,83),
//             TestScore(Deborah,88),
//             TestScore(Ellie,80),
//             TestScore(Frannie,89) ),
//     70 -> Stream(
//             TestScore(Anna,74),
//             TestScore(Andy,76),
//             TestScore(Brenda,70) ),
//     60 -> Stream(
//             TestScore(Dan,66),
//             TestScore(Ed,61) ),
//     90 -> Stream(
//             TestScore(Bobby,90),
//             TestScore(Charlotte,98),
//             TestScore(Frank,96) )
//   )

You can take it another couple of steps to produce a histogram by counting the number of students in each group and sorting by the group key (i.e., grade level), highest first:

val histogram = grouped.map {
  case (grade, scores) => grade -> scores.length
}.toStream.sortBy(_._1).reverse
// histogram: Stream[(Int, Int)] =
//   Stream((90,3), (80,4), (70,3), (60,2))
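
If you want something that looks more like a histogram, you could render each count as a row of asterisks. A rough sketch, with the counts from above hard-coded so the snippet stands on its own (rows is just an illustrative name):

```scala
// Counts per grade band, highest band first.
val histogram = Stream((90, 3), (80, 4), (70, 3), (60, 2))

// Render each band as a label followed by one asterisk per student.
val rows = histogram.map {
  case (grade, count) => grade.toString + ": " + ("*" * count)
}

rows.foreach(println)
// 90: ***
// 80: ****
// 70: ***
// 60: **
```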

One more example, and this one I borrowed from one of Steven Proctor’s Ruby Tuesday posts. You can find anagrams in a list of words by sorting the characters in each word and grouping on that:

val anagrams = Stream(
  "tar", "rat", "bar",
  "rob", "art", "orb"
).groupBy(_.sorted)
// anagrams: Map[String,Stream[String]] = 
//   Map(
//     abr -> Stream(bar),
//     art -> Stream(tar, rat, art),
//     bor -> Stream(rob, orb) )
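
Notice that "bar" lands in a group by itself, and a word with no partner is not really an anagram of anything. One way to keep only the real anagram sets, sketched here with the made-up names words and anagramSets, is to filter out the single-word groups:

```scala
val words = Stream(
  "tar", "rat", "bar",
  "rob", "art", "orb")

// Group words by their sorted letters, then keep only the groups
// that contain more than one word, i.e. the actual anagram sets.
val anagramSets = words
  .groupBy(_.sorted)
  .values
  .filter(_.length > 1)
  .map(_.toList)
  .toList
```

That leaves two sets, one with tar, rat, and art and one with rob and orb. The order of the sets themselves depends on the map, so don't count on it.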
