Typeclasses in Scala - What are they?

13 Oct 2013

If you’re familiar with an object-oriented language, you probably use polymorphism when laying out your designs. The idea is that values of different data types are handled through a uniform interface. In Java, and most object-oriented languages, this is accomplished using subtyping, as in the example below:

case class Leap(height: Int)
trait Jumper { def jump: Leap }

object Frog extends Jumper { def jump = Leap(20) }
object Human extends Jumper { def jump = Leap(100) }

Sometimes this is a good solution, but it can lead to brittle object hierarchies that make refactoring a pain once you’ve got client implementations.

There is another variety of polymorphism that allows various data types to be handled by a uniform interface, but without subtyping from a base class.

As an example, suppose you have a collection of distinct, unrelated objects:

case class Employee(name: String, age: Int)
case class Movie(title: String, director: String, year: Int)

Now suppose you needed to output them somehow - JSON, System.out, whatever (we’ll go with just outputting as a string for now) - and imagine for a second that you didn’t have a toString method on your objects. A naive approach might be to put a toString-style method on each class:

case class Employee(name: String, age: Int) {
  def label = "Name: " + name + " Age: " + age
}
case class Movie(title: String, director: String, year: Int) {
  def label = "Title: " + title + " Director " + director + " Year: " + year
}

There are several problems with the above implementation. Firstly, logic is mixed in with data. People coming from FP backgrounds argue that data - or state (even if it’s not inherently “stateful”) - should not be mixed with logic. They are different concerns, and these people strive to separate the two. Proponents of OO argue the opposite, insisting that data is often accompanied by logic that operates on it, and that the two should be combined. I don’t know enough about either to give an opinion on which is the definitive answer. However, let’s investigate the former. How can we separate data from logic, yet still have the benefits polymorphism provides?

Enter: The Typeclass

This concept is called “typeclasses” in Scala and Haskell. In Clojure, protocols and multimethods fill a similar role.

So what is a typeclass? It’s a bit like a Java interface, except you can make existing types conform to the behaviour defined within, without modifying the existing type (or even knowing the type’s implementation). You build up classes of behaviour as single entities, to be applied to a type, rather than having a type implement a behaviour.

Back to our example. Ideally, we would like to call label on the Movie and Employee objects themselves, but have it defined elsewhere. Let’s start by defining some behaviour:

trait LabelMaker[T] {
  def label(t: T): String
}

The above code is the actual typeclass. But it looks just like a plain old interface, I hear you say. And it is! Except in how we use it. Define some instances of this typeclass:

object LabelMaker {
  object MovieLabelMaker extends LabelMaker[Movie] {
    // The trailing + keeps the expression going onto the next line
    def label(movie: Movie) = "Title: " + movie.title +
      " Director: " + movie.director
  }
  object EmployeeLabelMaker extends LabelMaker[Employee] {
    def label(employee: Employee) = "Name: " + employee.name +
      " Age: " + employee.age
  }
}

Now we can label objects like so:

val movie = Movie("Pulp Fiction", "Quentin Tarantino", 1994)
println(LabelMaker.MovieLabelMaker.label(movie))

Errr… yeah that sucks. But bear with me, because we’re not done. Ideally we’d like to call label on an object and have the compiler find the correct implementation and wire it in for us. Using the magic (it’s not really) of Scala implicits we can. Let’s start by redefining those label makers slightly (the only difference being the addition of an implicit qualifier):

object LabelMaker {
  implicit object MovieLabelMaker extends LabelMaker[Movie] {
    // The trailing + keeps the expression going onto the next line
    def label(movie: Movie) = "Title: " + movie.title +
      " Director: " + movie.director
  }
  implicit object EmployeeLabelMaker extends LabelMaker[Employee] {
    def label(employee: Employee) = "Name: " + employee.name +
      " Age: " + employee.age
  }
}

And to use these new implicit objects, we need a slightly tweaked way of calling label:

def label[T](someObject: T)(implicit lm: LabelMaker[T]) = lm.label(someObject)

The above method can exist anywhere you want it to (perhaps a good place to put it would be the LabelMaker object). It takes two parameters: the object of type T that you want to label, and an implicit LabelMaker[T]. The implicit qualifier on lm tells the compiler to search certain scopes (including the LabelMaker companion object) for a value meeting that constraint. It’s “implicitly” filled in. It means we can leave the 2nd parameter off at the client site:

val movie = Movie("Pulp Fiction", "Quentin Tarantino", 1994)
println(label(movie))

This is really just a nicer way of writing:

println(label(movie)(LabelMaker.MovieLabelMaker))
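To make the lookup a bit more concrete, here is a small self-contained sketch of the same idea. Everything is wrapped in a ResolutionDemo object purely so it compiles as one file. ShortMovieLabelMaker and labelVia are names I’ve made up for illustration: the first is a non-implicit instance that only gets used when passed by hand, the second is the same label method spelled with a context bound.

```scala
object ResolutionDemo {
  case class Movie(title: String, director: String, year: Int)

  trait LabelMaker[T] {
    def label(t: T): String
  }

  // Instances placed in the companion object belong to the implicit scope of
  // LabelMaker[T], so callers do not need to import them.
  object LabelMaker {
    implicit object MovieLabelMaker extends LabelMaker[Movie] {
      def label(movie: Movie): String =
        "Title: " + movie.title + " Director: " + movie.director
    }
  }

  // Hypothetical second instance. It is not implicit, so the compiler never
  // picks it on its own.
  object ShortMovieLabelMaker extends LabelMaker[Movie] {
    def label(movie: Movie): String = movie.title + " (" + movie.year + ")"
  }

  def label[T](someObject: T)(implicit lm: LabelMaker[T]): String =
    lm.label(someObject)

  // The same signature written with a context bound; implicitly fetches the instance.
  def labelVia[T: LabelMaker](someObject: T): String =
    implicitly[LabelMaker[T]].label(someObject)

  val movie = Movie("Pulp Fiction", "Quentin Tarantino", 1994)

  val filledIn: String = label(movie)                       // compiler supplies MovieLabelMaker
  val viaBound: String = labelVia(movie)                    // same instance, different spelling
  val byHand: String   = label(movie)(ShortMovieLabelMaker) // explicit argument is used as given
}
```

Here filledIn and viaBound both come out as "Title: Pulp Fiction Director: Quentin Tarantino", while byHand is "Pulp Fiction (1994)". Passing an instance by hand, as in the last line, is always allowed.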

It’s just that the compiler can fill it in for us. Very cool. But we’re not done. Ideally we should be able to extend the object we want to label with that same label method. Using another implicit, we can:

implicit def pimpWithLabel[T](t: T)(implicit lm: LabelMaker[T]) = new {
  def label = lm.label(t)
}

All we’ve done here is add an implicit method that converts an object of type T into a new object, with the addition of a new label method. We’ve extended the class without knowing its implementation (only its type) with a label method. We can now finally achieve our goal of labelling distinct objects like so:

val movie = Movie("Pulp Fiction", "Quentin Tarantino", 1994)
println(movie.label)

val employee = Employee("Dominic Bou-Samra", 23)
println(employee.label)
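One thing worth knowing: the anonymous new { ... } above has a structural type, so Scala 2 calls label on it reflectively (and warns about it unless scala.language.reflectiveCalls is enabled). A possible alternative, assuming Scala 2.10 or later, is an implicit class. The sketch below is self-contained; LabelOps and the enclosing ExtensionDemo object are names chosen just for this example.

```scala
object ExtensionDemo {
  case class Employee(name: String, age: Int)
  case class Movie(title: String, director: String, year: Int)

  trait LabelMaker[T] {
    def label(t: T): String
  }

  object LabelMaker {
    implicit object MovieLabelMaker extends LabelMaker[Movie] {
      def label(movie: Movie): String =
        "Title: " + movie.title + " Director: " + movie.director
    }
    implicit object EmployeeLabelMaker extends LabelMaker[Employee] {
      def label(employee: Employee): String =
        "Name: " + employee.name + " Age: " + employee.age
    }
  }

  // Hypothetical name. The wrapper is only available for types that have a
  // LabelMaker instance, because of the implicit parameter.
  implicit class LabelOps[T](t: T)(implicit lm: LabelMaker[T]) {
    def label: String = lm.label(t)
  }

  val movieLabel: String = Movie("Pulp Fiction", "Quentin Tarantino", 1994).label
  val employeeLabel: String = Employee("Dominic Bou-Samra", 23).label
}
```

The call sites look exactly the same as before; the difference is that label is now an ordinary method on a named class, so no reflection is involved.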

Yep. So what?

What has all this extra boilerplate actually bought us?

  1. We’ve got a really nice client calling interface (methods on objects)
  2. We’ve separated data from business logic. The Employee and Movie classes exist entirely on their own, with NO ties to any specific label implementation.
  3. We’re able to EXTEND classes without knowing how they’re implemented. If a class is missing a method, you can provide one. Many Scala libraries lean heavily on typeclasses. In the standard library, for instance, List doesn’t hard-code how its elements are compared or added up: sorted and sum take an implicit Ordering or Numeric instance instead. It makes for beautifully designed libraries that avoid repeated code.
  4. Ad-hoc polymorphism (TODO)
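
As a rough illustration of points 3 and 4, here is a sketch that gives Int (a type we certainly can’t modify) a label, and then derives a label for any List[T] whose elements can be labelled. IntLabelMaker, listLabelMaker and the AdHocDemo wrapper are invented for this example and aren’t part of the code above.

```scala
object AdHocDemo {
  trait LabelMaker[T] {
    def label(t: T): String
  }

  object LabelMaker {
    // An instance for a type we did not write and cannot change.
    implicit object IntLabelMaker extends LabelMaker[Int] {
      def label(n: Int): String = "Number: " + n
    }

    // A derived instance: any List[T] can be labelled as long as T can.
    implicit def listLabelMaker[T](implicit lm: LabelMaker[T]): LabelMaker[List[T]] =
      new LabelMaker[List[T]] {
        def label(ts: List[T]): String =
          ts.map(lm.label).mkString("[", ", ", "]")
      }
  }

  def label[T](someObject: T)(implicit lm: LabelMaker[T]): String =
    lm.label(someObject)

  val one: String  = label(42)             // uses IntLabelMaker
  val many: String = label(List(1, 2, 3))  // uses listLabelMaker, built on IntLabelMaker
}
```

The same label call picks a different implementation depending on the static type of its argument, with no common base class involved. Here one is "Number: 42" and many is "[Number: 1, Number: 2, Number: 3]".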

Caveat

The example I provided above is entirely contrived. In the real world we do have toString methods. It’s not worth the extra code for such a simple concept. It’s used purely as an explanatory example.