Work with Play

Build web applications using Scala and the Play Framework.

Scala Traits Part 2

After introducing traits in the previous article, today we’ll continue talking about them. We are going to touch on more advanced topics such as linearization, initialization and best practices for public APIs.

Linearization

If a class mixes in multiple traits and those traits extend other traits or classes, the resulting inheritance hierarchy forms a graph rather than a simple chain.
This raises several questions, such as the order in which constructors are invoked and how method lookups are resolved.

Going back to the example in the first part of this series, let’s consider a class extending User and mixing in two traits:

app/models/User.scala
class UserWithCachedContacts(id: Pk[Int], username: String, age: Int)
  extends User(id, username, age) with cache.SingleElementCache[List[Contact]]
  with cache.ExpiryCache[List[Contact]]{

  override def contacts: List[Contact] = getOrFetch("contacts")(() => super.contacts)
}

We saw in the last article that when getOrFetch is invoked, the version executed is the one in ExpiryCache and not the one in SingleElementCache. Why is that?
Also, why does a call to super.getOrFetch in ExpiryCache result in SingleElementCache’s implementation when the parent of ExpiryCache is Cache?

The answer to both questions is linearization: the technique used by Scala to flatten the inheritance hierarchy into an ordered list of classes and traits.
When a method or a property is used directly, this list is visited from the start and the lookup resolves to the first match found. With the super keyword, the search starts instead from the element that follows the current class or trait in the list.
The algorithm is pretty simple. The first step is to place the class for which the linearization is being calculated as the first element of the list.
Then, starting from the rightmost element in the list of traits and classes it extends and moving left, calculate the linearization of each one and append the result to the list.
Next, visit the resulting list from left to right and remove any duplicates, always keeping the occurrence closest to the end of the list.
The last step is to append ScalaObject, AnyRef and Any at the end of the list. They are the parent classes of most Scala classes (ScalaObject only exists up to Scala 2.9; it was removed in Scala 2.10).

Let’s apply the algorithm to the class shown above:

[UserWithCachedContacts] // the class of interest is added to the list
[UserWithCachedContacts, ExpiryCache, Cache] // linearization of ExpiryCache added
[UserWithCachedContacts, ExpiryCache, Cache, SingleElementCache, Cache] // linearization of SingleElementCache added
[UserWithCachedContacts, ExpiryCache, Cache, SingleElementCache, Cache, User] // linearization of User added
[UserWithCachedContacts, ExpiryCache, SingleElementCache, Cache, User] // duplicate elements are removed
[UserWithCachedContacts, ExpiryCache, SingleElementCache, Cache, User, ScalaObject, AnyRef, Any] // common base classes are added

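This explains why calling getOrFetch from UserWithCachedContacts results in an invocation of ExpiryCache’s implementation: ExpiryCache is the first element after the class itself that defines the method.
It also answers the question about super.getOrFetch within ExpiryCache: the element that follows ExpiryCache in the list is SingleElementCache, not Cache.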
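
The lookup order can be observed with a small standalone sketch. It is not the code from the article: the traits are stripped of their type parameter and of Play, and describe is a hypothetical method added only to record which implementation runs at each step of the super chain.

```scala
object LinearizationDemo {
  // Simplified stand-ins for the traits of the article (assumption: no type
  // parameter, no Play dependency). `describe` is a hypothetical method.
  trait Cache {
    def describe: List[String] = List("Cache")
  }

  trait SingleElementCache extends Cache {
    override def describe: List[String] = "SingleElementCache" :: super.describe
  }

  trait ExpiryCache extends Cache {
    override def describe: List[String] = "ExpiryCache" :: super.describe
  }

  // User does not define describe, so it never takes part in the lookup.
  class User

  class UserWithCachedContacts extends User with SingleElementCache with ExpiryCache {
    override def describe: List[String] = "UserWithCachedContacts" :: super.describe
  }

  def main(args: Array[String]): Unit =
    // Each super call moves one step to the right in the linearization.
    println(new UserWithCachedContacts().describe.mkString(", "))
}
```

Running main prints UserWithCachedContacts, ExpiryCache, SingleElementCache, Cache, which is the linearization computed above without the types that do not define the method.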

Constructors

The primary constructor of a trait, as for any other class, is its body. The main difference from normal classes is that this constructor does not accept arguments and cannot pass arguments to the constructor of a parent class.
Consequently, a trait that extends a class with constructor parameters relies on the class mixing it in to extend that same parent class (or a subclass of it) and supply the arguments.

When an instance is created, the constructors of its parent class and of every mixed-in trait are executed. Again, linearization determines the order when multiple traits are involved: the constructors run in reverse linearization order, from the most general type to the class being instantiated.
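
As an illustration, the following sketch records the order in which constructor bodies run. The types are simplified stand-ins for the ones in the article, and the log buffer and the constructionOrder helper are hypothetical names introduced for the demo.

```scala
import scala.collection.mutable.ListBuffer

object ConstructorOrderDemo {
  // Each constructor body appends its own name here when it runs.
  private val log = ListBuffer.empty[String]

  class User { log += "User" }
  trait Cache { log += "Cache" }
  trait SingleElementCache extends Cache { log += "SingleElementCache" }
  trait ExpiryCache extends Cache { log += "ExpiryCache" }

  class UserWithCachedContacts extends User with SingleElementCache with ExpiryCache {
    log += "UserWithCachedContacts"
  }

  // Hypothetical helper: builds one instance and returns the recorded order.
  def constructionOrder(): List[String] = {
    log.clear()
    val instance = new UserWithCachedContacts()
    log.toList
  }

  def main(args: Array[String]): Unit =
    println(constructionOrder().mkString(" -> "))
}
```

The recorded order is User, Cache, SingleElementCache, ExpiryCache, UserWithCachedContacts: the linearization of the traits read backwards, with the superclass first and Cache initialized only once.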

Issues with initialization

Suppose we want to make the timeout property in ExpiryCache configurable by classes using this trait. We may also want to make life easier for programmers by letting them specify the timeout in seconds and not milliseconds.
Our trait would change like this.

app/cache/Cache.scala
trait ExpiryCache[T] extends Cache[T] {

  val expirationTimeout: Int

  val timeout: Long = expirationTimeout * 1000

  var lastAccess: Map[String, Long] = Map()

  override def getOrFetch(key: String)(fetch: () => T): T = {
    if (lastAccess.contains(key) && System.currentTimeMillis - lastAccess(key) > timeout) {
      play.api.Logger.debug("Entry older then " + timeout + " milliseconds. Timeout expired, clearing the cache ...")
      super.clear(key)
    }
    lastAccess = lastAccess + (key -> System.currentTimeMillis)
    super.getOrFetch(key)(fetch)
  }
}

timeout is initialized during the execution of the trait’s constructor by multiplying expirationTimeout by 1000. The value of the latter comes from the class or trait mixing in ExpiryCache. Consequently, UserWithCachedContacts must now define expirationTimeout.

app/models/User.scala
class UserWithCachedContacts(id: Pk[Int], username: String, age: Int)
  extends User(id, username, age) with cache.SingleElementCache[List[Contact]]
  with cache.ExpiryCache[List[Contact]]{

  override val expirationTimeout = 5

  override def contacts: List[Contact] = getOrFetch("contacts")(() => super.contacts)
}

Let’s test these changes by running some code in the REPL.

scala> user.contacts
[debug] application - single element cache miss, invoking the callback ...
res1: List[models.Contact] = List(Contact(1,email,diegocastorina@gmail.com))

scala> user.contacts
[debug] application - Entry older then 0 milliseconds. Timeout expired, clearing the cache ...
[debug] application - Not related trait implementation ...
[debug] application - single element cache hit ...
res2: List[models.Contact] = List(Contact(1,email,diegocastorina@gmail.com))

To our surprise, timeout is 0 and not 5000.
This is because of the order in which the constructors are called.
Since the constructor of ExpiryCache runs before the body of UserWithCachedContacts, expirationTimeout has not been initialized yet when timeout is calculated. It still holds the default value for an Int, which is 0, so the resulting timeout is 0 as well.
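
The problem can be reproduced outside of Play with a minimal sketch. Expiry and FiveSeconds are hypothetical names standing in for ExpiryCache and UserWithCachedContacts; only the two fields involved are kept.

```scala
object InitOrderDemo {
  trait Expiry {
    val expirationTimeout: Int
    // Evaluated while the trait is being constructed, before the body of
    // the class that defines expirationTimeout has run.
    val timeout: Long = expirationTimeout * 1000L
  }

  class FiveSeconds extends Expiry {
    // Assigned only after the constructor of Expiry has completed.
    override val expirationTimeout: Int = 5
  }

  def main(args: Array[String]): Unit = {
    val cache = new FiveSeconds
    println(cache.expirationTimeout) // 5
    println(cache.timeout)           // 0
  }
}
```

Once construction is over, expirationTimeout correctly reads 5, but timeout was computed too early and stays at 0.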

There are two ways to solve this problem.

Lazy val

A val marked as lazy is not evaluated until the first time it is accessed.
Simply marking timeout as lazy will fix our bug.

app/cache/Cache.scala
trait ExpiryCache[T] extends Cache[T] {

  val expirationTimeout: Int

  lazy val timeout: Long = expirationTimeout * 1000

  ...
}

Let’s run the same calls again in the console.

scala> user.contacts
[debug] application - single element cache miss, invoking the callback ...
res1: List[models.Contact] = List(Contact(1,email,diegocastorina@gmail.com))

scala> user.contacts
[debug] application - Entry older then 5000 milliseconds. Timeout expired, clearing the cache ...
[debug] application - single element cache miss, invoking the callback ...
res2: List[models.Contact] = List(Contact(1,email,diegocastorina@gmail.com))
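
The effect of lazy can also be checked in isolation. The sketch below uses the hypothetical names Expiry and FiveSeconds in place of the article's types, and changes a single keyword compared to the failing version.

```scala
object LazyTimeoutDemo {
  trait Expiry {
    val expirationTimeout: Int
    // Not evaluated during construction: computed on first access and then
    // cached for every later access.
    lazy val timeout: Long = expirationTimeout * 1000L
  }

  class FiveSeconds extends Expiry {
    override val expirationTimeout: Int = 5
  }

  def main(args: Array[String]): Unit = {
    val cache = new FiveSeconds
    // By now the constructor of FiveSeconds has completed, so
    // expirationTimeout is already 5 when timeout is first computed.
    println(cache.timeout) // 5000
  }
}
```

This works only because nothing reads timeout before the whole object has been constructed.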

There may still be problems if some initialization logic takes place within the constructor of the trait.
Suppose the constructor of ExpiryCache runs an init method that accesses timeout.

app/cache/Cache.scala
trait ExpiryCache[T] extends Cache[T] {

  val expirationTimeout: Int

  lazy val timeout: Long = expirationTimeout * 1000

  var lastAccess: Map[String, Long] = Map()

  private def init(): Unit = {
    play.api.Logger.debug("Init function. Timeout is " + timeout)
  }
  init()

  ...
}

Let’s see what happens in the console.

scala> import models._
import models._

scala> val user = User.loadWithCache(1).get
[debug] application - Init function. Timeout is 0
user: models.User = User(1,diego,34)

scala> user.contacts
[debug] application - single element cache miss, invoking the callback ...
res1: List[models.Contact] = List(Contact(1,email,diegocastorina@gmail.com))

scala> user.contacts
[debug] application - Entry older then 0 milliseconds. Timeout expired, clearing the cache ...
[debug] application - single element cache miss, invoking the callback ...
res2: List[models.Contact] = List(Contact(1,email,diegocastorina@gmail.com))

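Again timeout is set to 0 because, even though it is lazy, it is accessed from the constructor of ExpiryCache, before expirationTimeout is initialized. Being lazy, it also keeps that value for every later access. To solve this problem we need to mark expirationTimeout as lazy in the subclass.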

app/models/User.scala
class UserWithCachedContacts(id: Pk[Int], username: String, age: Int)
  extends User(id, username, age) with cache.SingleElementCache[List[Contact]]
  with cache.ExpiryCache[List[Contact]]{

  override lazy val expirationTimeout = 5

  override def contacts: List[Contact] = getOrFetch("contacts")(() => super.contacts)
}

This time we get the expected behavior.

scala> val user = User.loadWithCache(1).get
[debug] application - Init function. Timeout is 5000
user: models.User = User(1,diego,34)
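
Both variants can be compared side by side in a standalone sketch. The names Expiry, StrictTimeout, LazyTimeout and timeoutSeenAtInit are hypothetical; the last one plays the role of the init method by forcing timeout from inside the trait's constructor.

```scala
object LazyInSubclassDemo {
  trait Expiry {
    val expirationTimeout: Int
    lazy val timeout: Long = expirationTimeout * 1000L
    // Stands in for init: forces timeout while the trait is still being
    // constructed, before the subclass body has run.
    val timeoutSeenAtInit: Long = timeout
  }

  // Strict field: still 0 when the constructor of the trait reads it.
  class StrictTimeout extends Expiry {
    override val expirationTimeout: Int = 5
  }

  // Lazy field: initialized on demand, even in the middle of construction.
  class LazyTimeout extends Expiry {
    override lazy val expirationTimeout: Int = 5
  }

  def main(args: Array[String]): Unit = {
    println(new StrictTimeout().timeoutSeenAtInit) // 0
    println(new StrictTimeout().timeout)           // 0, the early value was cached
    println(new LazyTimeout().timeoutSeenAtInit)   // 5000
  }
}
```

The fix lives in the subclass, not in the trait, which is why every class mixing in the trait has to remember to apply it.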

Pre-initialized fields

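An alternative to lazy vals is early initialization. Be aware that this syntax is specific to Scala 2: it is deprecated as of Scala 2.13 and is not available in Scala 3.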

app/models/User.scala
class UserWithCachedContacts(id: Pk[Int], username: String, age: Int)
  extends { val expirationTimeout = 5 }
  with User(id, username, age) with cache.SingleElementCache[List[Contact]]
  with cache.ExpiryCache[List[Contact]] {

  override def contacts: List[Contact] = getOrFetch("contacts")(() => super.contacts)
}

The block following extends is evaluated before any parent constructor runs. The expirationTimeout field is therefore already initialized when the constructors of the parent class User and of all the mixed-in traits are executed.

The equivalent using an anonymous class follows:

app/models/User.scala
new { val expirationTimeout = 5 } with User(id, username, age)
  with cache.SingleElementCache[List[Contact]] with cache.ExpiryCache[List[Contact]] {

  override def contacts: List[Contact] = getOrFetch("contacts")(() => super.contacts)
}

The problem with early initialization and with lazy vals in subclasses is that they break the encapsulation of the trait. In fact, they require users of the trait to take some action without which the code would not work properly.
A better solution may be to replace the trait with an abstract class accepting expirationTimeout as an argument of the primary constructor.
If this cannot be done, it is essential to document the trait properly, explaining what has to be done in order to use it effectively.
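
A minimal sketch of the abstract class option could look as follows. ExpiringCache and ContactsCache are hypothetical names, and timeoutSeenAtInit again stands in for initialization logic that reads timeout during construction.

```scala
object AbstractClassDemo {
  // The timeout is a constructor argument, so it is available before any
  // statement in the body of the class runs.
  abstract class ExpiringCache(expirationTimeout: Int) {
    val timeout: Long = expirationTimeout * 1000L
    // Safe even during construction: no lazy val is needed anywhere.
    val timeoutSeenAtInit: Long = timeout
  }

  // Users only have to pass a value; no special action is required of them.
  class ContactsCache extends ExpiringCache(5)

  def main(args: Array[String]): Unit =
    println(new ContactsCache().timeoutSeenAtInit) // 5000
}
```

The trade-off is that a class can extend only one class. UserWithCachedContacts already extends User, so with this design it would probably have to hold the cache as a field and not inherit from it.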

Traits as public API

We have seen so far how traits are interfaces with optional implementations.
Traits can be used to achieve a form of multiple inheritance, but one of their main goals is also to define public APIs.
A public API for a certain module represents the contract between that module and the outside world. It communicates what a piece of software expects as input and what can be expected as output. It does not say anything about the implementation details. This favors modularity, encapsulation and information hiding.
A client of an API should not need to change, or even be recompiled, if anything in the internal implementation of the API changes; the client is neither aware of nor interested in those details.
For these reasons it is better not to implement any methods in a trait representing a public API.

Another important point is that, even if Scala can infer return types, you should always define them explicitly in a public API.
The first reason is that it makes the API much easier for humans to read. A method name may give some clue about the returned type, but nothing more.
The other reason is that the type inferred by Scala may be some internal class you don’t want your users to be aware of.
A best practice is to always declare return types that are as abstract as possible. This makes the code much more flexible, because the concrete returned type can change without affecting clients.
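
The difference between an inferred and a declared return type can be sketched as follows. The object, the method names and the addresses are hypothetical and only serve the illustration.

```scala
object ReturnTypeDemo {
  // No annotation: the compiler infers Vector[String], so the concrete
  // collection becomes part of the signature that callers can depend on.
  def emailsInferred = Vector("diego@example.com", "anna@example.com")

  // Explicit and abstract: callers only see Seq[String], so the
  // implementation is free to return a different Seq later on.
  def emails: Seq[String] = Vector("diego@example.com", "anna@example.com")

  def main(args: Array[String]): Unit = {
    // Compiles, which shows that the implementation detail has leaked.
    val leaked: Vector[String] = emailsInferred
    // Callers of the annotated method can only rely on Seq.
    val hidden: Seq[String] = emails
    // val rejected: Vector[String] = emails // would not compile
    println(leaked == hidden)
  }
}
```

Changing emailsInferred to return a List would break every caller that stored the result as a Vector, while the same change to emails would go unnoticed.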

One last best practice to follow is documentation. Even the best library in the world can be useless without documentation, because people have no way to learn how to use it.

Going back to our Cache module, let’s make it a public API following the rules discussed so far. We will rename the current implementation of the Cache trait to EmptyCache which becomes the base trait for the others to extend. It is important to define a default implementation for any trait as this allows other traits to safely invoke super.

app/cache/Cache.scala
/** A cache stores data so that future accesses can be faster. */
trait Cache[T] {
  /** 
   *  Returns the element with the specified key.
   *  If the element is not found in the cache, a callback is invoked to fetch it.
   *  
   *  @param key - the key of the element to fetch
   *  @param fetch - the callback to fetch the element if not found in the cache
   *  @return T - the element with the specified key
   */
  def getOrFetch(key: String)(fetch: () => T): T

  /**
   * Removes the element with the specified key from the cache.
   * 
   * @param key - the key of the element to remove
   */
  def clear(key: String): Unit
}

trait EmptyCache[T] extends Cache[T]{

  override def getOrFetch(key: String)(fetch: () => T): T = {
    play.api.Logger.debug("Default empty cache always invoking the callback...")
    fetch()
  }

  override def clear(key: String): Unit = {}
}

trait MapCache[T] extends EmptyCache[T] {
   ...
}

trait SingleElementCache[T] extends EmptyCache[T] {
   ...
}

trait ExpiryCache[T] extends EmptyCache[T] {
   ...
}
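
To see why the default implementation matters, here is a self-contained sketch of one more trait stacked on top of EmptyCache. CountingCache and ContactNames are hypothetical and not part of the article's module; Cache and EmptyCache are repeated without the Play logger so that the example runs on its own.

```scala
object StackableCacheDemo {
  trait Cache[T] {
    def getOrFetch(key: String)(fetch: () => T): T
    def clear(key: String): Unit
  }

  // Default implementation: never stores anything, always invokes the callback.
  trait EmptyCache[T] extends Cache[T] {
    override def getOrFetch(key: String)(fetch: () => T): T = fetch()
    override def clear(key: String): Unit = ()
  }

  // Hypothetical trait: counts the lookups and delegates to the next
  // implementation in the linearization.
  trait CountingCache[T] extends EmptyCache[T] {
    var lookups: Int = 0

    override def getOrFetch(key: String)(fetch: () => T): T = {
      lookups += 1
      // Always resolves to a concrete method, at worst the one in EmptyCache.
      super.getOrFetch(key)(fetch)
    }
  }

  class ContactNames extends CountingCache[List[String]]

  def main(args: Array[String]): Unit = {
    val names = new ContactNames
    println(names.getOrFetch("contacts")(() => List("diego")))
    println(names.lookups)
  }
}
```

If CountingCache extended the abstract Cache directly, the call to super.getOrFetch would target an abstract method and the compiler would require the method to be declared abstract override. Extending EmptyCache keeps the traits simple to write and safe to combine in any order.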

Conclusions

Traits are a really powerful feature of Scala: they promote reusability through composition while keeping the code very compact, elegant and expressive.
They are not perfect, though, as initialization can cause issues like the ones we have seen in this article. In cases where the encapsulation of a trait is broken, it is really important to document it properly.
As usual, let’s conclude this article with the link to the source code.
