Welcome to another episode of Functional Scala!

This episode is the first part of at least two episodes we’re gonna take a deeper look at what’s called Algebraic datatypes and how you could build some of them on your own in Scala. An algebraic datatype reflects the idea, that a type is – loosely spoken – a set of certain (data) values. So in it’s very simplest form, an algebraic datatype can be defined by giving an enumeration of all the individual data values for a given type (of course you always also need to give the type an appropriate name).

Let’s start right away by defining an own enumerated type. We need a type which consists of a finite set of values (otherwise it would be hard to enumerate them all, right?). Well, we all know that Scala already provides a type *Boolean*. For demonstration sake, let’s redefine that type as an algebraic datatype and call’em *Bool*. First of all, the single values of an algebraic datatype are only data – no behaviour at all – maybe that’s why they called *data*types …

In order to give you an impression how easily they can be defined in a pureley functional Language, let’s take a look at how *Bool *is defined in Haskell as an algebraic datatype and then define Bool in Scala afterwards:

data Bool = False | True

Wow, that was pretty short. What you saw was the introduction of a new datatype by using Haskell’s keyword *data *followed by the types name *Bool*. After the equals sign starts the *enumeration *of all values. In fact they are called *value constructors* because they’ll construct the individual values of the given type. *Bool *is a nice example of a so called *sum type*, because it can be characterized as the sum of all individual and unique values, which we just enumerated.

In Scala, you can define an equal algebraic datatype by using case classes. The fact that we’re using case classes will be more deeply motivated within one of the following episodes (right after we’ve grasped algebraic datatypes), where we’ll going to *destruct *values of a given datatype, using pattern matching (in fact, the idea of pattern matching is deeply connected to the idea of algebraic datatypes). For now, let’s see how we can define and use (or construct) algebraic datatypes in Scala. it goes something like this:

sealed abstract class Bool case object True extends Bool case object False extends Bool

Because Scala is a hybrid OO – functional Language, it doesn’t support algebraic datatypes in a direct way (like in Haskell). Instead we’re using some constructs out of the OO world – first and foremost the concept of classes and objects: our new type *Bool *itself is introduced by the abstract class *Bool*. You may wonder why it’s defined abstract. That’s because we don’t want to see an instance of that base class at runtime. Class *Bool *is only for introducing the type, not its values. All single values of that type are then defined by a single case object for each of them. Why objects, not classes? This time, it’s the other way round: they are representing the single values, not a certain type – and as long as there’s only one instance for every and each value within a type, they’re defined as singleton objects.

Wow, our first algebraic datatype in Scala! Now lay back and relax … unless you’re asking yourself how you’re able to work with them, now that *Bool*‘s in the world. Well, working with them means using some functions whose outcome is based (calculated) on *Bool*. For a first simple use case, let’s consider some boolean functions, based on our new type *Bool*:

val not : Bool => Bool = ( b : Bool ) => if( b eq True ) False else True val and : ( Bool, Bool ) => Bool = ( a :Bool, b: Bool ) => if( a eq False ) False else b val or : ( Bool, Bool ) => Bool = ( a :Bool, b: Bool ) => if( a eq True ) True else b ... println( and( or( True, not( True ) ), not( False ) ) )

Two points here: Since our values are defined as singleton objects, we’re able to compare those values by using the case objects inherited method *eq* (we’re living in a OO – functional world, i told you so!). Secondly, it really looks a little paradox to compare two values of type *Bool *using *eq *only to receive a value of type *Boolean *which is then used to decide the* if expression*. Trust me – as we’ll see, we really only need pattern matching to get rid of *eq *and that *if expression* (and hence of *Boolean*). It will turn out, that with pattern matching on hand, we really could write our own polymorphic *if* function (that we’re gonna do when we’ve looked at lazy evaluation).

Furthermore, you may have asked yourself, why we’ve sealed our case class (ok, it’s the third point): Consider the fact we didn’t. So everybody could expand our datatype and add another value, say *Maybe*. Now what about our neatly defined functions? Since they only know and care about the values *True *and *False*, their calculation is now kind of incomplete (similar to the visitor pattern in OO world, which you might use best with a stable *type structure *and some volatile (mostly unpredictable increasing) functions on that type structure). So sealing the case class ensures that those single values of that certain type might not be extended in a chaotic, random way, compromising all functions, based on that type!

Using* if expressions* in order to compare or match single values of a certain type quickly become really cumbersome, especially if there are more than two values. Watch the following algebraic datatype *Season*, which consists of four different data values:

sealed abstract class Season case object Winter extends Season case object Spring extends Season case object Summer extends Season case object Fall extends Season

Ok, that was also kind of quick. Nothing new here. Just a new type as the base case class and four case objects, representing the different values. Now let’s think of a function, which returns the next Season for a given one:

val next : Season => Season = ( s :Season ) => if( s eq Winter ) Spring else if( s eq Spring ) Summer else if( s eq Summer ) Fall else Winter

Writing that function wasn’t that quick anymore. Asume we need to write some more functions based on *Season*. Matching those single values may be inevitable, but in some other cases it may be sufficient to exploit the fact, that the values of an enumerated type are finite. Uhm, ehm, c’mon guy – what do you mean? How can we take advantage of that fact? Well, it turns out, that you can *count *all values of a finite set, right? And so, you could give each value an individual number. And so you could present a function which delivers the individual number for each value. And you could also present a function which delivers a distinct value for a given number (this function might be only a partial one, since we couldn’t provide a value for *any *number). Observe:

val toInt : Season => Int = ( s :Season ) => if( s eq Winter ) 0 else if( s eq Spring ) 1 else if( s eq Summer ) 2 else 3 val fromInt : Int => Season = ( i :Int ) => if( i == 0 ) Winter else if( i == 1 ) Spring else if( i == 2 ) Summer else Fall

Admitted, the given functions are not corner proof, but they will serve our purpose. But beside of that, what gives? Well, if you can give a definition for converting the values of a given type into an individual number and vice versa, you’re able to build your logic upon those numbers, not the values itself. And this may come very handy in some situations (the ‘trick’ is to find the *right *mapping between the values and numbers). Let’s take our function *next *and rewrite it using the corresponding numbers:

val next : Season => Season = ( s :Season ) => fromInt( ( toInt( s ) + 1 ) % 4 )

Aaaah, now it’s clear. We utilized the fact, that the corresponding numbers are in the right order. So we could produce a function which’s far more concise. Judge for yourself if it’s still that readable …

## Summary

This was our first part to the introduction of algebraic datatypes and how they can be implemented within Scala by leveraging case classes (and objects). We’ve seen a special (most simple) form of algebraic datatypes – enumerated types. In this case, all values of that type could be enumerated one by one.

We’ve also seen, that writing functions which are based on algebraic datatypes seemed kind of cumbersome, since we tried to match a certain value (or a combination of some given values) by using *eq *and *if expressions*. Don’t worry – we will encounter a powerful technique called pattern matching which will allow us to *deconstruct *(whatever that means by now) any given value of a certain algebraic datatype, helping us to write more condensed *and *readable functions!

Of course there are situations, were you can’t give an enumeration for all possible values, especially when the underlyining type leads to a potential set of infinite values. Though, these kind of types can be described as algebraic datatypes as well. This will be the topic for the next episode. Hope to see you soon …

January 30, 2011 at 6:41 pm

[…] called Algebraic datatypes and how you could build some of them on your own in Scala. An… [full post] Mario Gleichmann brain driven development generalscala 0 0 0 […]

January 31, 2011 at 5:11 am

Thanks for sharing some valuable info, keep them coming!!..

January 31, 2011 at 9:51 am

Now for a followup, showing how all the functions can be defined by:

sealed abstract class Boolean {

def fold[X](ifTrue: => X, ifFalse: => X): X

}

January 31, 2011 at 1:09 pm

Jason,

that’s a nice idea! I think i will go with that as soon as i’ve introduced catamorphisms in general.

Greetings

Mario

January 31, 2011 at 12:02 pm

Maybe you have that planned for the next installment, anyway, but a function like nextSeason could be defined neatly with a catamorphism for Season.

January 31, 2011 at 1:13 pm

Stefan,

no i haven’t planned that – but again – what a nice idea!

Unfortunately, i need to introduce the idea of catamorphisms first, where i will show in more detail ‘why functional programming matters’ … 😉

Greetings

Mario

February 1, 2011 at 5:52 am

[…] Functional Scala: Algebraic Datatypes – Enumerated Types « brain … […]

February 5, 2011 at 7:11 pm

[…] one is the continuation of the last episode, where we introduced algebraic datatypes. Within this installment we’ll broaden the scope and […]

February 8, 2011 at 8:35 pm

[…] datatypes! We’ve already seen some simple sum types and a pure product type in the last two episodes. However, the vast majority of algebraic datatypes constitute a combination of a sum and […]

February 13, 2011 at 4:35 pm

[…] the last episodes, we discovered how to define so called algebraic datatypes. So far, we’ve only […]

February 19, 2011 at 1:23 pm

Great post. I was checking constantly this blog and I’m quite impressed! Very helpful info specifically the first part. I care for such information a lot. I was seeking this certain information for a long time. Thank you and best of luck.

April 19, 2011 at 2:38 pm

[…] Functional Scala: Algebraic Datatypes – Enumerated Types […]

April 23, 2011 at 12:08 pm

[…] Functional Scala: Algebraic Datatypes – Enumerated Types « brain driven development (tags: scala functional algebraic datatype) […]

December 23, 2015 at 10:28 am

In the example of the Bool ADT, why do you use a abstract class and not a trait ? Is there a reason for that ?

Thanks for the article and obviously for the answer 😉

August 21, 2016 at 2:06 pm

Your blogs cleared lot of details for me.

I have the same question like Herve. Why use an abstract class instead of a trait to define the Season type? Is there any guideline of when to use which?