Functional Scala: Essential list functions

April 10, 2011 — Mario Gleichmann

In the last two episodes we came up with the basic idea of persistent (read immutable) list-like data structures and how to construct and deconstruct (read pattern match) them using a sugarized notational form. We also talked big about their non-destructive nature when it comes to e.g. updating a given list or removing some elements. In this episode, we’ll finally inspect some of those commonly used list functions. While we’re going to reimplement the most essential ones, you’ll hopefully get a better feeling for the basic principles and characteristics which build the foundation for operating on persistent data structures in a non-destructive way (preserving the structure of given list values).

Retrieving an element

For a smooth entry into the world of functions on persistent list structures, let’s first take a look at some non-destructive functions. First we might wanna know if a certain value is an element of a given list. Again we’re going to cheat a bit, since we’ll make an implicit assumption that the elements of a list can be tested for equality using == (we’ll adress that issue in more detail when looking at type classes):

def element[A]( x :A, lst :Lst[A] ) : Boolean = lst match {
    case EmptyLst  => false
    case a +: _  if( a == x )  => true
    case _ +: as  => element( x, as )
}

That wasn’t too difficult. For getting an even better feeling (and might see some recurring patterns), let’s take another look at a read-only function: it’s surely comfortable to have a function for retrieving a certain element within a given list. Typically, a list structure preserves the order in which their elements got inserted. Let’s say the element at the head of the list sits on position zero. The next element (which is the head of the lists tail) is at position one and so on and so on. So our function clearly needs two arguments: the position of the element we wanna retrieve and the list itself. Again, we want to preserve the type of the element which is going to be returned, so we’ll use parametric polymorphism. We also need to handle the absence of a reasonable return value (that is requesting a position which is out of scope for a given list), so we’ll again returning a value of type Option, parameterized with the type of the lists content. Just watch the following definition for a first feeling:

def elemAt[A]( pos :Int, lst :Lst[A] ) :Option[A] = ( lst, pos ) match {
    case( EmptyLst , _ )  => None
    case( a +: _ , 0 )  => Some( a )
    case( _ +: as , i )  => elemAt( i-1, as )
}

So what’s going on here? We’ll just make use of the recursive structure which is given by the value constructors of our list type. If we’re requesting an arbitrary position on an empty List (line 2), we can only return None, since we have no element to give. Otherwise, we’re trying to step through the list until we’re reaching the requested position: If we request position zero on a given list (line 3), then we’re immediately done, since we only need to return the head of the given list. For any other position (line 4) we can’t deliver the lists head but need to step into the tail of the given list (which’s also a list) recursively, while also reducing the position which is requested within the rest of the list.

Inserting an element

Again, we just saw a very natural way for operating on recursive, algebraic datatypes. Since pattern matching can be seen as the counterpart of list construction, we used it for reasoning and react upon the structure of a given list instance: we just splitted the problem into three smaller problem cases which we can handle independently. Secondly we just reflected the recursive nature of our datatype by defining a recursive function which just makes use of the fact that the tail of a list is also a list which can be treated in the same way.

Now that we saw a very natural way for operating on our list type, let’s see if we’re able to make use of it when trying to operate on a list which typically destroys the structure of a given list instance in an imperative environment. As we’ve said, a persistent data structure represents an immutable value. So every operation – like inserting another element at an arbitrary position into a given list – just results into a new version (read a new value) of a list.

So how could we achieve it? Maybe we could take a look at our three cases we used for retrieving an element within a given list. In doing so, we might find an answer for all of those three smaller problems:

Inserting an element into the empty list, at an arbitrary position
In this case, we might simply return a new list which just consists of the new element , prepended to the empty list. We just don’t care about the position, since we already arrived at the end of the list.
Inserting an element into a given list at position zero
In this case, we also just prepend the new element to the given list. The result is a new list with the new element as head and the given list as the tail. Of course, the old list instance isn’t destroyed since we only constructed a new list version which refers to the old list version as its tail.
Inserting an element into a given list at any other position
In this case, we can’t simply prepend the new element as the head of a new list. Like we already did, we also need to step through the rest of the list, until we reached the right position. And here comes the exciting part: while we’re stepping though the list, we just reassemble all the list elements we’ve already passed so far, creating a new list version. We’ll doing this in a recursice way, by just prepending every element we’re passing to the list which results from the recursive call of inserting the element into the rest of the list (which should result into a list, were the new element is already inserted successfully).

Now we achieved a master plan for inserting elements into a given list, just resulting into a new list which also holds a new element at the requested position. Implementing that function should be easy now:

def insertAt[A,B>:A](  pos :Int, b :B, lst :Lst[A] ) :Lst[B] = ( lst, pos ) match {
    case ( EmptyLst , _ )  =>	b +: EmptyLst
    case( as , 0 )  => b +: as
    case( a +: as , i )  => a +: insertAt( i-1, b, as )
}

Remember the covariant nature of our list type? Again, we reflected that fact just by allowing to insert an arbitrary element for a possible supertype of the given type of the lists content. The function then results into a new list which holds for the most specific type for the new element and the given elements. Most important, we didnt destroyed the old list version. We only constructed a new list instance by reassembling every single element on our way to the position for the new element to insert. When we reached that position, we simply result into a list which refers to that new element as the head and refering the rest of the given list as the tail.

You might see the whole process splitted into two phases: reconstruction of a new list by reassembling all elements until we reached the position for the new element and just refering (and thus sharing with the original list version) to the rest of the list after we reached the position for the new element. So for inserting a new element near the start of a list you might need to reassemble only a few list elements (sharing the most part of the list with the former version), while inserting a new element at the end of a list results in reassembling all given elements of the original list value. So in this worst case the cost for inserting a new element (the way we did above) raise linear with the number of the lists elements.

Replacing an element

We shouldn’t have any new poblems for that kind of quest. It’s quite similar to the idea of inserting a new element. The only difference is to just skip the given element at the requested position when reassembling the new list version. Observe:

def replaceAt[A,B>:A]( pos :Int, b:B, lst :Lst[A] ) :Lst[B] = ( lst, pos ) match {
    case ( EmptyLst , _ )  =>	b +: EmptyLst
    case ( a +: as, 0 ) => b +: as
    case ( a +: as, i ) => a +: replaceAt( i-1, b, as )
}

See the pattern at line 3. There we now deconstructed the list into its head and tail, just to leave out the head when it comes to constructing the new list on the right side of the case expression!

Removing an element

Removing an element should be very similar to replacing an element. What? Yep, it’s like replacing without any new element (to replace with) at all. Just watch:

def removeAt[A]( pos :Int, lst :Lst[A] ) :Lst[A] = ( lst, pos ) match {
   case ( EmptyLst , _ )  => EmptyLst
   case ( a +: as, 0 )  => as
   case ( a +: as, i )  => a +: removeAt( i-1, as )
}

Another way would be to give an arbitrary value which is intented to be removed from the list if it’s an element of it. Again, we’ll rely on == for identifying that value within a given list instance. The idea is again very similar to element removal on a certain position. In this case, we just regard every element while stepping through the list:

def remove[A]( x :A, lst :Lst[A] ) :Lst[A] = lst match{
    case EmptyList  => EmptyLst
    case a +: as if( a == x )  => remove( x, as )
    case a +: as  => a +: remove( x, as )
}

Summary

In this episode, we saw how to come up with a bulk of some of the most common, essential list functions, operating on lists in a non-destructive way. There, we detected some common patterns.

First, we used pattern matching for reasoning and reacting upon the structure of a given list instance, allowing us to split the problem into some smaller problem cases which we can handle independently.

Second, our functions reflected the recursive nature of our datatype. At least one case within our function was defined in a recursice way, which just makes use of the fact that the tail of a list is also a list which can be treated by the function in the very same way.

And finally, we saw the non-destructive processing of a given list version, mostly reflected by reassembling a new list structure by the given elements of the original version until we reached a situation (or case) were we could handle the problem immediately, followed by simply referring to the rest of the original list, which can be seen as structural sharing this part of the list between the old and the new list version. In either way, we always come up with a new list value which is kind of derived from the old list. All of the functions we’ve looked so far are leaving the original version untouched, which is so to say the essence of persistent data structures.

Of course there are many more list functions which we haven’t considered yet. What about concatenating two list values for example? Not only for that, we’re going to explore some very powerful functions, which will constitute the basis for a bunch of other functions on lists (like concatenation). That will be the topic for the next episode. So stay tuned …

Posted in general, Scala. 6 Comments »

Functional Scala: Essential list functions Says:
April 10, 2011 at 2:44 pm

[…] (read pattern match) them using a sugarized notational form. We also talked big about their… [full post] Mario Gleichmann brain driven development generalscala 0 0 0 […]

Matt Pryor Says:
April 12, 2011 at 1:42 pm

Nice article – you’re whole Functional Scala series has been a great introduction for me. I think you might have a typo on line 3 in your last example though – if I understand correctly it should be

case a +: as if( a == x ) => as

rather than

case a +: as if( a == x ) => remove( as )

Thanks

Matt

Mario Gleichmann Says:
April 12, 2011 at 6:41 pm
Matt,

thanks for your kind words and feedback!

If you don’t call remove on the rest (tail) of the list than you’ll only remove the first occurrence of the value. So if the value’s contained within the list more than once and you want to remove all occurrences, you need to call remove again on the lists tail (in other words, you don’t stop removing when hitting the first occurrence but step through the whole list).

Nevertheless, there’s another typo, since i’ve forgotten to apply the value (which is going to be removed) to the recursive function call (remove( x, as )). Will fix …

Thanks & Greetings

Mario

links for 2011-04-15 « Dan Creswell’s Linkblog Says:
April 15, 2011 at 12:03 pm

[…] Functional Scala: Essential list functions (tags: scala collections) […]

Functional Scala: Quiz with Lists – common list functions, handcraftet « brain driven development Says:
April 17, 2011 at 8:24 pm

[…] promised within the last episode, we’re going to take another detailed look at some more commonly used list functions. As we […]

Functional Scala – Mario Gleichmann « Another Word For It Says:
April 19, 2011 at 2:40 pm

[…] Functional Scala: Essential list functions […]

brain driven development

Pages

Categories

Feed