Blocks – an alternative for Closures ? | brain driven development

Blocks – an alternative for Closures ?

February 10, 2008 — Mario Gleichmann

The discussion about Closures for Java is still ongoing and hasn’t lost any of its heat. The more concrete the different proposals get, the more excited their advocates become (and the more alarmed their detractors). One popular concern is the additional complexity, which may not be understandable by most developers: it’s like putting a chain saw in their hands that could turn out more dangerous than helpful, producing a pile of unpredictable, complex code.
Quite a few are calling for a simpler solution that follows Pareto’s Law: one that carries only 20% of the complexity while solving 80% of the problems that can be addressed by Closures, integrates more seamlessly into the current shape of the language and is at the same time more comprehensible to the ‘mass market’.

Well, the following solution is at least no more complex than some of the given proposals, yet by far not comparable with full-featured Closures.
It uses only instruments of the current language specification (Java 1.5) and tries to be as concise as possible (conciseness being another issue raised in connection with Closures within a statically typed language like Java).
The solution is more of an experiment in what’s possible with the given features, once again using a rather unconventional way of naming packages, classes and methods. In the end you’ll see that its mechanisms are easy to comprehend, and hence understandable by the better part of Java developers, while covering some of the well-established use cases for Closures.

Deferred execution

One of the core traits of Closures is that they define a block of code that isn’t executed immediately within the lexical scope where it is declared, but is ‘captured’ so that it can be executed at a later time. From this point of view they behave similarly to Delegates in C#.
In today’s Java, this kind of behaviour is often ‘mapped’ to anonymous classes that implement an interface required by a collaborator, which will call that block of code later on. To make things clearer, let’s imagine a collection-like type Range that accepts a block of code to which it delivers every element of the Range exactly once. To that end, Range offers a method foreach() that accepts the block. Since Range is a generic type (containing elements of an ‘arbitrary’ type), we’ll also define a generic interface that ‘captures’ the block of code. Let’s go for a first, naive attempt:

public interface Block<T>{
	public void apply( T arg );
}
...
public class Range<T extends Comparable> implements Iterable<T>{
	...
	public void foreach( Block<T> visitor ){
		for( T element : this ){
			visitor.apply( element );
		}
	}
}

We could now come up with an (e.g. anonymous) implementation of the Block interface that is applied to all elements of a given Range. Note that this solution is rather verbose, in that you have to implement apply() with its full signature every time you want to apply a different kind of logic to the elements of a Range. In the case of foreach(), it even looks more verbose than ‘visiting’ all elements directly with a for loop (remember that Range implements Iterable).

...
Range( 100, 222 ).foreach(
	new Block<Integer>(){ public void apply( Integer element ){
		if( element % 2 == 0 ) println( element + " is even" ); } } );
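
To try the interface-based variant end to end, here is a minimal, self-contained sketch. The article never shows how Range is implemented, so IntRange below is a hypothetical, integer-only stand-in with an inclusive upper bound (both are assumptions), and ForeachDemo is just a made-up harness:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Callback interface as in the article: one argument, no return value.
interface Block<T> {
    void apply(T arg);
}

// Hypothetical, integer-only stand-in for the article's generic Range.
class IntRange implements Iterable<Integer> {
    private final int from;
    private final int to; // inclusive upper bound (an assumption)

    IntRange(int from, int to) {
        this.from = from;
        this.to = to;
    }

    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int cursor = from;

            public boolean hasNext() {
                return cursor <= to;
            }

            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return cursor++;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Hands every element of the range to the block, one after another.
    void foreach(Block<Integer> visitor) {
        for (Integer element : this) {
            visitor.apply(element);
        }
    }
}

class ForeachDemo {
    // Collects the even numbers of the range through an anonymous Block.
    static List<Integer> evens(int from, int to) {
        final List<Integer> result = new ArrayList<Integer>();
        new IntRange(from, to).foreach(new Block<Integer>() {
            public void apply(Integer element) {
                if (element % 2 == 0) {
                    result.add(element);
                }
            }
        });
        return result;
    }

    public static void main(String[] args) {
        for (Integer even : evens(100, 104)) {
            System.out.println(even + " is even");
        }
    }
}
```

Even in this tiny example the anonymous class needs the interface name, the type argument and the full apply() signature before the single line of actual logic appears.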

Capturing a chunk of code by implementing a given interface (with a well-known method signature all collaborators agree on), which is then called once or many times afterwards, is a common pattern. You can see traces of this idea in the Command pattern, or as a popular idiom when implementing any kind of listener anonymously (e.g. Swing’s ActionListener).
Of course this solution has some drawbacks. First of all, it’s hard to work with variables declared in the lexical scope surrounding the anonymous implementation: you can only refer to final variables within your Block, which leads to some rather ugly constructions when the block has to change them:

...
final IntegerHolder counter = new IntegerHolder( 0 );  // holds a mutable Integer, while the Holder reference itself is final
...
Range( 100, 222 ).foreach(
	new Block<Integer>(){ public void apply( Integer element ){
		counter.set( counter.get() + element ); } } );
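
The article doesn’t show IntegerHolder, so the following is only a guess at what such a holder could look like. The sketch assumes a plain mutable wrapper and replaces Range with a simple loop helper (HolderDemo.foreach is made up for this purpose):

```java
// Hypothetical holder: the reference is final, the wrapped value is mutable.
class IntegerHolder {
    private int value;

    IntegerHolder(int initial) {
        value = initial;
    }

    int get() {
        return value;
    }

    void set(int newValue) {
        value = newValue;
    }
}

interface Block<T> {
    void apply(T arg);
}

class HolderDemo {
    // Stand-in for Range.foreach(): visits from..to, both inclusive (an assumption).
    static void foreach(int from, int to, Block<Integer> visitor) {
        for (int i = from; i <= to; i++) {
            visitor.apply(i);
        }
    }

    static int sum(int from, int to) {
        final IntegerHolder counter = new IntegerHolder(0);
        foreach(from, to, new Block<Integer>() {
            public void apply(Integer element) {
                // Reassigning 'counter' would not compile; mutating the holder does.
                counter.set(counter.get() + element);
            }
        });
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(sum(1, 4)); // 1 + 2 + 3 + 4
    }
}
```

The holder works, but it is exactly the kind of detour the text calls ugly: an extra class whose only job is to smuggle a mutable value past the final restriction.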

Secondly, the signature of such an interface grows longer and longer if you want to pass more than one argument and maybe also a return value. Let’s say we want to fold all elements of a Range into a single condensed value and deliver that folded value afterwards. In this case we need an interface that takes two arguments (first the already folded interim value, of type A1, and second the next element to fold, of type A2) and passes back a return value (the folded value, of type R). For a fold, the interim value has the same type as the final result, so A1 is bound to R at the use site:

public interface Block<R,A1,A2>{
	public R apply( A1 interimValue, A2 element );
}
...
public class Range<T extends Comparable> implements Iterable<T>{
	...
	public <R> R foldLeft( R initialValue, Block<R,R,T> folder ){

		R interimValue = initialValue;	

		for( T element : this ){
			interimValue = folder.apply( interimValue, element );
		}
		return interimValue;
	}
}

Now if we want to fold, say, a Range of Integer values, we have to provide an implementation of that Block, declaring the return type and the argument types. I think you get the point: first, with every additional type the method signature gets longer and longer, and with it the whole definition of our block, demanding more and more ‘boilerplate’ code:

Integer sum =
	Range( 100, 222 ).foldLeft( 0, new Block<Integer,Integer,Integer>(){
		public Integer apply( Integer interimValue, Integer nextElement ){
			return interimValue + nextElement;
		}
	} );
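
A runnable sketch of this fold might look as follows. It assumes a three-parameter interface Block<R, A1, A2> (the parameter names are chosen here for illustration) and folds over any Iterable instead of a Range, so FoldDemo and its static foldLeft() are stand-ins rather than the article’s actual classes:

```java
import java.util.Arrays;
import java.util.List;

// Return type first, then the two argument types.
interface Block<R, A1, A2> {
    R apply(A1 interimValue, A2 element);
}

class FoldDemo {
    // Static stand-in for Range.foldLeft(); works on any Iterable (an assumption).
    static <R, T> R foldLeft(Iterable<T> elements, R initialValue, Block<R, R, T> folder) {
        R interimValue = initialValue;
        for (T element : elements) {
            interimValue = folder.apply(interimValue, element);
        }
        return interimValue;
    }

    static Integer sum(List<Integer> values) {
        return foldLeft(values, 0, new Block<Integer, Integer, Integer>() {
            public Integer apply(Integer interimValue, Integer nextElement) {
                return interimValue + nextElement;
            }
        });
    }

    // The folded type may differ from the element type.
    static String join(List<Integer> values) {
        return foldLeft(values, "", new Block<String, String, Integer>() {
            public String apply(String interimValue, Integer nextElement) {
                return interimValue + nextElement;
            }
        });
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(sum(values));
        System.out.println(join(values));
    }
}
```

Note how every type shows up twice per block: once in the type arguments and once more in the apply() signature.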

Second, things get even more confusing if we also need a Block with, say, three generic arguments but no return value. Such an interface would have the same name and the same number of type parameters as the one above, so we’re in trouble: Java doesn’t allow two types with the same name to be declared (at least not within the same package).

How could we address those issues? First of all, we don’t want to write too much boilerplate code, but want to keep the definition and usage of such blocks reasonably short and ‘readable’. Let’s have another try. Activate your MacGyver attitude, allowing for a rather unconventional solution …

Stateful

What we need is still a ‘deferred’ block of code, but preferably without declaring every single argument in the method’s signature again and again wherever we define a new block (reducing the fluff, so to speak). Besides implementing interfaces, there’s another potential solution that allows for a method signature without arguments and still lets us ‘pass’ arguments if we need to. What’s that, you may ask? I’m pretty sure you’ve used this kind of solution in other contexts: a stateful class. Let’s see how far we can get with that …

First of all, this time we define a class Block that holds a field which acts as the argument (with public access, for the sake of simplicity in this demo):

public class Block <T>{
	public T _1= null;
}

OK, that’s nice, but how does that class serve our purpose and act like a block? Where does the block’s logic go? The idea is to define an ‘appointed’ method (again with a well-known signature, by convention) when declaring a new Block (remember: Java allows you to define or override methods in an anonymous subclass created while instantiating a class). This method gets called after the caller has passed the argument to the given Block instance:

public class Block <T>{
	public T _1= null;	

	void go(){
		throw new RuntimeException( "no block definition" );
	}

	public void call(){
		try {
			Method method = getClass().getDeclaredMethod( "go" );

			method.setAccessible( true );

			method.invoke( this );
		}
		catch (Exception e) {
			throw new RuntimeException( e );
		}
	}
}

As you can see, the mechanics are very simple: we have a well-known method named go() without any parameters. That method gets called by reflection whenever the caller executes call() (for a reason explained later, we made the method package-private, so a subclass in another package can’t simply override it because of its reduced visibility; it can only declare a go() method of its own). If we fail to ‘override’ go(), a RuntimeException is thrown.
A caller of that Block instance first has to populate the block’s argument and then executes the block by calling call():

public class Range<T extends Comparable> implements Iterable<T>{
	...
	public void foreach( Block<T> visitor ){
		for( T element : this ){
			visitor._1 = element;
			visitor.call();
		}
	}
}

Of course it would be easy to provide a method call( T arg ) as an alternative, which sets the related field on behalf of the caller, so that the caller can pass the argument directly instead of populating the public field before invoking the block’s logic.
Now for the block definition: when creating a new instance of class Block, we ‘override’ the appointed method go() and define our own logic within it. Inside the method’s body we can freely access the given argument by referring to the corresponding field:

Range( 100, 222 ).foreach(
	new Block<Integer>(){ void go(){ if( _1 % 2 == 0 ) println( _1 + " is even" ); } } );

Note that we gave up the possibility of naming the argument freely to suit the concrete problem. Instead we use a Scala-like pattern, addressing the argument by the name _1 (for the first argument; in this example the only one). We could of course come up with other conventions, e.g. naming the first argument it, as in Groovy.
Secondly, we use a rather short name, go(), for the method in which we define the block’s code. We can’t name the method do(), since do is a reserved keyword, but I think go() is a nice and short name, too.
Third, as we said, we don’t have to declare any argument in the method’s signature. The type of the argument is given ‘implicitly’ by the type parameter chosen for a concrete Block instance.
Beyond that, in order to cut the method signature down even further, we forgo any modifier on the method, which automatically leaves it with package-private access. In order to call the ‘overridden’ method (by reflection, as already mentioned), we change its accessibility on the fly.
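
Put together, the stateful variant can be tried out with a sketch like the one below. The Block class follows the listing above; everything else is an assumption made to keep the example self-contained: all classes live in one file (and therefore one package), foreach() is a static helper over any Iterable rather than a method of Range, and StatefulDemo is a made-up harness.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Block<T> {
    public T _1 = null; // the block's single argument

    void go() {
        throw new RuntimeException("no block definition");
    }

    // Looks up go() on the runtime (anonymous) class and invokes it reflectively.
    public void call() {
        try {
            Method method = getClass().getDeclaredMethod("go");
            method.setAccessible(true);
            method.invoke(this);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

class StatefulDemo {
    // Static stand-in for Range.foreach(): populate the field, then run the block.
    static <T> void foreach(Iterable<T> elements, Block<T> visitor) {
        for (T element : elements) {
            visitor._1 = element;
            visitor.call();
        }
    }

    static List<Integer> evens(List<Integer> values) {
        final List<Integer> result = new ArrayList<Integer>();
        foreach(values, new Block<Integer>() {
            void go() { if (_1 % 2 == 0) result.add(_1); }
        });
        return result;
    }

    // A block that never defines go() fails as soon as it is called.
    static boolean failsWithoutGo() {
        try {
            new Block<Integer>() { }.call();
            return false;
        } catch (RuntimeException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(evens(Arrays.asList(1, 2, 3, 4)));
        System.out.println(failsWithoutGo());
    }
}
```

One thing the sketch makes visible: a missing go() is only detected at run time, because getDeclaredMethod() inspects the anonymous class itself and finds nothing there. The compiler cannot help, which is part of the price for the shorter declaration.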

Unconventional combinations

That was the first step. What about multiple arguments? Since we use Generics to define the argument types, and a class can’t be overloaded on its number of type parameters, we can’t define another class Block<T1,T2> within the same package, so we have to arrange the different classes in separate packages. To cope with this rather unattractive constraint, we’ll again use a rather unusual style for package and class names, in order to get rid of block imports and, at the same time, gain a more readable style (killing two birds with one stone).

package block1;

import block.Block;

public class on <T> extends Block{
	public T _1= null;
}
...
package block2;

public class on <T1,T2> extends block1.on <T1> {
	public T2 _2 = null;
}
...

As you can see, there’s a separate class for every number of block arguments, each placed in a separate package. In practice there shouldn’t be blocks with more than, say, four or five arguments, so we are talking about a maximum of five classes in five separate packages. Note also that the classes are named in a rather unusual style; you’ll see in a minute that this reads rather fluently when looking at a block declaration.
Now, in order to define a block for folding a Range, we could come up with this style of block definition:

Range( 100, 222 ).foldLeft( 0, new block2.on<Integer,Integer>(){ void go(){ /* _1 + _2 ... but how to return ??? */ } } );

How to return ?

We’ve decided to use Generics solely for defining the argument types (so there is no conflict between a block with two arguments and a block with one argument and a return value). But what about returning values? How can we pass such a value back? Again we’ll use our ‘stateful’ class and define another field for a potential return value, along with a method yield() for conveniently populating that return value within the block’s code:

package block;

import java.lang.reflect.Method;

public class Block {

	public Object retVal = null;

	public void yield( Object retVal ){
		this.retVal = retVal;
	}
	...
}

Again we could imagine a call() method to which the caller passes the current arguments and which hands a potential return value back to the caller, so that the caller remains completely unaware of the fields that hold the arguments and the return value during execution.
Admittedly, the solution is a bit weak here, since it allows an object of arbitrary type to be passed back. Of course we could also include the return type in the type parameter list, but that would destroy the fluent reading of a block’s declaration. With the given solution, a caller of such a block has to cast the return value to the appropriate, expected type (let’s call it a kind of ‘duck casting’):

...
public T foldLeft( T initial, block2.on<T,T> folder ){

	T folded = initial;

	for( T elem : iterable ){
		folder._1 = folded;
		folder._2 = elem;
		folder.call();

		folded = (T) folder.retVal;

		// or alternatively, and shorter:
		// folded = (T) folder.call( folded, elem );
	}
	return folded;
}

Given this solution, we now can complete the above example and yield a return value, passing that value to the reserved field within the class:

Integer sum = Range( 100, 222 ).foldLeft( 0, new block2.on<Integer,Integer>(){ void go(){ yield( _1 + _2 ); } } );
Integer product = Range( 10, 22 ).foldLeft( 1, new block2.on<Integer,Integer>(){ void go(){ yield( _1 * _2 ); } } );
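
For experimenting with the complete mechanism, here is a single-file sketch under a few loudly stated assumptions. The packages block, block1 and block2 are collapsed into one file, so block1.on and block2.on appear as the hypothetical classes On1 and On2. foldLeft() is a static helper over any Iterable instead of a Range method. The method yield() is renamed to yieldValue(), because recent Java versions (14 and later) reject an unqualified call to a method named yield. The call( arg1, arg2 ) overload is the convenience variant the text only hints at.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

class Block {
    public Object retVal = null;

    // Called yield() in the article; renamed here for newer compilers.
    public void yieldValue(Object retVal) {
        this.retVal = retVal;
    }

    void go() {
        throw new RuntimeException("no block definition");
    }

    public void call() {
        try {
            Method method = getClass().getDeclaredMethod("go");
            method.setAccessible(true);
            method.invoke(this);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}

// Stand-in for block1.on<T>.
class On1<T> extends Block {
    public T _1 = null;
}

// Stand-in for block2.on<T1,T2>.
class On2<T1, T2> extends On1<T1> {
    public T2 _2 = null;

    // Assumed convenience overload: set both arguments, run, hand back the result.
    public Object call(T1 arg1, T2 arg2) {
        _1 = arg1;
        _2 = arg2;
        call();
        return retVal;
    }
}

class YieldDemo {
    @SuppressWarnings("unchecked")
    static <T> T foldLeft(Iterable<T> elements, T initial, On2<T, T> folder) {
        T folded = initial;
        for (T elem : elements) {
            folder._1 = folded;
            folder._2 = elem;
            folder.call();
            folded = (T) folder.retVal; // the 'duck cast' described in the text
        }
        return folded;
    }

    static Integer sum(List<Integer> values) {
        return foldLeft(values, 0, new On2<Integer, Integer>() {
            void go() { yieldValue(_1 + _2); }
        });
    }

    static Integer product(List<Integer> values) {
        return foldLeft(values, 1, new On2<Integer, Integer>() {
            void go() { yieldValue(_1 * _2); }
        });
    }

    static Object addViaCall(Integer a, Integer b) {
        return new On2<Integer, Integer>() {
            void go() { yieldValue(_1 + _2); }
        }.call(a, b);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(sum(values));
        System.out.println(product(values));
        System.out.println(addViaCall(2, 3));
    }
}
```

The unchecked cast in foldLeft() is where the type safety leaks: nothing stops a block from yielding a String into a fold over Integers, and the mistake would only surface as a ClassCastException where the result is used.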

Conclusion

We came up with a (first, simple) solution for deferred block logic that’s built solely on features of the given language specification (1.5). Instead of implementing a given interface on the fly, we use a stateful class and ‘override’ an appointed method, requiring as little boilerplate code as possible.
Again we’ve chosen some rather unconventional ways so that the declaration of such a block reads fairly naturally. Admittedly, it might look a bit odd at first, but one gets used to it. I have to repeat myself: if you trust the underlying mechanisms and know how things work, you get used to it. In our case, the underlying mechanisms are very easy to understand. Every one of you could easily come up with a similar solution!

You have to provide the underlying ‘infrastructure’ (essentially the Block classes) only once. After that you can rely on the underlying mechanisms; you only have to declare your block logic as needed and provide some methods that accept those blocks. That’s it, nothing more.

You may have seen that there’s nothing magic about this solution. As said at the beginning, it is not even comparable with the full force of real closures, and of course we have to maneuver within the given language features, which makes some of the curly braces necessary. But you get some of the advantages of closure-like constructs in a mostly type-safe manner for free, without extending the language with new concepts; maybe that is quite satisfactory for 80% of the problems, and therefore for the mass market …

Posted in java. 7 Comments »

Andrew Says:
February 11, 2008 at 9:45 am

So far I’ve got only one question:

If we can do something like this in Java already, why do we need closures? Just to make it look better?

Nick Brown Says:
February 11, 2008 at 3:25 pm

Correct me if I am missing something, but this seems to solve only one of the problems of anonymous classes (and not a particularly big one; I haven’t seen too many people complaining that anonymous methods require variables to be made final, as it’s probably the easiest problem to get around). It does nothing to reduce the clutter that defining an anonymous class requires (compare your definition with the BGGA equivalent, {int a, int b => a*b}) and does nothing to allow control structures such as resource management blocks.

Yes, we get that closures technically can be done with the current Java spec. But it requires so much work to make it happen and is so ugly and unmaintainable that it is rarely used, which is going to be problematic as new structures such as the Java 7 fork join get added.

And I’m not too happy with the practice of defining an entirely new package for every conceivable number of argument parameters (exactly how high are you planning to go?); that’s what arrays are for (if you don’t like the syntax of defining an array, use the Java 5 var args to do it). And the practice of naming a class with a lower case letter isn’t much better.

Nick Brown Says:
February 11, 2008 at 5:16 pm

One more thing, your complaint that parameters have to be declared as final isn’t really valid either, as in functional programming you typically want everything to be stateless. Introducing variables that have a state is just asking for a host of problems for anything more complex than a counter (and even that would be prone to CMEs).

Mario Gleichmann Says:
February 11, 2008 at 6:16 pm

Andrew,

thanks for your feedback. As pointed out in the post, the presented experiment is by far not comparable with the full power of Closures.
One thing, only hinted at in the post, is the ability to access the lexical scope in which the Closure was defined, even if that scope doesn’t ‘exist’ any longer when the Closure gets executed (as mentioned, only possible with final variables in Java).
Another essential feature is the possibility to curry a given Closure (you may want to search for that idea, since it would take a little longer to explain).
That said, the demonstrated ideas are by far not comparable with full-featured Lambda calculus, nor with Closures as such.

Greetings

Mario

Mario Gleichmann Says:
February 11, 2008 at 6:57 pm

Nick,

thank you very much for your critical comments!

>> … anonymous methods require variables to be made final as its probably the easiest problem to get around.

First of all, the post is mainly about an experiment about how ‘concise’ and ‘readable’ you can get with the given language features. It’s a fact, that you only can refer to final variables, since they end up on the Heap – nothing more, nothing less (if it seems that i was mainly complaining about this fact, it’s rather my fault not to make things clear enough).

>> … does nothing to reduce the clutter that defining an anonymous class requires (compare your definition with the BGGA equivalent, {int a, int b => a*b}

Yes, as said in the post – it’s at least an experiment to become as concise as possible with the given features. It’s of course much more bloated compared to BGGA, but in my opinion more concise (i’m not saying by a huge amount) than implementing an Interface while repeating the whole signature of a method all over again.

Of course (again, mentioned twice in the post), it’s by far not as powerful as Closures (e.g. the post doesn’t mention issues with stateful classes and concurrency, escaping from Closures via exceptions, … ); it could, however, serve for a good part of the problems that people are now trying to tackle with Closures.

>> And I’m not too happy with the practice of defining an entirely new package for every conceivable number of argument parameters (exactly how high are you planning to go?)

As explained in the post – for providing Block classes for a maximum number of say five arguments (seriously, did you use more than three or four arguments in practice?) – exactly five classes. Nothing more, since they are Generic, so you can use every combination of types with the given classes!

>> thats what arrays are for (if you don’t like the syntax of defining an array, use the Java 5 var args to do it)

I don’t get your point about this part. If you want to forgo a type-safe way of using arguments of potentially different types (since you can’t declare an array of different types, you have to use the most abstract type, Object), go on. But then the resulting block definition is again bloated by some casts. Nothing against that, but personally I prefer a type-safe way, with the advantage of directly calling the given methods on that type within the block’s definition.

Real State Agent Says:
March 11, 2008 at 1:54 pm

I did the same as suggested here for implementing a presentation framework, so that you could pass Runnables as parameters to methods.

This Runnables would be executed when the paint, load, etc. events were triggered.

The result was that none of the shortcomings that you mention were real problems.

The only annoying problem was having to write the whole Runnable. It would be easier if you only had to write something like f.onPaint( #{ object.doSomething(); } ).

The other problems I find in Java and that C++ already had solved were:

1. Destructors. In C++ you declare the destructor and you forget about it. Not so in Java, you have to use finally everywhere. It is not pretty.

2. The lack of operators. Syntactic sugar, yes. But necessary sugar.
