[Issue 003] DSLs with Java

Summary: Designing and implementing a DSL with Java can be a tedious and arduous experience given the limitations of the language. However, there are a couple of techniques and tricks that can be applied to obtain better results.

Hello again, it's been a quiet period since the last issue due to the summer break. The 7th edition of JCrete took place once more in mid-July at the Orthodox Academy of Crete (OAC for short) in Kolimbari, Crete. Like every other unconference, JCrete offers every attendee the opportunity to share their passion and knowledge; however, JCrete is quite unique, as you can experience the famous Cretan hospitality and, well, the setting (the beaches!) is beyond belief. This year I decided to take a different approach to the event and recorded a handful of interviews with attendees as well as two of the three founders: Dr. Heinz Kabutz and Kirk Pepperdine. You can watch all the interviews if you follow this link.

Over 90 different sessions were held during the regular time; one in particular caught my attention and is the topic for this issue. The conversation started by sharing the motivation for the meeting, then we jumped into the proper mechanics of writing a DSL. As is likely to happen when touching the subject, the distinction between internal and external DSLs arose pretty early. An internal DSL is bound to the host platform/language; in other words, it's very close to the host and follows most of its rules. Think, for example, of the Bean Scripting Framework (BSF), which provides scripting capabilities on top of Java. On the other hand, an external DSL is not bound to the host language and can be a completely different type of language; users of this kind of DSL might not even be aware of the host at all. Given that external DSLs can be implemented with any existing platform/language, we limited the options to just the JVM and internal DSLs.

You may be thinking: how is it possible to write an internal DSL targeting the JVM if the Java language is so verbose and also limited when compared to alternative JVM languages? Languages such as Scala, Kotlin, and Groovy (my personal favorite) were mentioned, but the group decided to press on in order to find out how to write a DSL with just Java. And so we arrived at the following:

Given that we're bound to Java semantics, the DSL expressions must follow the receiver.message() or message(receiver) pattern. Wait a second, the first pattern is easy to understand, after all that's just calling a method on an instance, so how can we realize the second pattern? The answer is through static methods (and static imports); this is a trick used by Mockito and Awaitility to great effect. For example, the following code snippet shows a simple stub for a service that's meant to be invoked by a controller component:

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;
import static org.mockito.Mockito.only;
import static org.mockito.Mockito.verify;

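// create a stub and tell it what to return for the given input
HelloService service = mock(HelloService.class);
when(service.sayHello(input)).thenReturn(output);
Controller controller = new Controller(service);
controller.doTheStuff(input);
// check that sayHello(input) was the only interaction with the stub
verify(service, only()).sayHello(input);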
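
The same message(receiver) shape can be reproduced without any library. What follows is only a minimal sketch to show the mechanics: Greeting, greet(), with(), and render() are made-up names for illustration and are not part of Mockito or Awaitility. A static method starts the expression and every following call returns the receiver so the chain can continue.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mini-DSL, invented for this example.
final class Greeting {
    private final String target;
    private final List<String> words = new ArrayList<>();

    private Greeting(String target) { this.target = target; }

    // message(receiver): a static method is the entry point of the expression.
    static Greeting greet(String target) { return new Greeting(target); }

    // receiver.message(): each step returns the receiver to keep the chain going.
    Greeting with(String word) { words.add(word); return this; }

    String render() { return String.join(" ", words) + ", " + target + "!"; }
}

final class GreetingDemo {
    public static void main(String[] args) {
        // If Greeting lived in a named package, a static import of greet
        // would let this read as greet("world").with("Hello")...
        String text = Greeting.greet("world").with("Hello").with("again").render();
        System.out.println(text); // prints "Hello again, world!"
    }
}
```

The private constructor is a deliberate choice in this sketch: it forces users through the static entry point, so every expression starts the same way.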

In the Mockito snippet, the methods mock(), when(), and verify() are used as starting points for the DSL expression, with further instructions appended to the result of the previous invocation. This approach is akin (but not identical) to using higher-order functions in other languages; that is, the expression is the composition of values/receivers (the service and the input value) and functions (when and verify, for example) with other functions (only). Appending function calls works as long as the next value in the chain can access the desired function, which is often not the case when dealing with types in a hierarchy. Take the following types as an example:

public class Parent {
    public Parent doParentStuff() {
        // ...
        return this;
    }
}

public class Child extends Parent {
    public Child doChildStuff()  {
        // ...
        return this;
    }
}

Child child = new Child();
child.doParentStuff().doChildStuff(); // oops! doParentStuff() returns Parent

The compiler complains that doChildStuff() is not available on the Parent type, and rightly so: it's only available on the Child type. Java does support covariant return types, so we could override doParentStuff() on the Child type and have it return Child instead of Parent, but then every subtype would have to override every chainable method it inherits, which quickly becomes tedious. One could instead update the code of Parent to return Child, but that wouldn't work in situations where there are other types in the hierarchy. A better solution would be for the Parent/Child hierarchy to use the concept of self types, which unfortunately is not available in the language but can be faked using generics in the following way:

public class Parent<SELF extends Parent<SELF>> {
    // the cast is safe as long as every subclass passes itself as SELF
    @SuppressWarnings("unchecked")
    protected final SELF self() { return (SELF) this; }

    public SELF doParentStuff() {
        // ...
        return self();
    }
}

public class Child extends Parent<Child> {
    public Child doChildStuff() {
        // ...
        return self();
    }
}

Child child = new Child();
child.doParentStuff().doChildStuff(); // yay!

This trick is used by AssertJ/Truth (discussed in issue 002) in order to provide a fluent interface design to their assertion hierarchies.

The next approaches require Java 8, as they rely on features added to the language in that release, namely default methods, lambda expressions, and method references. Default methods let you define implementation details on an interface. While this feature does not directly affect the syntax of a target DSL, it enables more code reuse, as you can push common implementation details up the chain to an interface instead of a base class, thus giving you the freedom to extend any other class if needed. The syntax of lambda expressions may look alien in some DSLs, but they are very powerful, as they can be used by the final users of the DSL to define inline functions; the alternative would be for them to define static methods, and that would force them to have more knowledge of the Java language itself. Finally, method references can be seen as shortcuts for lambda expressions. Yes, the syntax of receiver::function may also look alien in some cases; however, there may be times when their usage leads to better results.
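
To make the three Java 8 features a bit more concrete, here is a small sketch that combines them in one fluent expression. All names (Describable, Rule, must(), RuleDemo) are hypothetical and were not part of the session; only java.util.function.Predicate comes from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical types, invented for this example.
interface Describable {
    String name();

    // Default method: shared behavior lives on the interface,
    // so implementors remain free to extend any class they like.
    default String describe() { return "rule '" + name() + "'"; }
}

final class Rule<T> implements Describable {
    private final String name;
    private final List<Predicate<T>> checks = new ArrayList<>();

    private Rule(String name) { this.name = name; }

    static <T> Rule<T> rule(String name) { return new Rule<>(name); }

    // Accepting a functional interface lets DSL users pass inline logic.
    Rule<T> must(Predicate<T> check) { checks.add(check); return this; }

    // Checks run in insertion order and stop at the first failure.
    boolean test(T value) { return checks.stream().allMatch(c -> c.test(value)); }

    @Override public String name() { return name; }
}

final class RuleDemo {
    static boolean startsUpperCase(String s) { return Character.isUpperCase(s.charAt(0)); }

    public static void main(String[] args) {
        Rule<String> rule = Rule.<String>rule("username")
            .must(s -> !s.isEmpty())           // lambda expression
            .must(RuleDemo::startsUpperCase);  // method reference
        System.out.println(rule.describe() + " -> " + rule.test("Andres"));
        // prints "rule 'username' -> true"
    }
}
```

Note the explicit type witness in Rule.<String>rule(...): when a generic static method starts a chain, the compiler cannot infer T from the calls that follow, which is one of the small frictions you pay for writing a DSL in plain Java.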

One last point to round up the options discussed was the usage of the builder pattern. Oftentimes the domain is expressed in terms of immutable objects, which means you need a way to construct those immutable instances; thus a mutable version of the target type is used: the builder. Implementing a builder is pretty straightforward but the code can be very verbose; you may use IDE macros as an alternative, or Project Lombok's @Builder annotation, to get the same effect.
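
For reference, a hand-written builder could look like the sketch below. Server, host, and port are a hypothetical domain chosen only for illustration, and the default values are assumptions of this example. The code is short here because there are only two fields; the verbosity mentioned above shows up as the number of fields grows.

```java
// Hypothetical immutable domain type, invented for this example.
final class Server {
    private final String host;
    private final int port;

    // Only the builder can create instances.
    private Server(Builder b) {
        this.host = b.host;
        this.port = b.port;
    }

    String host() { return host; }
    int port() { return port; }

    static Builder builder() { return new Builder(); }

    // The builder is the mutable counterpart of the immutable Server.
    static final class Builder {
        private String host = "localhost"; // assumed default
        private int port = 8080;           // assumed default

        Builder host(String host) { this.host = host; return this; }
        Builder port(int port) { this.port = port; return this; }

        Server build() { return new Server(this); }
    }
}
```

Usage reads as Server.builder().host("example.org").port(9090).build(), which fits the receiver.message() pattern discussed earlier.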

Thank you for reading. Any feedback is appreciated.

See you next time.

Andres

Andres Almiray

Building a better world one commit at a time
