Kotlin DSL: from Theory to Practice

Blog 78 months ago

Ivan Osipov

I’ll try to explain the language syntax as simply as possible; however, the article is still aimed at developers who consider Kotlin as a language for building custom DSLs. At the end of the article I’ll mention Kotlin drawbacks worth taking into account. The code snippets presented are relevant for Kotlin version 1.2.0 and are available on GitHub.

What is DSL?

All programming languages can be divided into general-purpose and domain-specific languages. SQL, regular expressions and build.gradle are often cited as examples of DSLs. These languages are limited in functionality, but they address a particular problem effectively. They let us write code that is not imperative (we don't explain how to solve the problem) but more or less declarative (we just declare the task), and obtain the solution from the given data.

Let’s say you have a standard process definition which may eventually be changed and enhanced, but generally you want to use it with different data and result formats. By creating a DSL you create a flexible tool for solving various problems within one subject domain, and the end user of your DSL does not need to care how the solution is obtained. So, you create a sort of API which, once mastered, can simplify your life and make it easier to keep the system up to date in the long term.

This article deals with building an “embedded” DSL in Kotlin, that is, a language implemented on top of the syntax of a general-purpose language. You can read more about it here.

Implementation area

To my mind, one of the best ways to use and demonstrate Kotlin DSL is testing.

Suppose that you’ve come from the Java world. How often have you been faced with declaring entity instances of an extensive data model? You’ve likely been using some builders or, even worse, special utility classes to fill in the default values under the hood. How many overridden methods have you had? How often do you have to make small changes to the default values, and how much effort does this require today? If these questions stir up nothing but negative feelings, this article is for you.

That’s the way we had been working for a long time in our education project: we used builders and utility classes to cover with tests one of our most important modules, school timetable scheduling. Now this approach has given way to the Kotlin language and a DSL which is used to describe test scenarios and check the results. Below you can see how we took advantage of Kotlin so that testing the scheduling subsystem is not torture anymore.

In this article we will dive into the details of constructing a DSL which helps to test an algorithm that builds teachers’ and students’ schedules.

Key tools

Here are the basic language features that allow you to write cleaner code in Kotlin and create your own DSL. The table below demonstrates the main syntax enhancements that are worth using. Take a careful look at it. If most of these tools are unfamiliar to you, you had better read the whole article. If you don’t know one or two of them, feel free to fast forward to the corresponding sections. In case there’s nothing new for you, just skip to the review of DSL drawbacks at the end of the article. You are also welcome to propose more tools in the comments.

Tool	DSL syntax	General syntax
`Operator overloading`	`collection += element`	`collection.add(element)`
`Type aliases`	`typealias Point = Pair<Int, Int>`	Creating empty inheritor classes and other duct tape
`get/set methods convention`	`map["key"] = "value"`	`map.put("key", "value")`
`Destructuring declaration`	`val (x, y) = Point(0, 0)`	`val p = Point(0, 0); val x = p.first; val y = p.second`
`Lambda out of parentheses`	`list.forEach { ... }`	`list.forEach({...})`
`Extension functions`	`mylist.first(); // there isn’t first() method in mylist collection`	Utility functions
`Infix functions`	`1 to "one"`	`1.to("one")`
`Lambda with receiver`	`Person().apply { name = "John" }`	N/A
`Context control`	`@DslMarker`	N/A

Found anything new? If so, let’s move on.

I omitted delegated properties intentionally, as, in my opinion, they are useless for building a DSL, at least in our case. Using the features above we can write cleaner code and get rid of voluminous “noisy” syntax, making development even more pleasant (Could it be?).

I liked a comparison I came across in the book “Kotlin in Action”: in natural languages, such as English, sentences are built of words, and grammar rules define the way these words are combined. Similarly, in a DSL one operation can be constructed of several method calls, and the type check guarantees that the construction makes sense. Admittedly, the order of the calls may not always be obvious, but that is entirely up to the DSL designer.

It is important to stress that this article examines an “embedded DSL”, so it is based on a general-purpose language, which is Kotlin.

Final result example

Before we begin to design our own domain-specific language, I’d like to show you an example of what you’ll be able to create after reading this article. The complete code is available in the GitHub repository via the link.
The DSL-based code below is designed to test the allocation of a teacher for students for the defined disciplines. In this example we have a fixed timetable, and we check if classes are placed in both teacher’s and students’ schedules.

schedule {
    data {
        startFrom("08:00")
        subjects("Russian",
                "Literature",
                "Algebra",
                "Geometry")
        student {
            name = "Ivanov"
            subjectIndexes(0, 2)
        }
        student {
            name = "Petrov"
            subjectIndexes(1, 3)
        }
        teacher {
            subjectIndexes(0, 1)
            availability {
                monday("08:00")
                wednesday("09:00", "16:00")
            }
        }
        teacher {
            subjectIndexes(2, 3)
            availability {
                thursday("08:00") + sameDay("11:00") + sameDay("14:00")
            }
        }
        // data { } won't be compiled here because there is scope control with
        // @DataContextMarker
    } assertions {
        for ((day, lesson, student, teacher) in scheduledEvents) {
            val teacherSchedule: Schedule = teacher.schedule
            teacherSchedule[day, lesson] shouldNotEqual null
            teacherSchedule[day, lesson]!!.student shouldEqual student
            val studentSchedule = student.schedule
            studentSchedule[day, lesson] shouldNotEqual null
            studentSchedule[day, lesson]!!.teacher shouldEqual teacher
        }
    }
}

Toolkit

All the features for building a DSL have been listed above. Each of them is used in the example from the previous section. You can examine how such DSL constructs are defined in my project on GitHub.
We will refer to this example again below in order to demonstrate the usage of different tools. Please bear in mind that the described approaches are for illustrative purposes only, and there may be other options to achieve the desired result.
So, let’s discover these tools one by one. Some language features are most powerful when combined with the others, and the first in this list is the lambda out of parentheses.

Lambda out of parentheses

Documentation

A lambda expression is a code block that can be passed into a function, saved or called. In Kotlin a lambda type is written as (list of parameter types) -> return type. Following this rule, the most primitive lambda type is () -> Unit, where Unit is the equivalent of Void with one important exception: at the end of the lambda we don’t have to write a return… statement. So there is always a return type, but in Kotlin returning Unit is done implicitly.

Below is a basic example of assigning lambda to a variable:

val helloPrint: (String) -> Unit = { println(it) }

The compiler usually tries to infer types from what it already knows. In our case the declared type tells it that the lambda takes one String parameter. This lambda can be invoked as follows:

helloPrint("Hello")

In the example above the lambda takes one parameter. Inside the lambda this parameter is called it by default, but if there were more, you would have to specify their names explicitly or use the underscore to ignore them. See such case below:

val helloPrint: (String, Int) -> Unit = { _, _ -> println("Do nothing") }
helloPrint("Does not matter", 42) //output: Do nothing

The base tool, which you may already know from Groovy, is the lambda out of parentheses. Look again at the example from the very beginning of the article: almost every use of curly brackets, except the standard constructions, is a lambda. There are at least two ways of making a construction that looks like x { … }:

the object x and its unary operator invoke (we’ll discuss it later);
the function x that takes a lambda.

In both cases we use a lambda. Let’s suppose there is a function x(). In Kotlin, if a lambda is the last argument of a function, it can be placed outside the parentheses; furthermore, if the lambda is the function’s only argument, the parentheses can be omitted altogether. As a result, the construction x({…}) can be transformed into x() {}, and then, by omitting the parentheses, we get x {}. This is how we declare such functions:

fun x( lambda: () -> Unit ) { lambda() }

In concise form a single-line function can be also written like this:

fun x( lambda: () -> Unit ) = lambda()
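To see the three call forms side by side, here is a minimal sketch. The helper names (repeatBlock, callForms) are made up for illustration and are not part of the article's project.

```kotlin
// Hypothetical helper: runs the block `times` times and returns the number of runs.
fun repeatBlock(times: Int, block: () -> Unit): Int {
    var runs = 0
    repeat(times) {
        block()
        runs++
    }
    return runs
}

// The function from the text: its only parameter is a lambda.
fun x(lambda: () -> Unit) = lambda()

// Collects a log entry from each call form to show that all of them run the lambda.
fun callForms(): List<String> {
    val log = mutableListOf<String>()
    x({ log.add("inside parentheses") })     // the lambda is an ordinary argument
    x() { log.add("after parentheses") }     // the trailing lambda is moved out
    x { log.add("no parentheses") }          // the empty parentheses are dropped
    repeatBlock(2) { log.add("repeated") }   // other arguments stay inside the parentheses
    return log
}
```

When a function has other parameters besides the lambda, as repeatBlock does, only the lambda moves out; the parentheses with the remaining arguments stay.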

But what if x is a class instance, or an object, instead of a function? Below is another interesting solution based on a fundamental concept for domain-specific languages: operator overloading.

Operator overloading

Documentation

Kotlin provides a wide but limited variety of operators. The operator modifier lets us define functions that, by convention, are called under certain conditions. As an obvious example, the plus function is executed if you use the “+” operator between two objects. The complete list of operators can be found in the docs via the link above.
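As a small illustration of the convention, the sketch below overloads “+” and “+=”. The Minutes and Lessons classes are hypothetical and exist only for this example.

```kotlin
// Hypothetical value class, not part of the article's project.
data class Minutes(val value: Int) {
    // Called when `a + b` is written for two Minutes values.
    operator fun plus(other: Minutes): Minutes = Minutes(value + other.value)
}

// Hypothetical container: `plusAssign` backs the `+=` operator.
class Lessons {
    val durations = mutableListOf<Minutes>()

    operator fun plusAssign(duration: Minutes) {
        durations.add(duration)
    }
}

fun totalDuration(): Minutes {
    val lessons = Lessons()
    lessons += Minutes(45)               // resolved as lessons.plusAssign(Minutes(45))
    lessons += Minutes(45) + Minutes(5)  // plus is evaluated first, then plusAssign
    return lessons.durations.reduce { acc, next -> acc + next }
}
```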
Let’s consider a less trivial operator, invoke. This article’s main example starts with the schedule { } construct that defines the code block responsible for testing the schedule. This construct is built in a slightly different way from the one mentioned above: we use the invoke operator together with the “lambda out of parentheses”. Having defined the invoke operator, we can now use the schedule(...) construct, although schedule is an object. In fact, when you call schedule(...), the compiler interprets it as schedule.invoke(…). Let’s see how schedule is declared:

object schedule {
    operator fun invoke(init: SchedulingContext.() -> Unit)  { 
        SchedulingContext().init()
    }
}

The schedule identifier refers us to the only schedule class instance (singleton) that is marked by the special keyword object (you can find more information about such objects here). Thus, we call the invoke method of the schedule instance, receiving lambda as a single parameter and placing it outside of the parentheses. As a result, the schedule {… } construction matches the following:

schedule.invoke( { code inside lambda } )

However, if you look at the invoke method carefully, you’ll see not a common lambda but a “lambda with a handler” or “lambda with context”, whose type is defined as

SchedulingContext.() -> Unit

Let’s examine it in details.

Lambda with a handler

Documentation

Kotlin enables us to set a context for lambda expressions (context and handler mean the same here; the Kotlin documentation calls it a receiver). The context is just an object. The context type is defined together with the lambda expression type. Such a lambda behaves like a non-static method of the context class, but it only has access to the public methods of this class.
While the type of a normal lambda is written as () -> Unit, the type of a lambda with context X is written as X.() -> Unit, and, while normal lambdas can be called in the usual way:

val x : () -> Unit = {}
x()

lambda with context requires a context:

class MyContext

val x : MyContext.() -> Unit = {}

//x() //won’t be compiled, because a context isn’t defined 

val c = MyContext() //create the context

c.x() //works

x(c) //works as well

Let me remind you that we have defined the invoke operator in the schedule object (see the preceding section), which allows us to use the construct:

schedule { }

The lambda we are using has the context of SchedulingContext type. This class has a data method in it. As a result, we get the following construct:

schedule {
    data {
        //...
    }
}

As you have probably guessed, the data method also takes a lambda with context, however it is a different context. Thus, we get nested structures having several contexts inside simultaneously. To get the idea of how it works, let’s remove all syntactic sugar from the example:

schedule.invoke({
    this.data({
    })
})

As you can see, it’s all fairly simple. Let’s take a look at the invoke operator implementation.

operator fun invoke(init: SchedulingContext.() -> Unit)  { 
    SchedulingContext().init()
}

We call the constructor of the context, SchedulingContext(), and then, on the created object (the context), we call the lambda that was passed in as the init parameter. This closely resembles an ordinary method call. As a result, in the single line SchedulingContext().init() we create the context and call the lambda passed to the operator. For more examples, consider the apply and with functions from the Kotlin standard library.
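Putting the invoke operator and the lambda with context together gives a small runnable sketch. The context classes here are simplified stand-ins: their fields (subjectNames, dataContext) are assumptions, and, unlike the article's version, invoke returns the context so that the result can be inspected.

```kotlin
// Simplified stand-in for the article's DataContext; the field is an assumption.
class DataContext {
    val subjectNames = mutableListOf<String>()

    fun subjects(vararg names: String) {
        subjectNames.addAll(names)
    }
}

// Simplified stand-in for the article's SchedulingContext.
class SchedulingContext {
    val dataContext = DataContext()

    // Takes a lambda whose context is DataContext and runs it on our instance.
    fun data(init: DataContext.() -> Unit) {
        dataContext.init()
    }
}

object schedule {
    // Returns the context (an assumption of this sketch) so callers can inspect it.
    operator fun invoke(init: SchedulingContext.() -> Unit): SchedulingContext {
        val context = SchedulingContext()
        context.init()
        return context
    }
}

fun buildSchedule(): SchedulingContext =
    schedule {
        data {
            subjects("Algebra", "Geometry")
        }
    }

// The standard library's `apply` follows the same pattern.
fun applyExample(): String =
    StringBuilder().apply {
        append("Kotlin")
        append(" DSL")
    }.toString()
```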

In the last examples we discovered the invoke operator and its combination with other tools. Next, we will focus on the tool that is formally an operator and makes the code cleaner - the get/set methods convention.

get/set methods convention

Documentation

When creating a DSL we can implement a way to access maps by one or more keys. Let’s look at the example below:

availabilityTable[DayOfWeek.MONDAY, 0] = true
println(availabilityTable[DayOfWeek.MONDAY, 0]) //output: true

In order to use square brackets, we need to implement get or set methods (depending on what we need, read or update) with an operator modifier. You can find an example of such implementation in the Matrix class on GitHub. It is a simple wrapper for matrix operations. Below you see a code snippet on the subject:

class Matrix<T>(...) {
    // Rows are mutable so that set() can update individual cells.
    private val content: List<MutableList<T>>
    operator fun get(i: Int, j: Int) = content[i][j]
    operator fun set(i: Int, j: Int, value: T) { content[i][j] = value }
}

You can use any get and set parameter types, the only limit is your imagination. You are free to use one or more parameters for get/set functions to provide a convenient syntax for data access. Operators in Kotlin provide lots of interesting features that are described in the documentation.
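For a version that can be run as is, here is a minimal sketch of such a wrapper. The constructor parameters and the DayOfWeek overloads are assumptions made for this example; the real Matrix class on GitHub may be organized differently.

```kotlin
import java.time.DayOfWeek

// A minimal matrix wrapper; the constructor signature is an assumption.
class Matrix<T>(height: Int, width: Int, init: (Int, Int) -> T) {
    private val content: List<MutableList<T>> =
        List(height) { i -> MutableList(width) { j -> init(i, j) } }

    operator fun get(i: Int, j: Int): T = content[i][j]
    operator fun set(i: Int, j: Int, value: T) { content[i][j] = value }
}

// Hypothetical overloads so that a day of the week can be used as the row index.
operator fun <T> Matrix<T>.get(day: DayOfWeek, lesson: Int): T = this[day.ordinal, lesson]

operator fun <T> Matrix<T>.set(day: DayOfWeek, lesson: Int, value: T) {
    this[day.ordinal, lesson] = value
}

fun availabilityDemo(): Boolean {
    val availabilityTable = Matrix(7, 8) { _, _ -> false }
    availabilityTable[DayOfWeek.MONDAY, 0] = true
    return availabilityTable[DayOfWeek.MONDAY, 0]
}
```

The overloads with a DayOfWeek parameter show that get and set may take parameters of any type, not only integer indexes.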

Surprisingly, there is a Pair class in the Kotlin standard library. A large part of the developer community finds Pair harmful: when you use Pair, the logic linking the two objects is lost, so it is not clear why they are paired. The two tools I’ll show you next demonstrate how to keep a pair meaningful without creating additional classes.

Type aliases

Documentation

Suppose we need a wrapper class for a geo point with integer coordinates. We could use the Pair<Int, Int> class, but with such a variable we can quickly lose track of why these values are paired.
A straightforward solution is either to create a custom class or something even worse. Kotlin enriches the developer’s toolkit with type aliases, which use the following notation:

typealias Point = Pair<Int, Int>

In fact, it is nothing but renaming a construct. Thanks to this approach we don’t need to create the Point class anymore, as it would only duplicate Pair. Now we can create a point in this way:

val point = Point(0, 0)

However, the Pair class has two attributes, first and second, that we need to rename somehow to blur any differences between the needed Point and the initial Pair class. We cannot rename the attributes themselves (although you can create extension properties), but there is one more notable feature in our toolkit called destructuring declaration.
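The extension properties mentioned in passing can be sketched as follows. The property names x and y and the helper functions are assumptions for this example.

```kotlin
typealias Point = Pair<Int, Int>

// Extension properties give the pair's components domain names.
// Note: they become available on every Pair<Int, Int>, because the alias is not a new type.
val Point.x: Int get() = first
val Point.y: Int get() = second

fun describe(point: Point): String = "(${point.x}, ${point.y})"

fun aliasDemo(): String {
    val point = Point(3, 4)                // the alias can be used as a constructor
    val samePoint: Pair<Int, Int> = point  // and is interchangeable with Pair<Int, Int>
    return describe(samePoint)
}
```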

Destructuring declaration

Documentation

Let’s consider a simple case: suppose we have an object of the Point type which is, as we already know, just a renamed Pair<Int, Int>. If we look at the Pair class implementation in the standard library, we’ll see that it has the data modifier, which directs the compiler to generate componentN methods for this class. Let’s learn more about it.

For any class, we can define componentN operators that will be in charge of providing access to one of the object attributes. That means that calling point.component1() is equivalent to calling point.first. Why do we need such a duplication?

Destructuring declaration is a means of “decomposing” an object to variables. This functionality allows us to write constructions of the following kind:

val (x, y) = Point(0, 0)

We can declare several variables at once, but what values will they be assigned? That’s why we need the generated componentN methods: with N replaced by an index starting from 1, they let us decompose an object into a set of its attributes. So, the above construct is equivalent to the following:

val pair = Point(0, 0)
val x = pair.component1()
val y = pair.component2()

which, in turn, is equal to:

val pair = Point(0, 0)
val x = pair.first
val y = pair.second

where first and second are the Point object attributes.

The for loop in Kotlin looks as follows, where x takes the values 1, 2, and 3:

for(x in listOf(1, 2, 3)) { … }

Pay attention to the assertions block in the DSL from the main example. I’ll repeat a part of it for convenience:

for ((day, lesson, student, teacher) in scheduledEvents) { … }

This line should now be self-explanatory. We iterate through the collection of scheduledEvents, each element of which is decomposed into 4 attributes.
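For a class without the data modifier, the componentN operators can be written by hand. The sketch below uses a hypothetical ScheduledEvent class with plain String and Int fields; the article's real event type may look different.

```kotlin
// Hypothetical event type with hand-written componentN operators.
class ScheduledEvent(
    private val day: String,
    private val lesson: Int,
    private val student: String,
    private val teacher: String
) {
    operator fun component1() = day
    operator fun component2() = lesson
    operator fun component3() = student
    operator fun component4() = teacher
}

fun describeEvents(events: List<ScheduledEvent>): List<String> {
    val lines = mutableListOf<String>()
    // Each element is decomposed through component1()..component4().
    for ((day, lesson, student, teacher) in events) {
        lines.add("$day #$lesson: $student with $teacher")
    }
    return lines
}
```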

Extension functions

Documentation

Adding new methods to objects from third-party libraries or to the Java Collection Framework is what a lot of developers have been dreaming about. Now we have such opportunity. This is how we declare extension functions:

fun AvailabilityTable.monday(from: String, to: String? = null)

Compared to a standard method, we add the class name as a prefix to define the class we extend. In the example above AvailabilityTable is an alias for the Matrix<Boolean> type and, as aliases in Kotlin are nothing but renaming, this declaration is equal to the one below, which is not always convenient:

fun Matrix<Boolean>.monday(from: String, to: String? = null)

Unfortunately, there's nothing we can do here, except not using the tool, or adding methods only to a specific context class. In this case, the magic only appears where you need it. Moreover, you can use such functions even for extending interfaces. As a good example, the first method extends any iterable object:

fun <T> Iterable<T>.first(): T

In essence, any collection based on the Iterable interface, regardless of the element type, gets the first method. It is worth mentioning that we can place an extension method in the context class and thereby have access to the extension method only in this very context (similarly to a lambda with a context). Furthermore, we can create extension functions for nullable types (the explanation of nullable types is out of scope here, for more details see this link). For example, that’s how we can use the function isNullOrEmpty from the standard Kotlin library that extends the CharSequence type:

val s: String? = null
s.isNullOrEmpty() //true

Below is this function’s signature:

fun CharSequence?.isNullOrEmpty(): Boolean

When working with such Kotlin extension functions from Java, they are accessible as static functions.
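The three flavours described in this section (an extension of an interface, an extension with a nullable receiver, and an extension declared inside a context class) can be sketched together. All names below are hypothetical and chosen only for illustration.

```kotlin
// Extension on an interface: available for every Iterable, whatever the element type.
fun <T> Iterable<T>.secondOrNull(): T? = drop(1).firstOrNull()

// Extension with a nullable receiver: it can be called on null safely.
fun String?.orPlaceholder(): String =
    if (this == null || this.isEmpty()) "<none>" else this

class AvailabilityContext {
    val slots = mutableListOf<String>()

    // A member extension: visible only where an AvailabilityContext is in scope.
    fun String.asSlot() {
        slots.add(this)
    }
}

fun extensionDemo(): List<String> {
    val context = AvailabilityContext()
    with(context) {
        "08:00".asSlot()
    }
    // "09:00".asSlot() // would not compile here: no AvailabilityContext in scope
    return context.slots
}
```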

Infix functions

Documentation

One more way to sugar-coat our syntax is to use infix functions. Simply said, this tool helps us to get rid of excessive code in simple cases. The assertions block from the main snippet demonstrates this tool’s use case:

teacherSchedule[day, lesson] shouldNotEqual null

This construction is equivalent to the following:

teacherSchedule[day, lesson].shouldNotEqual(null)

In some cases brackets and dots can be redundant. For such cases we can use the infix modifier for functions.
In the code above, the construct teacherSchedule[day, lesson] returns a schedule element, and the function shouldNotEqual checks this element is not null.
To declare an infix function, you need to:

use the infix modifier;
use only one parameter.

Combining the last two tools we can get the code below:

infix fun <T : Any?> T.shouldNotEqual(expected: T)

Note that the default upper bound of a generic type parameter in Kotlin is Any?, so T accepts null even without an explicit bound. Writing T : Any? explicitly just makes it obvious that null is a valid argument here.
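A possible implementation of such assertions is sketched below. The function bodies and the error messages are assumptions; the article's project may define them differently.

```kotlin
// Hypothetical implementation: throws if the values differ.
infix fun <T : Any?> T.shouldEqual(expected: T) {
    if (this != expected) throw AssertionError("Expected <$expected> but was <$this>")
}

// Hypothetical implementation: throws if the values are equal.
infix fun <T : Any?> T.shouldNotEqual(unexpected: T) {
    if (this == unexpected) throw AssertionError("Did not expect <$unexpected>")
}

fun infixDemo(): Boolean {
    val lesson: String? = "Algebra"
    lesson shouldNotEqual null   // same as lesson.shouldNotEqual(null)
    lesson shouldEqual "Algebra" // same as lesson.shouldEqual("Algebra")
    return try {
        lesson shouldEqual "Geometry" // the values differ, so this throws
        false
    } catch (e: AssertionError) {
        true
    }
}
```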

Context control

Documentation

When we use a lot of nested contexts, on the lower level we risk getting a wild mix. Due to lack of control the following meaningless construct becomes possible:

schedule { //context SchedulingContext
    data { //context DataContext + external context SchedulingContext
        data { } //possible, as there is no context control
    }
}

Even before Kotlin 1.1 there was a way to avoid the mess. It consists of creating a custom data method in the nested context DataContext and marking it with the Deprecated annotation at the ERROR level.

class DataContext {
    @Deprecated(level = DeprecationLevel.ERROR, message = "Incorrect context")
    fun data(init: DataContext.() -> Unit) {}
}

This approach eliminates the possibility of building an incorrect DSL. Nevertheless, the large number of methods in SchedulingContext would have meant a lot of routine work, discouraging us from any context control.
Kotlin 1.1 offers a new control tool: the @DslMarker annotation. It is applied to your own annotations which, in turn, are used for marking your contexts. Let’s create an annotation and mark it with the new tool from our toolkit:

@DslMarker
annotation class MyCustomDslMarker

Now we need to mark up the contexts. In the main example these are SchedulingContext and DataContext. Once we annotate both classes with the common DSL marker, the following happens:

@MyCustomDslMarker
class SchedulingContext { ... }

@MyCustomDslMarker
class DataContext { ... }

fun demo() {
    schedule { //context SchedulingContext
        data { //context DataContext + external context SchedulingContext is forbidden
            // data { } //will not compile, as contexts are annotated with the same DSL marker
        }
    }
}
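The marker only blocks implicit access to the outer context; an explicit this@label reference is still allowed. Below is a self-contained sketch of that behaviour. It uses a plain schedule function instead of the article's object, and the annotation name, the title property and the student(name) method are assumptions.

```kotlin
@DslMarker
annotation class ScheduleDsl

@ScheduleDsl
class DataContext {
    val students = mutableListOf<String>()

    fun student(name: String) {
        students.add(name)
    }
}

@ScheduleDsl
class SchedulingContext {
    var title = "untitled"
    val dataContext = DataContext()

    fun data(init: DataContext.() -> Unit) {
        dataContext.init()
    }
}

fun schedule(init: SchedulingContext.() -> Unit): SchedulingContext =
    SchedulingContext().apply(init)

fun markerDemo(): SchedulingContext =
    schedule {
        data {
            student("Ivanov")
            // title = "Week 1"            // does not compile: implicit outer context is blocked
            this@schedule.title = "Week 1" // an explicit reference to the outer context is allowed
        }
    }
```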

With all the benefits of this cool approach saving so much time and effort, one problem still remains. Take a look at the main example, or, more precisely, at this part of the code:

schedule {
    data {
        student {
            name = "Petrov"
        }
        ...
    }
}

In this case, on the third nesting level we get the new context Student which is, in fact, an entity class, so we are expected to annotate part of the data model with @MyCustomDslMarker, which is incorrect, in my opinion. In the Student context the data {} calls are still forbidden, as the external DataContext is still in its place, but the following constructions remain valid:

schedule {
    data {
        student {
            student { }
        }
    }
}

Attempts to solve the problem with annotations will lead to mixing business logic and testing code, and that is certainly not the best idea. Three solutions are possible here:

Using an extra context for creating a student, for example, StudentContext. This smells like madness and outweighs the benefits of @DslMarker.
Creating interfaces for all entities, for example, IStudent (no matter the name), then creating stub contexts that implement these interfaces, and finally delegating the implementation to the student objects. That verges on madness, too.

@MyCustomDslMarker
class StudentContext(val owner: Student = Student()): IStudent by owner

Using the @Deprecated annotation, as in the examples above. In this case it looks like the best solution: we just add a deprecated extension method for all Identifiable objects.
```
@Deprecated("Incorrect context", level = DeprecationLevel.ERROR)
fun Identifiable.student(init: () -> Unit) {}
```

To sum it up, combining various tools empowers you to build a very convenient DSL for your real-world purposes.

Cons of DSL use

Let’s try to be more objective concerning the use of DSL in Kotlin and find out the drawbacks of using DSL in your project.

Reuse of DSL parts

Imagine you have to reuse a part of your DSL. You want to take a piece of your code and make it easy to replicate. Of course, in the simplest cases with a single context we can hide the repeatable part of the DSL in an extension function, but this will not work in most cases.

Perhaps you could point me towards better options in the comments, because for now only two solutions come to my mind: adding “named callbacks” as a part of the DSL or spawning lambdas. The second one is easier but can result in a living hell when you try to understand the sequence of calls. The problem is that the more imperative behaviour we have, the fewer benefits remain from the DSL approach.
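The simplest case, a single context with the repeatable part hidden in an extension function, could look like the sketch below. The Student and DataContext classes are simplified stand-ins, and defaultStudents is a hypothetical name.

```kotlin
// Simplified stand-ins for the article's classes.
class Student(var name: String = "")

class DataContext {
    val students = mutableListOf<Student>()

    fun student(init: Student.() -> Unit) {
        students.add(Student().apply(init))
    }
}

// Hypothetical reusable fragment: an extension function on the context.
fun DataContext.defaultStudents() {
    student { name = "Ivanov" }
    student { name = "Petrov" }
}

fun data(init: DataContext.() -> Unit): DataContext = DataContext().apply(init)

fun reuseDemo(): List<String> =
    data {
        defaultStudents()            // the reused part of the DSL
        student { name = "Sidorov" } // scenario-specific data
    }.students.map { it.name }
```

This works because the fragment needs only one context; once a fragment has to reach several nested contexts at the same time, a single extension receiver is no longer enough.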

This, it!?

Nothing’s easier than losing the meaning of the current “this” and “it” while working with your DSL. If you use “it” as a default parameter name where it can be replaced by a meaningful name, you’d better do so. It’s better to have a bit of obvious code than non-obvious bugs in it.

The notion of context can confuse anyone who has never faced it. Now that you have “lambdas with a handler” in your toolkit, unexpected methods inside a DSL are less likely to appear. Just remember that in the worst case you can assign the context to a variable, for example, val mainContext = this
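Both pieces of advice, naming lambda parameters and keeping the context in a variable, can be sketched as follows. The classes are simplified stand-ins, and the clash between the two subjects properties is constructed for illustration.

```kotlin
// Simplified stand-ins; both classes deliberately have a `subjects` property.
class Teacher(var name: String = "") {
    val subjects = mutableListOf<String>()
}

class DataContext {
    val teachers = mutableListOf<Teacher>()
    val subjects = listOf("Algebra", "Geometry")

    fun teacher(init: Teacher.() -> Unit) {
        teachers.add(Teacher().apply(init))
    }
}

fun data(init: DataContext.() -> Unit): DataContext = DataContext().apply(init)

fun contextDemo(): List<String> =
    data {
        val dataContext = this // keep the outer context under a readable name
        teacher {
            name = "Smirnov"
            // A bare `subjects` resolves to Teacher.subjects, the nearest context,
            // so the outer list is reached through the saved variable.
            dataContext.subjects.forEach { subject -> subjects.add(subject) }
        }
    }.teachers.single().subjects
```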

Nesting

This issue relates closely to the first drawback in this list. Constructions nested within nested constructions shift all your meaningful code to the right. Up to a certain limit this shift may be acceptable, but when the code is shifted too far, it would be reasonable to use lambdas. Of course, this will not decrease your DSL’s readability but can be a compromise in case your DSL implies not only compact structures but also some logic. When you create tests with a DSL (the case covered by this article), this issue is not acute, as the data is described with compact structures.

Where are the docs, Lebowski?

When you first try to cope with somebody’s DSL, you will almost certainly wonder where the documentation is. On this point I believe that if your DSL is to be used by others, usage examples will be the best docs. Documentation itself is important as an additional reference, but it is not very friendly to a reader. A domain-specific practitioner will normally start with the question “What do I call to get the result?”, so in my experience, examples of similar cases speak for themselves better.

Conclusion

We’ve got an overview of the tools that enable you to design your own custom domain-specific language with ease. I hope you now see how it works. Feel free to suggest more tools in the comments.
It is important to remember that a DSL is not a panacea. Of course, when you get such a powerful hammer, everything looks like a nail, but it isn’t. Start small, create a DSL for tests, learn from your mistakes, and then, with experience, consider other usage areas.