Accelerating software engineering through the adoption of Domain-Specific Languages
In the fourth part of our series exploring the motivations, architecture and principles driving the development of IPF, Tom Beadman, Principal Engineer at Icon Solutions, discusses the benefits of using Domain-Specific Languages (DSLs), Language Engineering and Code Generation.
There is an incredible amount of untapped potential in adopting even a basic Language Engineering capability into a developer’s toolset: it provides additional context for framing certain problems whilst designing software.
Language Engineering, however, is often seen as a niche, academic topic. This is understandable: it has an initial learning curve, requires investment, and consequently maintains a smaller, albeit dedicated, community.
So to fully explore and understand the benefits, it is important to be clear on what we mean by a DSL.
Put simply, a DSL is a language that has been designed for a specific use.
Its scope is focused on a specific context, and it provides a direct relationship between language concepts and the user’s intentions.
There are several common ways of authoring a DSL, and these can be categorised into ‘internal’ or ‘external’ types.
Internal DSLs are built within the existing capabilities of a General Programming Language (GPL). Think of Kotlin, Groovy, or Ruby: those languages allow developers to create DSL meta-models around existing classes and constructs to improve the expressiveness of the code. This is a convenient tool for the developer but is often just syntactic sugar on top of the existing capabilities of the GPL host language.
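To make the "syntactic sugar" point concrete, here is a minimal sketch of an internal DSL in plain Java, using a hypothetical fluent builder. The names (`FlowBuilder`, `flow`, `step`) are invented for illustration and are not part of any real library:

```java
import java.util.ArrayList;
import java.util.List;

// A hypothetical internal DSL for describing a processing flow.
// It reads almost like a domain description, but it is ordinary
// Java method chaining on top of the host language.
public class FlowBuilder {
    private final List<String> steps = new ArrayList<>();

    public static FlowBuilder flow() {
        return new FlowBuilder();
    }

    public FlowBuilder step(String name) {
        steps.add(name);
        return this; // returning 'this' is what enables the DSL-like chaining
    }

    public List<String> build() {
        return List.copyOf(steps);
    }

    public static void main(String[] args) {
        List<String> payment = flow()
                .step("validate")
                .step("fraudCheck")
                .step("submitToScheme")
                .build();
        System.out.println(payment);
    }
}
```

The expressiveness comes entirely from the host language's existing features, which is precisely why an internal DSL is convenient but limited.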
For this reason, the focus of this article is on the benefits of building a dedicated external DSL for a specific area of interest using dedicated tooling such as JetBrains’ Meta Programming System (MPS). Building an external DSL is the process of reflecting upon a business domain, distilling its core concepts, and crafting a rich, meaningful, self-contained language.
An external DSL has a much broader scope. It has considerations for all aspects of language definition, from user experience to code generation. This overview will shed light on what a DSL can tangibly offer. In the case of Icon’s IPF technology platform, it is the generation of self-contained executable Java microservices, BDD test scenarios, visual representations, and documentation that are automatically and transparently integrated into our standard Maven build process.
Orchestration has always been a core focus of the IPF platform. The software that we build always involves some degree of tracking work items through a set of business processes. Consider modeling a payments engine that takes a request from a channel and provides the rails to route that transaction to the target payment scheme. En route, it may need to interface with common types of system such as fraud, sanctions or accounting. The exact details of how to integrate with these systems may differ between clients, but the notion of an orchestrated flow is maintained.
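As a minimal sketch of that idea (with invented stage names, not IPF's actual model), an orchestrated flow can be thought of as an ordered set of stages that a payment moves through:

```java
import java.util.List;

// A simplified, hypothetical view of an orchestrated payment flow.
// The stages and their order are illustrative only; in practice the
// exact sequence differs between clients.
public class PaymentFlow {
    enum Stage { RECEIVED, FRAUD_CHECK, SANCTIONS_CHECK, ACCOUNTING, SUBMITTED }

    static final List<Stage> FLOW = List.of(
            Stage.RECEIVED, Stage.FRAUD_CHECK, Stage.SANCTIONS_CHECK,
            Stage.ACCOUNTING, Stage.SUBMITTED);

    // Advance a payment to its next stage, or stay put once terminal.
    static Stage next(Stage current) {
        int i = FLOW.indexOf(current);
        return i < FLOW.size() - 1 ? FLOW.get(i + 1) : current;
    }
}
```

The interesting engineering question is how such a flow is described and varied per client, which is where the DSL comes in.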
Previously, we had defined this orchestration using an external graphical modeling tool, Modelio, and developed a code generation tool that translated the Modelio output model into executable Java source code. The approach was effective but had inherent limitations around the granularity of the code that could be generated, and around how well the flow design process could be integrated with the wider engineering process. During a recent innovation cycle, we identified that leveraging a lightweight external DSL is a powerful approach that may prove more effective for certain use cases. The overarching motivation was that by defining our own language concepts and using dedicated code generation tooling, we could produce much higher-quality generated output. And by front-loading the engineering effort into building a DSL (as opposed to “core” application components), we gain a better context for digging into what really constitutes the domain model.
In contrast to our previous approach (which was specifically tailored for applications handling instant payments), a lightweight external DSL provides a foundation to explore other areas including value-added services or order management.
Our DSL ultimately generates a Java state machine and plenty of supporting collateral. The key message is that all of these different generation outputs are simply different materialisations of the same model, a pure domain model that is not coupled to any technology or application concerns:
Domain – the main executable artifact is a self-contained Java library implemented with Akka Event-Sourced Actors. We generate all of the associated classes one would expect to find in such a framework, including Events, Commands, CommandHandlers, Aggregates, EventHandlers, Input, and Action interfaces. A key principle is that this generated domain is testable and observable in isolation, without any dependency on application components such as databases or message brokers.
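To illustrate the shape being described (this is a hand-written sketch, not IPF's actual generated code, and it deliberately avoids any framework dependency), an event-sourced domain pairs command handling with pure state evolution:

```java
import java.util.ArrayList;
import java.util.List;

// An illustrative event-sourced aggregate: a command is validated
// against current state, producing an event that is then applied.
// All names here are invented for the example.
public class PaymentAggregate {
    record SubmitPayment(String id, long amountPence) {}    // a Command
    record PaymentSubmitted(String id, long amountPence) {} // an Event

    private final List<Object> journal = new ArrayList<>();
    private long totalPence = 0; // aggregate state

    // CommandHandler: validate, then emit and apply an event.
    public PaymentSubmitted handle(SubmitPayment cmd) {
        if (cmd.amountPence() <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        PaymentSubmitted evt = new PaymentSubmitted(cmd.id(), cmd.amountPence());
        apply(evt);
        return evt;
    }

    // EventHandler: evolve state from the event. No I/O or side
    // effects, so the domain is testable entirely in isolation.
    public void apply(PaymentSubmitted evt) {
        journal.add(evt);
        totalPence += evt.amountPence();
    }

    public long totalPence() {
        return totalPence;
    }
}
```

Because nothing here touches a database or a broker, the whole domain can be exercised with plain unit tests, which is the isolation property described above.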
Tests – we also generate test scenarios for each of the possible processing paths through the modeled flow, ensuring full traversal and coverage of edge cases. As big proponents of Behaviour Driven Development (BDD) at Icon, we express these scenarios as user stories written in Gherkin syntax. They serve as another window into the model, further exposing and highlighting the business functionality that has been described. The tests are then executed against the resulting application to ensure that it correctly implements the generated domain. This results in a self-regulating regression suite: a user cannot make a change in the DSL model without also ensuring that their change is successfully handled at the application level.
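The core of "one scenario per processing path" is path enumeration over the flow graph. The following is a sketch of that idea on a toy graph (the graph and names are invented, and real generation happens inside MPS rather than in hand-written Java):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Enumerate every path from a start node to a terminal node of a
// flow graph. Each complete path corresponds to one test scenario.
public class PathEnumerator {
    static List<List<String>> allPaths(Map<String, List<String>> graph, String start) {
        List<List<String>> paths = new ArrayList<>();
        walk(graph, start, new ArrayList<>(List.of(start)), paths);
        return paths;
    }

    private static void walk(Map<String, List<String>> graph, String node,
                             List<String> path, List<List<String>> out) {
        List<String> successors = graph.getOrDefault(node, List.of());
        if (successors.isEmpty()) {     // terminal node: one complete scenario
            out.add(List.copyOf(path));
            return;
        }
        for (String next : successors) {
            path.add(next);
            walk(graph, next, path, out);
            path.remove(path.size() - 1); // backtrack to explore the next branch
        }
    }
}
```

Each enumerated path would then be rendered as a Gherkin user story and run against the application.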
Graphs – we are visual creatures, so a visual representation is nearly always required to fully understand any nontrivial processing model, and our use case is no different. We generate graphical representations of the modeled processing flow as Graphviz files, which are then compiled to SVG images. These graphs describe the overall process flow for a given DSL model, but they are also augmented in real time as part of the IPF product, overlaying payments with their current status to show exactly how they are being processed.
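Emitting Graphviz from a flow model is straightforward text generation. As a sketch (again with an invented, hand-written model rather than a real DSL output), rendering the flow's transitions as DOT might look like:

```java
import java.util.List;
import java.util.Map;

// Render a flow model's transitions as Graphviz DOT text. The
// resulting string can be compiled to SVG with `dot -Tsvg`.
public class DotRenderer {
    static String toDot(Map<String, List<String>> flow) {
        StringBuilder sb = new StringBuilder("digraph flow {\n");
        flow.forEach((from, targets) ->
                targets.forEach(to ->
                        sb.append("  ").append(from)
                          .append(" -> ").append(to).append(";\n")));
        sb.append("}\n");
        return sb.toString();
    }
}
```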
Documentation – we capitalise on holding our model in an appropriate tool such as MPS (more on this shortly) by generating supporting collateral that provides visibility into the model, and documentation is a key example. A common pain point for many development teams is when design documentation no longer represents what the application is actually doing. Living documentation, versioned alongside source code and referencing live code examples, is a pattern that tries to alleviate this. At Icon, we do this in the form of AsciiDoc documentation accompanying individual software components within their source repository; these documents are automatically published to our documentation site when the software is released. We have extended this further by generating documentation in the same format as part of the DSL: we generate AsciiDoc pages for the model, describing its relationships and data types. In this way, we further expose the actual business functionality of the application.
All of the above can be regenerated with a single click within MPS. In our case, the process is encapsulated within a standard Maven build pipeline, making it completely transparent to downstream developers.
So what is MPS? MPS is a fantastic Language Workbench by JetBrains that provides all the tooling needed to develop external DSLs. Other popular tools include Xtext, an Eclipse-based workbench, and a comparison between the different tools could warrant a separate article in itself. Ultimately, however, several design decisions behind MPS really resonated with us, enough for us to take the leap into adopting it. These included:
- Projectional editor – the MPS editor may look text-based, but it is not. It is simply a projection of the underlying Abstract Syntax Tree (AST) of the language concepts; the user directly maintains the underlying model. This switch in paradigm from a traditional text-based approach does involve an initial learning curve, but it has many powerful consequences for model consistency and refactoring.
- Consistency – because the user models the AST directly rather than through a text parser, it is impossible to define an inconsistent model. Imagine traditional compile-time safety, plus the prevention of typos and user errors and the enforcement of type-system rules and language constraints.
- Extensibility – MPS heavily promotes re-use and the extension of existing DSLs, so you do not have to define new languages from scratch. For example, MPS ships with a self-contained Java implementation called BaseLanguage. If you want to start from the Java language and add extra capability, you simply extend BaseLanguage and enrich it with your own concepts. This is nicely demonstrated in some of the sample projects within MPS, where BaseLanguage is extended with mathematical notations and tabular forms.
- Generator aspect – the implementation of the generator aspect of MPS is compelling. You can define exactly how to transform DSL concepts into various outputs, with full compile-time safety. In our case with IPF, we generate our concepts into BaseLanguage, which in turn is generated into high-quality Java source code.
- Java interoperability – MPS (through BaseLanguage) provides full Java interoperability throughout the language aspects. For example, a user can refer to their own Java dependencies within the DSL editor. Similarly, in the generator aspect, in which we generate our DSL to BaseLanguage, we can reference existing Java dependencies, providing a compile-time-safe Language Engineering experience. This is the polar opposite of approaches based on runtime templating.
- IntelliJ support and Rich Client Platforms (RCP) – the distribution of languages is often an interesting area. Rather than having end users manually install MPS and configure it with the required plugins and languages just to be able to use it, MPS allows you to build a slimmed-down, customisable IDE with everything preinstalled. This RCP can then be distributed as a self-contained application without any dependencies. Perhaps even more interesting, at least for Java developers, is that you don’t actually need the MPS application to use the DSL: MPS-based languages can be used in IntelliJ IDEA by simply installing the MPS plugin for IDEA!
The benefits of DSLs
In summary, there are clear benefits to using DSLs:
- Describing a domain in a DSL provides a formal, repeatable modeling platform and allows subject matter experts to take ownership of the domain, whilst freeing up software developers to focus on the actual software.
- The actual process of designing a language provides a perfect context to identify and analyse the underlying concepts and nature of the domain itself – much more so than with traditional software engineering.
- Tooling such as the code generation capability of MPS effectively results in a low-code framework for generating applications; no more writing boilerplate code.
Whilst these benefits are self-evident, there are others that are more subtle. The generator implementation that translates your DSL model into executable Java code is decoupled from the language itself. This means that we can confidently pivot to a new application technology without any change to the domain model, or even generate and run both. Similarly, if a specific project requires additional software collateral to be generated, the team can simply create an additional generator.
For those who prefer video, we recently had the pleasure of presenting an overview of our DSL at the JetBrains MPS 2021 event; the content complements this article and can be found here.
There was also a recent talk from Markus Völter and Václav Pech on how DSLs empower domain experts by bridging the two previously separate realms of knowledge working and software engineering; it can be found here.