Why MPS language workbench is key for successful Language Orientated Programming
This is the fifth part of our series exploring the motivations, architecture and principles driving the development of IPF. Volodymyr Prokopyuk, Senior Payments Architect at Icon Solutions, discusses the benefits of Language Orientated Programming (LOP), explores different programming approaches and explains why the MPS language workbench is an ideal solution for building software in complex domains.
Glossary of terms
|AS||Abstract Syntax||A set of rules and notations to describe the structure of a GPL or a DSL program preserving its semantic meaning, but free of any CS notation and forms (expressions, statements)|
|AST||Abstract Syntax Tree||A tree-like graph data structure semantically equivalent to a GPL or a DSL program expressed using a CS, but without the details of the CS. An AST consists of interlinked nodes that represent language concepts and their containment and referencing relationships. An AST is usually generated by a parser using a CS grammar, or it can be directly manipulated through a projectional editor in language workbenches|
|CS||Concrete Syntax||The notation and forms (expressions, statements) of a GPL or a DSL. In traditional source-based programming the CS is visualized and directly manipulated through a text editor; in language workbenches multiple CS representations are rendered by a projectional editor|
|DSL||Domain-Specific Language||A programming language with a limited scope of application, specifically designed to express solutions to a certain range of domain problems. A DSL provides a set of carefully designed language abstractions that allow a solution in terms of domain concepts to be mapped directly to the DSL abstractions, which benefits developer productivity and the expressiveness, conciseness and correctness of a DSL program|
|EBNF||Extended Backus-Naur Form||A set of notations and forms to precisely describe a formal programming language that has a context-free grammar. EBNF expressions and statements exhaustively define all possible syntactic constructions of a programming language. An EBNF grammar is used as input to a parser along with a program text, so the parser can prove the syntactic validity of the program and construct an AST, or fail to recognize a language construct and issue an appropriate error. Note: language workbenches that use projectional editing do not use EBNF and parsing technology, as the AST is directly manipulated by a user through a projectional editor|
|GPL||General Purpose Language||A programming language that is Turing complete and applicable to any computable problem domain. A GPL provides low-level generic language abstractions from which to construct higher-level abstractions that are conceptually close to a solution in terms of domain concepts|
|LOP||Language Oriented Programming||An approach to system and application development in which a set of composable, interoperable and well-integrated DSLs is designed and developed first, and only then is a solution to domain problems expressively and concisely described, using the appropriate DSL for each of the orthogonal aspects of the system|
LOP makes application development considerably more productive
The main challenge for application development productivity is the large semantic gap between the problem domain and the General Purpose Language (GPL) abstractions of the target platform. Mapping domain concepts to conceptually very distant GPL abstractions involves several levels of indirection. This leads to verbose and complex expressions in which the main idea needed to solve the domain problem is lost in implementation details. Complex systems require concise and precise descriptions of problem domain solutions, for both clarity of expression and ease of maintenance. To meet these requirements, multiple specialized languages (DSLs) may be used, but they must be integrated in a cohesive and composable fashion.
This is where language orientated programming (LOP) comes in. On the one hand, LOP decomposes a complex system into simpler constituent parts and applies the most appropriate Domain Specific Language (DSL) to each, to precisely express domain problem solutions. On the other hand, the multiple DSLs are seamlessly integrated into a highly cohesive and loosely coupled solution. Because of the large semantic gap between GPL abstractions and domain concepts, a GPL cannot fully and effectively support arbitrary domains. The result is a lengthy period between initial ideas and final programs, and programs that are far less expressive (verbose libraries in terms of GPL abstractions) and far less maintainable (domain logic is lost in irrelevant implementation details) than those produced with the LOP strategy.
DSLs are part of LOP, providing a direct mapping from domain concepts to DSL abstractions. This gives the user a concise, expressive solution in terms of domain concepts. LOP makes application development more productive by combining multiple DSLs with an excellent IDE and tooling for both solution programming and application maintenance.
How does the language orientated programming approach differ from the general purpose language one?
The GPL development approach follows four main phases:
- Domain problem requirements
- Conceptual idea in mind
- Solution in terms of domain concepts
- GPL program with a big semantic gap, which leads to a time-consuming, unproductive, less expressive and less maintainable mapping between domain concepts and GPL abstractions
In contrast, the LOP approach automates the idea-to-program translation phase by introducing a set of DSLs tailored to express problem domain solutions in terms of the domain without the need to resort to multiple levels of manually crafted indirections in order to reach the low-level GPL abstractions.
The LOP approach is as follows:
- Domain problem requirements
- Conceptual idea in mind
- Solution in terms of domain concepts
- Create / use appropriate DSLs that directly map the problem domain concepts to the DSL abstractions (all employed DSLs are fully integrated, composable and interoperable)
It’s important to note that a program in LOP is not a set of low-level instructions in a GPL. It is a solution description in domain terms using a set of composable DSLs.
A language in LOP is defined by:
- AS meta model (scheme) that defines the structure of a program
- Projectional editor that defines the Concrete Syntax (CS) grammar rendering and user interaction in terms of projection rules
- Code generators that define execution semantics of the DSL program Abstract Syntax Tree (AST)
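The three parts listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "state machine" language, the `Concept` and `Node` classes and the function names are hypothetical and are not how MPS itself represents a language.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """Meta model entry: the properties and child concepts a node may have."""
    name: str
    properties: tuple[str, ...] = ()
    children: tuple[str, ...] = ()  # names of allowed child concepts


@dataclass
class Node:
    """AST node: an instance of a concept."""
    concept: str
    props: dict[str, str] = field(default_factory=dict)
    children: list[Node] = field(default_factory=list)


# 1) AS meta model of a hypothetical "state machine" language.
META = {
    "Machine": Concept("Machine", ("name",), ("Transition",)),
    "Transition": Concept("Transition", ("source", "event", "target")),
}


def validate(node: Node) -> list[str]:
    """Check a node tree against the meta model and return error messages."""
    concept = META.get(node.concept)
    if concept is None:
        return [f"unknown concept {node.concept}"]
    errors = [f"{node.concept}: missing property {p}"
              for p in concept.properties if p not in node.props]
    for child in node.children:
        if child.concept not in concept.children:
            errors.append(f"{node.concept}: child {child.concept} not allowed")
        errors.extend(validate(child))
    return errors


# 2) Projection rule: how an editor could render the AST as text.
def project(node: Node) -> str:
    if node.concept == "Machine":
        body = "\n".join("  " + project(c) for c in node.children)
        return f"machine {node.props['name']}\n{body}"
    p = node.props
    return f"{p['source']} --{p['event']}--> {p['target']}"


# 3) Generator: execution semantics, here a transition lookup table.
def generate(machine: Node) -> dict[tuple[str, str], str]:
    return {(t.props["source"], t.props["event"]): t.props["target"]
            for t in machine.children}


if __name__ == "__main__":
    machine = Node("Machine", {"name": "Payment"}, [
        Node("Transition", {"source": "Received", "event": "validate",
                            "target": "Validated"}),
    ])
    print(validate(machine))
    print(project(machine))
    print(generate(machine))
```

The point of the sketch is the separation: the meta model constrains what a program may contain, while rendering and code generation are independent functions over the same node tree.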
Each component of a DSL is defined and manipulated through a projectional editor using a powerful IDE and tooling. Before embarking on this journey, developers should weigh the effort to design, develop and maintain a set of integrated and composable DSLs against the productivity boost that LOP provides.
GPL approach versus LOP approach to development
GPL vs LOP Domain Specific Language (DSL) for application development
When it comes to application development, there are significant benefits of the LOP DSL composition approach versus the plain GPL one. To illustrate with a concrete and simple example, here is a Linux shell DSL vs GPL solution comparison.
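A minimal sketch of such a comparison follows. The word-frequency task, the file name `input.txt` and the function name `top_words` are illustrative assumptions. The shell pipeline in the comment expresses the solution directly in the shell's own terms (streams and filters), while the GPL version spells out the same idea with generic abstractions.

```python
# Shell DSL: one pipeline of composable filters (shown for comparison only).
#   tr -cs 'A-Za-z' '\n' < input.txt | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn | head -n 3

# GPL (Python): the same idea written with generic loops, dicts and sorting.
from __future__ import annotations

import re


def top_words(text: str, limit: int = 3) -> list[tuple[str, int]]:
    """Return the `limit` most frequent words, most frequent first."""
    counts: dict[str, int] = {}
    for word in re.findall(r"[A-Za-z]+", text):
        key = word.lower()
        counts[key] = counts.get(key, 0) + 1
    # Sort by descending count, then alphabetically to make ties deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


if __name__ == "__main__":
    print(top_words("the cat and the dog and the bird"))
```

The Python version is not long, but each step (tokenising, counting, ranking) has to be built from low-level constructs, whereas the pipeline names each step with a ready-made domain abstraction.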
How does traditional programming in GPL work?
Traditional programming in a GPL (also known as source-based programming) starts from the CS, which is parsed into an AST that is finally interpreted on the target platform or compiled to an executable for the target platform. The translation of a GPL program from its CS to execution semantics on the target platform goes through the following phases:
- GPL language designers define an unambiguous language grammar using a form of the EBNF notation.
- A lexer (scanner, lexical analyzer) and a parser (AST creator) are generated from the language grammar described in EBNF form, using parser generator technology.
- A lexer scans the textual representation of a GPL program as a stream of bytes and creates a stream of tokens. A lexer does not take the program context into consideration, which limits the flexibility of the GPL (e.g. keywords cannot be used as identifiers even where the usage is not ambiguous).
- A scannerless parser considers the program context and, while being more complex to implement, is free of the GPL limitations mentioned above.
- A parser consumes a stream of tokens and constructs the AST of a GPL program using unambiguous rules of the language grammar. After the AST is constructed, a semantic analyzer checks the AST for language constraints and type system rules.
- An optimizer applies optimizations to a validated AST to make the program more efficient. Starting from this point either an interpreter directly interprets the AST on the target platform or a compiler generates an executable for the target platform.
In the traditional GPL programming approach, the focus is on the CS in the textual form while the AST is just a transient data structure (tree-like graph) temporarily created by a parser for subsequent interpretation or compilation. A GPL program is edited and persisted in the textual form of the CS. The textual form of the CS is the single source of truth, while the AST can be easily changed by altering parsing rules.
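The phases above can be illustrated with a hand-written sketch for a tiny arithmetic language. The grammar (integers, `+`, `*`, parentheses) and all names are assumptions chosen for brevity. A real toolchain would typically generate the lexer and parser from the grammar and would add semantic analysis and optimization, both omitted here.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

TOKEN = re.compile(r"(?P<num>\d+)|(?P<op>[+*()])|(?P<ws>\s+)")


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: Num | BinOp
    right: Num | BinOp


def lex(source: str) -> list[tuple[str, str]]:
    """Lexer: text -> (kind, text) tokens; whitespace is dropped."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent parser: tokens -> AST.

    expr   := term ('+' term)*
    term   := factor ('*' factor)*
    factor := NUM | '(' expr ')'
    """

    def __init__(self, tokens: list[tuple[str, str]]) -> None:
        self.tokens, self.pos = tokens, 0

    def peek(self) -> tuple[str, str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def take(self) -> tuple[str, str]:
        token = self.peek()
        self.pos += 1
        return token

    def expr(self) -> Num | BinOp:
        node = self.term()
        while self.peek()[1] == "+":
            self.take()
            node = BinOp("+", node, self.term())
        return node

    def term(self) -> Num | BinOp:
        node = self.factor()
        while self.peek()[1] == "*":
            self.take()
            node = BinOp("*", node, self.factor())
        return node

    def factor(self) -> Num | BinOp:
        kind, text = self.take()
        if kind == "num":
            return Num(int(text))
        if text == "(":
            node = self.expr()
            if self.take()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")


def parse(source: str) -> Num | BinOp:
    parser = Parser(lex(source))
    tree = parser.expr()
    if parser.peek()[0] != "eof":
        raise SyntaxError(f"unexpected token {parser.peek()[1]!r}")
    return tree


def evaluate(node: Num | BinOp) -> int:
    """Interpreter: walks the transient AST to produce a result."""
    if isinstance(node, Num):
        return node.value
    left, right = evaluate(node.left), evaluate(node.right)
    return left + right if node.op == "+" else left * right


if __name__ == "__main__":
    print(parse("1 + 2 * 3"))
    print(evaluate(parse("(1 + 2) * 3")))
```

Note that the AST exists only between `parse` and `evaluate`; what is stored and edited is the source string.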
Projectional editing in language workbenches
The projectional editing used in language workbenches starts from the abstract syntax (AS) meta model (scheme) definition.
From the AST, a set of CS representations is defined via projectional rules: in different formats (textual, symbolic, tabular, graphical), for different audiences (developers, domain experts, business analysts, quality assurance experts, security experts, infrastructure experts) and for different goals (editing and modification, analysis and review, learning and presentation). A user interacting with a projectional editor directly modifies the AST of a DSL program, bypassing all the complexities (e.g. a context-aware scannerless parser) and limitations (grammar ambiguities, limited grammar composition) of parsing technology. Downstream, the AST is used by a semantic analyzer, an optimizer, and either an interpreter or a compiler in exactly the same manner as in traditional GPL programming.
In projectional editing the focus is on the AST, from which a set of CS representations and a set of interpreters or executables for multiple target platforms can be generated by altering projectional rules and AST interpretation or code generation rules respectively. The AST is the single source of truth used for projectional editing and for DSL program persistence.
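A minimal sketch of this idea, assuming a hypothetical fee-rule DSL (the `FeeRule` and `FeeTable` node types and the two projection functions are invented for illustration): one AST, two projections, and an edit that changes the tree directly without any text being parsed.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FeeRule:
    """AST node of a hypothetical fee DSL: one rule per currency."""
    currency: str
    max_amount: int
    fee: int


@dataclass
class FeeTable:
    """Root AST node: the single source of truth."""
    rules: list[FeeRule]


def project_text(table: FeeTable) -> str:
    """Textual projection, e.g. for developers."""
    return "\n".join(
        f"when currency is {r.currency} and amount <= {r.max_amount} charge {r.fee}"
        for r in table.rules)


def project_table(table: FeeTable) -> str:
    """Tabular projection, e.g. for business analysts."""
    header = f"{'currency':<10}{'max amount':>12}{'fee':>6}"
    rows = [f"{r.currency:<10}{r.max_amount:>12}{r.fee:>6}" for r in table.rules]
    return "\n".join([header, *rows])


def set_fee(table: FeeTable, currency: str, fee: int) -> None:
    """An 'editor action': changes the AST node itself."""
    for rule in table.rules:
        if rule.currency == currency:
            rule.fee = fee


if __name__ == "__main__":
    table = FeeTable([FeeRule("EUR", 1000, 2), FeeRule("GBP", 500, 1)])
    set_fee(table, "EUR", 3)     # edit made through any projection
    print(project_text(table))   # both projections reflect the change
    print(project_table(table))
```

Because both projections are derived from the same tree, they cannot drift apart, and adding a third one (for example a graphical view) would not require touching the other two.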
Why is projectional editing more powerful than source-based programming?
In projectional editing the single source of truth is the AST of a DSL program. The following system aspects are orthogonally derived from the AST, giving unmatched power and flexibility to projectional editing:
- Multiple editable representations (projections) in different formats (textual, symbolic, tabular, graphical) are generated from the AST meeting the goals of different audiences working on or with the DSL. The AST is directly manipulated through its editable representations in the projectional editor.
- The AST is directly persisted in its storage representation as a data structure (tree-like graph) in the most appropriate structured format (XML, JSON, YAML). In contrast, in source-based programming both GPL program editing and persistence are done using textual representation of a GPL program.
- Multiple executable representations can be generated from a single AST in the form of interpreters or compilers for different target platforms. The AST is an abstract representation of a DSL program, semantically equivalent to but free from any CS. The abstract representation can be used for analysis (static code analysis, DSL constraints, type system checks) and reasoning (formal DSL program verification) about the DSL program, as well as for optimizing the DSL program for performance.
- Finally, a non-editable visual representation of the AST can be generated for documentation, learning, presentation and communication purposes.
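The persistence and code generation points above can be sketched as follows. The condition language, the dictionary layout with a `concept` key and the two generator functions are assumptions made for this example. The AST is stored as JSON, and two different target representations are generated from the same tree.

```python
from __future__ import annotations

import json

# AST as nested dictionaries; "concept" names the node type (hypothetical layout).
AST = {
    "concept": "And",
    "left": {"concept": "Compare", "field": "amount", "op": ">", "value": 1000},
    "right": {"concept": "Compare", "field": "currency", "op": "=", "value": "EUR"},
}


def save(ast: dict) -> str:
    """Persist the AST itself (not a textual program) as JSON."""
    return json.dumps(ast, indent=2, sort_keys=True)


def load(text: str) -> dict:
    return json.loads(text)


def to_python(node: dict) -> str:
    """Generator 1: a Python boolean expression."""
    if node["concept"] == "And":
        return f"({to_python(node['left'])} and {to_python(node['right'])})"
    op = "==" if node["op"] == "=" else node["op"]
    return f"{node['field']} {op} {node['value']!r}"


def to_sql(node: dict) -> str:
    """Generator 2: a SQL condition (no escaping, illustration only)."""
    if node["concept"] == "And":
        return f"({to_sql(node['left'])} AND {to_sql(node['right'])})"
    value = node["value"]
    literal = f"'{value}'" if isinstance(value, str) else str(value)
    return f"{node['field']} {node['op']} {literal}"


if __name__ == "__main__":
    restored = load(save(AST))   # structured persistence round trip
    print(to_python(restored))
    print(to_sql(restored))
```

Neither generator knows about the other or about the storage format, which is the sense in which these aspects are derived orthogonally from the AST.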
In summary, by focusing on direct AST editing and structured persistence, projectional editing separates the different aspects of DSL program management (program editing, AST persistence, program analysis and verification, code generation and system documentation) into distinct processes supported by good tooling.
How projectional editing solves GPL limitations
Complex systems need different languages to describe solutions to the various aspects of the system. Due to ambiguities in the resulting language grammar, GPL integration is in practice very difficult. The ambiguities of the integrated GPL language grammar require changes to the initial GPLs.
In contrast, projectional editing directly manipulates the AST, meaning the problem of language grammar ambiguity does not occur at all. Any semantic ambiguity that arises while editing the AST through a projectional editor is immediately resolved by the user selecting the appropriate language concept (AST node). This direct manipulation of the AST via projectional editing allows for easy and seamless DSL integration and composition via DSL extension, DSL referencing and DSL embedding.
A GPL provides a limited set of low-level, generic built-in language abstractions for the user to develop their own higher-level domain abstractions in the form of libraries and frameworks. The newly built domain abstractions are not new linguistic domain concepts. Rather, they are convenient combinations of the initial low-level GPL abstractions. In a language workbench, by contrast, the abstract syntax (AS) meta model makes it easy to add new domain concepts as first-class linguistic DSL abstractions, without the need to resort to combinations of low-level GPL abstractions.
The generic, low-level nature of GPL requires verbose and lengthy expressions of the domain concepts, creating mental overhead and slowing productivity of the user. The lack of notation flexibility and conciseness of GPL, however, can be easily resolved by adding new domain concepts directly to the DSL through projectional editing of the AS meta model.
Higher-level GPL constructs representing domain concepts suffer from a lack of semantic meaning when it comes to tooling (syntax highlighting, code completion, navigation, refactoring, static code analysis, semantic analysis, error reporting, simulations, debugging). In contrast, projectional editing has the full semantic context of domain concepts and high-level linguistic abstractions needed to address all of the above GPL shortcomings. It also benefits from the ability to provide different representations of the AST in the most suitable format for specific audience goals.
The most important benefit of projectional editing is the improved and direct communication between domain experts and developers. The DDD ubiquitous language representing domain concepts enhances productivity by enabling the quick exchange of ideas, as well as the direct implementation of problem domain solutions using a set of integrated and composable DSLs.
Why language workbenches are a good match for LOP
Language workbenches enable users to directly manipulate the AST of a DSL program through a projectional editor, using multiple CS representations and targeting multiple target platforms via dedicated interpreters or compilers. In addition, users can easily define new DSLs that are fully integrated and composable with each other. Creating a new DSL means defining 1) the AS meta model (scheme) with domain concepts and the relationships between them, 2) a set of CS representations with projectional rules, and 3) a set of interpreters or compilers with code generators for executables, documentation and testing.
The manipulation and management of all the above DSL components is fully supported at the semantic level by the IDE. Tooling provides multiple CS representations, syntax highlighting, code completion, navigation, refactoring, static code analysis, meaningful errors, debuggers, simulators and target platform code generators that boost developer productivity. To summarise, language workbenches enable developers to program a complete system as a set of integrated and composable DSLs within a single IDE with tooling.
What are the key takeaway points about language workbenches and LOP?
- LOP consists of developing multiple composable DSLs that describe different aspects of the system and only then expressing the solution using the developed DSLs.
- LOP greatly reduces the amount of code needed to express solutions in terms of domain concepts. The considerable translation and mapping between domain concepts and GPL abstractions are performed automatically by DSL code generators.
- Language workbenches with projectional editing enable direct manipulation of the AST, making DSL composition easier as this approach is free of grammar ambiguity problems.
- The LOP approach to development using language workbenches boosts developer productivity. DSLs tailored to solve specific problems in the best possible way can be easily manipulated and composed through projectional editing.