Language Workbench Competition 2011
On May 24th the first Language Workbench Competition was held in Cambridge (UK) as a warm-up for the Code Generation conference. The idea for a language workbench competition originated from a group of people during Code Generation 2010. A lot of new initiatives in the field of language workbenches are arising to facilitate the creation of Domain Specific Languages (DSLs) and their accompanying code generators and IDEs. They all have their own strengths and weaknesses, so a competition to learn about them sounded like a great idea back in 2010… and it was!
All participants of the workshop had to implement an assignment, consisting of the following tasks:
Phase 0 is intended to demonstrate basic language design, including IDE support (code completion, syntax coloring, outlines, etc.).
- 0.1 Simple (structural) DSL without any fancy expression language or such.
- 0.2 Code generation to GPL such as Java, C#, C++ or XML
- 0.3 Simple constraint checks such as name-uniqueness
- 0.4 Show how to break down a (large) model into several parts, while still cross-referencing between the parts
Phase 1 demonstrates advanced features not necessarily available to the same extent in every LWB.
- 1.1 Show the integration of several languages
- 1.2 Demonstrate how to implement runtime type systems
- 1.3 Show how to do a model-to-model transformation
- 1.4 Some kind of visibility/namespaces/scoping for references
- 1.5 Integrating manually written code (again in Java, C# or C++)
- 1.6 Multiple generators
Phase 2 is intended to show a couple of non-functional properties of the LWB. The task outlined below does not elaborate on how to do this.
- 2.1 How to evolve the DSL without breaking existing models
- 2.2 How to work with the models efficiently in the team
- 2.3 Demonstrate Scalability of the tools
Every LWB has its own special “cool features”. In phase three we want the participants to show off these features. Please make sure, though, that the features are built on top of the task described below, if possible.
So far, 13 different tools have been used to implement the assignment. The submissions, including a lot of documentation and explanation, can be found at the LWC website. 10 of the participants joined LWC11 in Cambridge. In a tight one-day schedule they each had 40 minutes to demonstrate and explain their solution and to highlight some of their strong points.
I will try to give you a brief overview of all the tools, but I will not write a full comparison, that’s way too much work. You can check the PDFs of the submissions or play around with some of the tools to get a better understanding of the possibilities. I will just give you some (subjective) “soundbites” from LWC11 to give you a first idea about the 10 tools. Don’t hit me if I missed the best feature of your tool… 10 sessions of 40 minutes on 1 day is quite a challenge for my concentration 😉
JetBrains MPS is an open source (Apache 2.0 license) tool. Instead of text editors it provides a projectional editor, meaning that you’re directly editing the abstract syntax tree (AST) instead of a concrete syntax which needs to be parsed. You start by defining the abstract syntax. Afterwards you can define the layout of the cells that should be rendered as you edit (i.e. the projection). In most cases it looks a lot like text, and the MPS team put a lot of effort into giving you the feeling that you’re editing text when editing a projection. However, you are editing a structured format (e.g. trees, tables).
MPS is strong in language extension. The Java definition is provided with the tool and allows you to create DSL extensions to Java. Code generation is also done by defining a model-to-model transformation from the AST of your DSL to the AST of Java. So, to generate a target language you will need the AST of that language (which can be quite some work to define).
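To make the AST-to-AST idea a bit more concrete, here is a minimal sketch in Python. It is not MPS code: the `Entity` and `JavaClass` node types, the `TYPE_MAP` table and both functions are hypothetical names made up for illustration. The only point is that the generator maps nodes to nodes, and text appears as a final rendering step.

```python
from dataclasses import dataclass, field
from typing import List


# --- Source language: a tiny, hypothetical entity DSL ---
@dataclass
class Attribute:
    name: str
    type: str


@dataclass
class Entity:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


# --- Target language: a tiny, hypothetical subset of a Java AST ---
@dataclass
class JavaField:
    type: str
    name: str


@dataclass
class JavaClass:
    name: str
    fields: List[JavaField] = field(default_factory=list)


# Hypothetical mapping from DSL types to Java types.
TYPE_MAP = {"string": "String", "int": "int"}


def entity_to_java(entity: Entity) -> JavaClass:
    """Model-to-model step: map each DSL node to a target AST node."""
    return JavaClass(
        name=entity.name,
        fields=[JavaField(TYPE_MAP[a.type], a.name) for a in entity.attributes],
    )


def render(java_class: JavaClass) -> str:
    """Text is only produced at the very end, from the target AST."""
    lines = [f"public class {java_class.name} {{"]
    lines += [f"    private {f.type} {f.name};" for f in java_class.fields]
    lines.append("}")
    return "\n".join(lines)


person = Entity("Person", [Attribute("name", "string"), Attribute("age", "int")])
java_source = render(entity_to_java(person))
```

Even in this toy version the cost mentioned above is visible: before you can generate anything, somebody has to define the node types of the target language.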
Markus Voelter, who presented the MPS submission, advises using his EMF export so that you can use Xtext to generate text from MPS models, as Xtext is stronger in that field. If you want to learn more about the model-to-model transformations you will have to play around with MPS. According to Markus, M2M in MPS is easier to implement than to explain.
Steven Kelly demonstrated the MetaEdit+ implementation of the assignment. In fact, he implemented most of the assignment from scratch during his demonstration! Very impressive and the only one who did it this way. For me, this shows the productivity and ease of use of MetaEdit+.
MetaEdit+ really focuses on vertical (i.e. business domain) and graphical Domain Specific Languages. They offer some nice features to define your graphical concrete syntax like a symbol editor with conditional options (e.g. change colors or symbols based on model values). The code generators are quite different from other tools. Very modular. A nice thing Steven demonstrated was the generation of code within the tool and the possibility to jump from the code to the model and back.
The main comment on Twitter about MetaEdit+ was that its UI looks a bit old-fashioned. That’s of course a matter of taste, but their next version (5.0, to be released soon) will have a new, Windows 7-style UI.
Mixing manual code with your models can be done by using protected regions in the code generators.
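The general idea behind protected regions can be sketched as follows. This is not how MetaEdit+ implements them: the `// PROTECTED BEGIN` / `// PROTECTED END` marker syntax and all function names are assumptions chosen for this example. The generator emits marked regions, and on regeneration the hand-written bodies are copied from the previous output into the fresh one.

```python
import re

# Hypothetical marker syntax; real tools each define their own.
REGION = re.compile(
    r"// PROTECTED BEGIN (?P<id>\w+)\n(?P<body>.*?)// PROTECTED END (?P=id)\n",
    re.DOTALL,
)


def extract_regions(source: str) -> dict:
    """Collect the hand-written body of every protected region by id."""
    return {m.group("id"): m.group("body") for m in REGION.finditer(source)}


def merge_regions(fresh: str, previous: str) -> str:
    """Copy hand-written bodies from the previous output into fresh output."""
    saved = extract_regions(previous)

    def restore(match):
        region_id = match.group("id")
        # Fall back to the generated default if the region is new.
        body = saved.get(region_id, match.group("body"))
        return f"// PROTECTED BEGIN {region_id}\n{body}// PROTECTED END {region_id}\n"

    return REGION.sub(restore, fresh)


def generate(class_name: str) -> str:
    """Toy generator that emits one protected region per class."""
    return (
        f"class {class_name} {{\n"
        "// PROTECTED BEGIN body\n"
        "    // add manual code here\n"
        "// PROTECTED END body\n"
        "}\n"
    )


# Someone edits the generated file by hand...
old_output = generate("Order").replace(
    "    // add manual code here\n", "    int total() { return 42; }\n"
)
# ...then the model changes and the code is regenerated.
new_output = merge_regions(generate("Invoice"), old_output)
```

After the merge, `new_output` contains the renamed class from the model together with the manually written method.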
MetaEdit+ has quite powerful language evolution. Model data is kept when changing the metamodel (depending on the case of course). Generators keep working if you used relative referrals (walk the tree, use oid’s, etc.). Removing elements from the metamodel will not remove them from the models but obsolete / deprecate them.
Most validations are automatically there because you cannot model anything outside the scope of your metamodels.
Steven finished his presentation with a nice demonstration of the generator debugger, which gives you the ability to step through the code generators, add breakpoints, … and that’s when the egg timer (which helped us to not let this workshop evolve into a two-day event) gave the signal for a coffee break.
The OOMEGA platform was presented by Christian Merenda. It’s an open source platform based on Eclipse with commercial database backends. They also provide a collaborative platform at metamodels.org to share metamodels.
OOMEGA uses a strict separation between abstract and concrete syntax. Multiple textual representations are possible for one abstract structure. OOMEGA uses a very compact language to define grammars, which is nice, but you will need to get used to it. However, Christian told me they will provide an additional concrete syntax for this language (as said, they support multiple textual representations of the same language) to help people get started with the tool.
Language composition is done by defining and including packages.
OOMEGA is compatible with ATL, which enables you to transform OOMEGA models with standard ATL model transformation scripts.
Multiuser options: SVN for textual models, databases as third-party extensions (locking, notifications, etc.).
Nice navigation among metamodel, model, instance model. Just click-jump.
You can define validations for your models which will prevent you from saving invalid models.
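As a rough illustration of validation that blocks saving, here is a small Python sketch. It says nothing about OOMEGA's actual API: the dictionary-based model, the `VALIDATORS` registry and the `save` function are all hypothetical. The rule itself is the name-uniqueness check from task 0.3 of the assignment.

```python
from collections import Counter


class ValidationError(Exception):
    """Raised when a model violates at least one constraint."""


def check_unique_names(model: dict) -> list:
    """Return one message per entity name that occurs more than once."""
    counts = Counter(entity["name"] for entity in model["entities"])
    return [f"duplicate entity name: {name}" for name, n in counts.items() if n > 1]


# Hypothetical registry; a workbench would let you declare such rules.
VALIDATORS = [check_unique_names]


def save(model: dict, storage: list) -> None:
    """Persist the model only if every validator reports no problems."""
    problems = [msg for validate in VALIDATORS for msg in validate(model)]
    if problems:
        raise ValidationError("; ".join(problems))
    storage.append(model)
```

Calling `save` with two entities named `Person` raises `ValidationError` and leaves `storage` untouched, so an invalid model never reaches the repository.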
The Whole Platform presentation started by emphasizing their focus on an agile approach. It includes features like hot deploy, meaning that you can change language definitions and directly see the result in the resulting modeling environment. They use model interpretation (instead of code generation) to facilitate this process. It is even possible to deploy incomplete models to test what you are currently working on.
The Whole Platform is open source (LGPL) and delivers projectional editors. The main focus is on graphical DSLs. Even their meta-languages are graphical, e.g. the grammar definition language is graphical EBNF / railroad diagrams. They want to enable less-experienced people to be involved in language specification. There were some comments about their graphical approach, for example this statement by Jurgen Vinju:
Some graphical DSL’s look like AST visualizations with edit boxes.
It is possible to have multiple projections for one language and to seamlessly switch between them. Some of the projections are automatically generated. They didn’t explain how much effort you need to define a projection, and I didn’t ask 😉
Multiple DSLs can be mixed / used in one model. I’m wondering about the keyboard support (MPS put a huge effort into this), as it looks as if a lot of pointing and clicking is necessary.
Model-to-model transformations can be defined using a graphical DSL (in principle it’s partly based on the visual syntax of the target language). The transformation interprets the source model with an analyzer and executes the transformation specification to generate the target model. A very nice feature is the visual debugger for your model-to-model transformations, which features step-through debugging at the level of the graphical transformation model.
The Whole Platform also contains a DSL to write tests for your DSL specifications, inspired by JUnit and integrated with the JUnit plugin in Eclipse to run your tests. Tests are once again modeled using a graphical syntax. These graphical languages are nice to demo, but they can have some disadvantages as Steven Kelly mentioned:
Whole Platform looks nice, but needs huge amounts of screen real estate. Dual projectors for LWC12?
Everything is possible on domain / DSL level, no framework or Java code is needed (holds for more tools). Code generation is done by using text in the transformation DSL.
The Whole Platform seems to be the surprise of the day as a lot of people didn’t know them. They had a nice demo and their product looks stable. As Karsten Thoms stated it:
Whole Platform is the surprise of the day. Stable tooling, nice looking, quite different.
Steven Kelly compared the Whole Platform with JetBrains MPS:
Whole Platform = gorgonzola MPS: more mature and with an Italian flavour
Rascal was presented by Jurgen Vinju. Rascal is a domain specific language for source code analysis and manipulation (i.e. meta-programming). Apparently it can also be used as a language workbench in combination with the IDE Meta Tooling Platform (IMP). As Jurgen stated it: IDE construction = Metaprogramming + GUI programming, which can be done using Rascal + Eclipse IMP.
Rascal is inspired by the ASF+SDF meta-environment and will be contributed to eclipse.org in June 2011.
Algorithms can be written in the language, but Rascal itself does not have many keywords or language features that provide built-in algorithms (such as constraint solving). The notable exceptions to this statement are general parsing and disambiguation, pattern matching, general traversal and lexically scoped backtracking features. It looks like a powerful, complete language with a strong scientific basis.
Language extensions are possible by redefining or extending definitions of concepts. Rascal features strong modularization.
It doesn’t seem to be that strong on IDE stuff. IMP has it, but either the integration with Rascal is immature or Jurgen didn’t show it. The Rascal language itself, however, is interesting. As Meinte Boersma said:
Very high-quality presentation by Vinju on Rascal which looks “academically inspired” but definitely reality-checked.
Spoofax is also (like Rascal) based on SDF (a syntax definition language) and Stratego (a transformation / rewriting language). It’s based on Eclipse and provides several meta-languages for language engineering.
Every Spoofax language definition is an Eclipse plugin. The plugins are dynamically loaded so you don’t have to start another Eclipse instance (more language workbenches based on Eclipse should support this!).
You can use Eclipse IMP in addition to Spoofax to enrich your IDE. Lennart didn’t explain in detail how this works and when you’ll need this.
The main advantage of Spoofax seems to be its sophisticated approach to grammars and parsing. As Markus Voelter stated it:
Defining expression grammars in #sdf and #spoofax really is much more convenient than in #xtext!
Mats Helander demonstrated the Intentional workbench. Intentional uses projectional editors (like JetBrains MPS), so you’re editing a projection of the AST directly. According to Mats you could see it as a semantic model (as Martin Fowler calls it) instead of an AST.
Multiple projections are possible. They don’t have to be complete, so it can be possible that you have to use more than one projection to complete your model.
Code generation is done by transforming your model to the AST of C# (so, in principle this is a model-to-model transformation). The ASTs of Java, C#, and Ruby are available in the tool. If you want to generate other languages you will have to build an AST of the target language yourself or use the plain text support.
The results of transformations are still editable; they are just projections in the end. Changes will also be propagated back to the model. So, you can offer a relational database schema to an expert and let him use his expertise to improve the database schema and thereby your entity model. This only works if your model-to-model transformation is bidirectional (which is only possible if the transformation keeps a 1-to-1 relationship between elements in the source and target languages).
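The 1-to-1 requirement is easy to see in a small sketch. The following Python example is not Intentional code; the dictionary-based entity and table structures, the `TO_SQL` type table and the function names are assumptions for illustration. Because every element on one side maps to exactly one element on the other, the mapping can be inverted, so an edit on the schema side can be carried back to the entity model.

```python
# Hypothetical 1-to-1 mapping between entity types and SQL column types.
TO_SQL = {"string": "VARCHAR", "int": "INTEGER"}
# The inverse only exists because no two entity types share an SQL type.
TO_ENTITY = {sql: dsl for dsl, sql in TO_SQL.items()}


def entity_to_table(entity: dict) -> dict:
    """Forward direction: entity model -> relational schema."""
    return {
        "table": entity["name"],
        "columns": [(attr, TO_SQL[t]) for attr, t in entity["attributes"]],
    }


def table_to_entity(table: dict) -> dict:
    """Backward direction: relational schema -> entity model."""
    return {
        "name": table["table"],
        "attributes": [(col, TO_ENTITY[t]) for col, t in table["columns"]],
    }


person = {"name": "Person", "attributes": [("name", "string"), ("age", "int")]}
schema = entity_to_table(person)

# A database expert edits the schema...
schema["columns"].append(("email", "VARCHAR"))
# ...and the change flows back into the entity model.
updated = table_to_entity(schema)
```

If the forward mapping lost information (for example by mapping two entity types to the same column type), `TO_ENTITY` could no longer be built unambiguously and the backward direction would break.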
Meta-languages are only used when necessary. In most cases they are just extensions of C#. Model validation rules, for example, are specified in almost native C#.
Languages can be split in multiple files. As long as you define your concepts in the same namespace you can refer to them in the same way as you would do if they were in the same file. Mats didn’t explain if Intentional supports package definitions and imports.
The type system seems to be quite strong. You can even define distances between types.
Due to Intentional’s projectional nature it is possible to add the results of calculations directly into your editor. These can for example be used for testing.
Mats finished his session by showing a projection that looks like a functional specification, which makes it very easy for domain experts to be involved in software specification.
At the end of the session a Twitter discussion was started about the use of having multiple projections. Some “soundbites”:
Steven Kelly: Multiple projections often a solution looking for a problem. E.g. human langs don’t want multiple alphabets for 1 lang
Steven Kelly: MetaEdit+ has had multiple projections for 15 years – they don’t get used much. Multiple reprs of each object is used a lot more
Angelo Hulshout: Is that because mult. projection is hard to implement ( ==hard to use), or because people really don’t want need it?
Steven Kelly: I think because maintaining the same thing in two views is like doing it in two places: less work to do just one
Angelo Hulshout: Unless you have multiple people with different backgrounds working on it (at different times or in parallel?) Realistic?
Steven Kelly: Precisely what we thought it would be good for, but in practice one higher-level lang is better
Essential was presented by its creator Pedro Molina. The workbench is based on practical experience and builds on the following principles:
- No model infrastructure friction, use model interpretation to avoid round-trips.
- Target language independence.
- Minimalistic IDE.
- Clear separation of Concerns (much different from Rascal, which is one language to avoid problems with integrating the concerns).
- Easy integration in other development tool chains.
Essential uses continuous model interpretation for error checking. It uses StringTemplate for code generation. Model-to-model transformations are possible with the provided meta-language.
Language composition is possible in the same way as partial classes in C#.
It is possible to load plugins in the IDE that use reflection to analyze the model and can for example generate to different target languages.
The IDE looks very clean (compared to the other textual / projectional tools), or essential if you like, but according to Karsten Thoms that could be because it’s quite new:
Essential shows how simple a language workbench can be at the very beginning, and then the real world requirements kick in…
Mariot Chauvin showed us Obeo Designer. The focus of this workbench is on graphical modeling languages. It’s based on Eclipse and works with an Ecore metamodel.
A nice feature of Obeo Designer is the ability to use filters (or layers) on graphical editors. This is quite useful to grasp certain aspects of big models at a glance. They also provide a symbol editor to create your own graphical elements for your language, including conditional styles. It is property based, and compared to MetaEdit+ you have less freedom.
For code generation you can use the full power of Acceleo, which is available as an open source Eclipse project. Validations can be specified on the metamodel or in the graphical editor, depending on the scope you want.
The audience was pleased to see a well-functioning tool for graphical editors on the Eclipse platform.
Meinte Boersma: Do I insult Obeo when I say that their Designer provides a nice (and working!) abstraction over GMF/GEF?
Etienne Juliot: No, it isn’t an insult;) But it also provides viewpoint features for complex analysis:layers, conditional styles, refine, …
Markus Voelter: Although (I think) it still has some rough edges, Obeo Designer is the first reasonable tool for graphical editors on Eclipse
Pedro Molina: Obeo Designers brings an easy way of creating visual dsl over the EMF stack
Model-to-model transformations can be done using ATL, again an open source Eclipse project.
Multiuser support is available thanks to the Eclipse team support integration. There is no support yet for live collaboration using CDO, but this will be part of the next version, which sounds cool!
Mariot reacted to the earlier Twitter discussion about multiple projections and stated that it is useful to have them. He gave the example of modeling for airplanes: different people aren’t doing the same job, but are still working on the same model. In this case multiple projections of that model can be quite useful.
Pedro Molina makes the comparison with the Visual Studio offering by stating:
First impression: I think Obeo Designer to Eclipse is quite similar to using DSL Tools on top of Visual Studio.
Karsten Thoms finished the day with a demonstration of Xtext. Xtext is a workbench for creating external textual DSLs on the Eclipse platform. It has a tight integration with EMF, opening up a lot of possibilities to integrate with other tools and to, for example, mix graphical and textual languages.
The grammar is specified with a textual language, but railroad diagrams are available and clickable (clicking selects the corresponding grammar elements).
The generation gap pattern is used for mixing generated with manual code. Code generation is done with Xtend 2, which is more Java-like than Xpand, but also has some similarities to it. Xtend 2 definitions are compiled to Java code, against a small framework. Xtend 2 also supports model-to-model transformations.
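For readers who don’t know the generation gap pattern: generated code goes into a base class that is overwritten on every generator run, and manual code goes into a subclass that the generator never touches. With Xtext this would be Java; the sketch below shows the same idea in Python with made-up names (`generate_base_class`, `PersonBase`, `Person`). The `exec` call merely stands in for writing the generated file to disk and importing it.

```python
def generate_base_class(entity_name: str, attributes: list) -> str:
    """Emit source for a base class that is overwritten on every run."""
    params = ", ".join(attributes)
    lines = [f"class {entity_name}Base:", f"    def __init__(self, {params}):"]
    lines += [f"        self.{a} = {a}" for a in attributes]
    return "\n".join(lines) + "\n"


# Generated artifact: never edited by hand.
generated_source = generate_base_class("Person", ["first_name", "last_name"])
namespace = {}
exec(generated_source, namespace)  # stand-in for "write file, then import"
PersonBase = namespace["PersonBase"]


# Hand-written artifact: lives in its own file and survives regeneration.
class Person(PersonBase):
    def full_name(self) -> str:
        return f"{self.first_name} {self.last_name}"
```

Compared to protected regions, the generator never has to parse or merge its previous output; the price is an extra class per generated concept.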
Validation can be done using Java (the Check language, which was part of oAW, is deprecated), which triggered a reaction from Steven Kelly (in 2 tweets):
Xtext implements constraints in Java – I think that’s the first tool today that doesn’t use a DSL for constraints. I would expect a DSL to be better. The Java looks quite verbose. Java as a fallback maybe OK, maybe a DSL for most simple checks?
Xbase, part of the Xtext offering, is a partial programming language implemented in Xtext and is meant to be embedded and extended within other programming languages and domain-specific languages (DSL) written in Xtext. It’s a great starting point for expression languages in your DSLs.
The Xtext workbench is mature, but had quite some big changes in the recent 2.0 version. Karsten showed some numbers to show that Xtext is scalable and performs well for big projects. It is used in practice for many big projects, which is an even better proof than just some numbers.
In the free-style part of the assignment Karsten showed us some nice features like automatic formatting and quick fixes, and told us about the instance DSL compiler, which can for example be used to define tests.
The workshop was great! Really interesting to see all these tools in one day. Markus Voelter agrees:
This workshop is *so much better* than also those “academic” workshops with boring slide shows!
It wasn’t really a competition as these tools have quite different characteristics. The future will tell which characteristics are the most important ones. One conclusion: graphical DSLs work great in demonstrations. To really make a useful decision about the tool you are going to use, you have to “feel” the tool and explore its limits.
The main question nagging me at the end of the day was whether there is a market for 10 different DSL tools / language workbenches. How big is the market? Will it grow? Markus Voelter thinks there should be a market when looking at the diversity of the tools:
Judging from #lwc11, language engineering really isn’t a niche (anymore). There are many diverse tools and approaches…. very nice!
One last observation: it was nice to have a parallel virtual conference on Twitter to directly share opinions without interrupting the presenter. It was also interesting to see the tweet rate increase when a tool was more interesting, new, or impressive.
Thanks to Angelo Hulshout and Software Acumen for organizing this great workshop!