DSL in the context of UML and GPL

August 20, 2008 Johan Den Haan 7 comments

With his recent post Microsoft DSLs + UML = ??? Steven Kelly touched on a nice point in the MDA vs. DSL discussion. His blog post is a reaction to Cameron Skinner’s explanation of Microsoft’s plans to add UML support to their DSL tools, in his post DSL + UML = Pragmatic Modeling.

In my latest article on InfoQ I’ve stated: "I think the proliferation of DSL’s can’t be fully avoided, but it can be made less painful by defining DSL’s as specific instantiations of GPL’s by using meta models. Although they are not real meta models, the use of stereotypes in UML is a good example of a technique for defining a model language for a specific purpose based on a GPL." I’d like to explain this statement a bit more, especially in the context of the discussion between Steven and Cameron. I think these points (UML vs. DSL vs. GPL) are important in designing or using the right model-driven approach (or, if you don’t like the term ‘model-driven’, in raising the language abstraction level ;).

DSLs vs. UML

Some speculations were going around about the UML support Microsoft is planning in their Rosario release and how this affects their DSL toolkit. Cameron explains these points a bit more:

"I believe that supporting both [UML and DSL] approaches to modeling gives developers and Architects alike the "right tool for the right job". For those folks who want to analyze and design their architecture using a standard notation that does not imply an implementation decision, use some UML diagrams. UML is great for describing higher level concepts and for defining the initial glossary that can be used to describe the concepts necessary to facilitate broader communication. For those folks who have decided on an implementation strategy, and do not want to be encumbered by the more general nature of the UML to describe that implementation choice, use DSLs."

To summarize, Cameron explains the difference between UML and DSLs:

UML:

Useful to analyze and design the architecture.
A standard notation that doesn’t imply an implementation decision.
Great for describing higher level concepts and the initial glossary.

DSLs:

Less general than UML.
Based on an implementation strategy.

Steven Kelly reacts:

"That’s surprising to me. I’d always thought of UML as being more specific about implementation choices than DSLs — e.g. the restriction to an object-oriented language, and the individual classes, fields and methods of an implementation specified directly in the models. DSM [Domain-Specific Modeling] languages are almost invariably on a higher level of abstraction than UML. The only part of UML that is on a high level of abstraction is the Use Case diagram, and that’s only at the total loss of precision."

Cameron adds some more explanation: "In the coming months, you will very likely hear me or others on the team talk about using UML at the "logical" layer and DSLs at the "physical" layer". And from an answer in the comments: "start with a logical class diagram, then transform that into a DSL that is specific to you implementation or domain choice. Said in another way, the logical designers are meant to describe higher level, more general constructs, while the physical more about how to realize those higher level ideas."

To summarize again (combined with the previous statements):

UML is used at an implementation independent level.
DSLs are used at an implementation dependent level.

Does this sound familiar? Just compare it to the concepts of the MDA! In principle, Cameron states that UML should be used as the language for defining the PIM (Platform Independent Model) and DSLs should be used as the language for defining the PSM (Platform Specific Model).

Steven doesn’t like this approach: "The DSLs are thus not specific to the problem domain, as they should be, but to the solution domain: they have the implementation concepts of a particular Microsoft framework or library". He adds firmly "Putting UML before DSLs in this way isn’t just putting the cart before the horse: it’s putting the horse firmly into the cart — and pulling it yourself".

One step back: the concepts

Let’s take one step back and look at the concepts discussed above. In principle, a software development process can be seen as a mapping between a problem space and a solution space. As visualized in Figure 1, domain-specific abstractions are used to create a model in the problem space, while implementation-oriented abstractions are used in the solution space. The problem space model is often defined using natural language in some sort of requirements document, while the solution space model is defined using a General Purpose Language (GPL) like Java or C#. The mapping between the two is a manual process.

Figure 1 – Mapping between problem space and solution space

The Model Driven Architecture (MDA) aims to define automatic mappings between problem and solution space. It therefore introduces a CIM (Computation Independent Model) or PIM to model the problem space, and a PSM to represent a model of the solution space. The MDA, defined by the OMG, of course aims at using UML to express the models. Once a PIM has been defined, it can be automatically transformed into a PSM using QVT (a model transformation language defined by the OMG). Sounds great: just define a mapping/transformation using QVT, model the problem space using UML and you’re done… your PSM is created automatically. However, UML and QVT are quite complex and abstract and require a lot of expertise. I’m a bit skeptical about the MDA approach for two reasons. First, it uses UML as its language, which is big and complex and therefore difficult to learn and use. Second, it focuses on only one modeling dimension (abstract/concrete).
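
To make the PIM-to-PSM idea concrete, here is a minimal sketch in plain Python. It is not QVT and not UML: the classes PimEntity and PsmTable, the type mapping and the function names are all assumptions invented for illustration. It only shows the shape of the idea, namely a platform-independent description that is turned into a platform-specific one (here a relational table) by a fixed mapping rule.

```python
from dataclasses import dataclass, field

# Hypothetical platform-independent model (PIM): entities with abstract types.
@dataclass
class PimAttribute:
    name: str
    type: str  # abstract type: "Text", "Integer" or "Date"

@dataclass
class PimEntity:
    name: str
    attributes: list = field(default_factory=list)

# Hypothetical platform-specific model (PSM): relational tables.
@dataclass
class PsmColumn:
    name: str
    sql_type: str

@dataclass
class PsmTable:
    name: str
    columns: list = field(default_factory=list)

# Mapping rule, assumed for illustration: one abstract type -> one SQL type.
TYPE_MAP = {"Text": "VARCHAR(255)", "Integer": "INTEGER", "Date": "DATE"}

def pim_to_psm(entity):
    """Transform one PIM entity into a PSM table, adding a surrogate key."""
    columns = [PsmColumn("id", "INTEGER PRIMARY KEY")]
    for attr in entity.attributes:
        columns.append(PsmColumn(attr.name.lower(), TYPE_MAP[attr.type]))
    return PsmTable(entity.name.lower(), columns)

def to_ddl(table):
    """Render the PSM table as a CREATE TABLE statement."""
    cols = ", ".join(f"{c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} ({cols});"

customer = PimEntity(
    "Customer",
    [PimAttribute("Name", "Text"), PimAttribute("BirthDate", "Date")],
)
ddl = to_ddl(pim_to_psm(customer))
print(ddl)
```

Even in this toy form, the transformation only moves along one dimension (abstract to concrete), which is the limitation mentioned above.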

The second approach mentioned above is that of DSLs. In principle, a DSL is an executable specification language that offers a higher abstraction level than GPLs and is restricted to a particular domain. The domain specificity of a language is a matter of degree. Any language has a certain scope of applicability, but some are more focused than others. In general we can distinguish between two types of domains a DSL can target: (not surprisingly) the problem domain and the solution domain.

DSLs aimed at the solution domain (Steven calls them framework-based DSLs, I prefer ‘System Aspect DSLs’) are usually aimed at a specific system aspect, like the data model, the presentation layer, security, business rules, workflows, etc. They have a much higher level of abstraction than GPLs like Java or C#, but are still applicable to more than one problem domain.

DSLs aimed at the problem domain (I’d like to call them ‘Subject Area DSLs’) are used to directly model the problem space, while they are still directly executable. A DSL for insurance products is an example of such a Subject Area DSL. The difference between problem and solution space of course depends on your viewpoint.
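
As an illustration of what a directly executable Subject Area DSL could look like, here is a small sketch of a made-up insurance product language with a tiny interpreter in Python. The syntax (product, base, surcharge ... when ...), the product TravelBasic and all amounts are hypothetical and not taken from any real tool. The point is only that the model is written in the vocabulary of the problem domain and can still be executed.

```python
import operator

# A model written in the hypothetical insurance DSL.
DSL_TEXT = """
product TravelBasic
base 20
surcharge 15 when age > 65
surcharge 10 when days > 30
"""

# Comparison operators the DSL supports.
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}

def parse(text):
    """Parse the product DSL into a plain dictionary."""
    product = {"name": None, "base": 0.0, "surcharges": []}
    for line in text.strip().splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "product":
            product["name"] = words[1]
        elif words[0] == "base":
            product["base"] = float(words[1])
        elif words[0] == "surcharge":
            # Form: surcharge <amount> when <field> <op> <value>
            amount, _, fld, op, value = words[1:6]
            product["surcharges"].append(
                (float(amount), fld, OPS[op], float(value))
            )
        else:
            raise ValueError(f"unknown statement: {line!r}")
    return product

def premium(product, **facts):
    """Execute the model: base price plus every surcharge whose condition holds."""
    total = product["base"]
    for amount, fld, op, value in product["surcharges"]:
        if op(facts[fld], value):
            total += amount
    return total

travel = parse(DSL_TEXT)
print(premium(travel, age=70, days=10))  # 35.0
print(premium(travel, age=30, days=10))  # 20.0
```

Note that even this toy language needs its own parser and interpreter, which is exactly the design and implementation cost that has to be weighed against the benefits.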

Steven advocates the use of Subject Area DSLs, because they bring the real rise in abstraction level. I think it depends on the situation. First, because Subject Area DSLs are more specific and therefore applicable in a smaller domain, the costs of designing and implementing the DSL should be weighed against the benefits. Second, although System Aspect DSLs are more generic and make use of solution-oriented abstractions, they are increasingly rising to an abstraction level understandable for domain experts; think of a Business Rules DSL or a Workflow DSL. These DSLs are generic and usable for a lot of problem domains, but they still have a much higher level of abstraction than existing programming languages.

And yes… Subject Area DSLs do have a higher level of abstraction, hence you should use them if it is cost-effective to design and implement them. But don’t forget: depending on the size of the user community, development of training material, language support, standardization, and maintenance may become serious and time-consuming issues.

Combining the concepts

I definitely agree with Steven (see the citations above) that using UML and DSLs as presented by Cameron isn’t a very good idea. I do, however, think that the worlds of MDE and DSLs aren’t opposites (MDE is broader in scope than MDA: it adds multiple modeling dimensions and a software engineering process). I think that both DSLs and MDE are necessary assets for model-driven approaches. While multiple DSLs are needed to describe a software artifact (see for example the different architectural aspects of Service-Oriented Business Applications (SOBA)), MDE is needed to provide a framework for connecting the different DSLs. An MDE methodology defines a framework of dimensions and their intersections, thereby defining the different models needed to describe a certain software application. This information also gives us the opportunity to discuss the needed DSLs in a (more or less) formal way. Last but not least, an MDE methodology also describes a software engineering process and a maintenance process, thereby defining the order in which models should be produced, how they are transformed into each other (if applicable) and how to change an existing software system using models.

Another interesting combination is that between GPLs and DSLs. In my latest article on InfoQ I’ve stated: "I think the proliferation of DSL’s can’t be fully avoided, but it can be made less painful by defining DSL’s as specific instantiations of GPL’s by using meta models". This isn’t the same as what Cameron proposes about using UML to model the problem domain and DSLs to model the solution domain. What I suggest is defining DSLs using concepts from an existing GPL, i.e. by piggybacking on or specializing an existing language. In this way the compiler, generator or interpreter for the GPL can be reused for making the DSL executable.
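
A rough sketch of the piggybacking idea, using Python as the host GPL: the DSL "keyword" below is just a decorator, so the rules are ordinary Python functions and the existing Python interpreter makes them executable. The names rule, RULES and evaluate, as well as the example rules, are hypothetical and only meant to illustrate the technique. It is a loose analogy and not the meta-model based approach itself.

```python
# Registry of all rules declared with the hypothetical DSL keyword @rule.
RULES = []

def rule(description):
    """Register a plain Python function as a business rule."""
    def register(fn):
        RULES.append((description, fn))
        return fn
    return register

# The "DSL program": each rule reuses the host language's functions and expressions.
@rule("orders above 1000 need approval")
def needs_approval(order):
    return order["total"] > 1000

@rule("rush orders ship by air")
def ships_by_air(order):
    return order["rush"]

def evaluate(order):
    """Return the descriptions of all rules that fire for the given order."""
    return [description for description, fn in RULES if fn(order)]

fired = evaluate({"total": 1500, "rush": False})
print(fired)
```

No separate parser, compiler or interpreter had to be written here; the price is that the DSL is limited to what the host language's syntax allows.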

Just to make the idea clear I’ve also stated: "Although they are not real meta models, the use of stereotypes in UML is a good example of a technique for defining a model language for a specific purpose based on a GPL". However, I don’t recommend using UML as the backbone for your language. It won’t make your life any easier when it comes to making your DSL executable. A better example is using BPEL and BPMN for defining a domain-specific process modeling language (see last part of the article). I think that using generic / common standards can help in defining DSLs and making them executable and (maybe) even portable. Not by using all kinds of model transformations, but by using a common meta-(meta-)model defining the generic concepts (i.e. ontology) to be used in the DSL.

7 Comments Added

Amr August 20, 2008
This is indeed a useful observation. UML can be considered a DSL that really abstracts away low-level implementation issues. But given the right model transformation, you would not need UML; you could instead design executable DSLs that are transformed into a GPL.
Steven Kelly August 21, 2008
Hi Johan, and thanks for an interesting article. It’s always reassuring to see smart people coming to the same conclusions as I have. I agree there are situations where framework-based DSLs make sense, and that one of the major criteria for such situations is the cost-benefit analysis. This analysis is very different if you want to sell your DSL (as you do at Mendix) rather than use it yourself. I’ve written more in a new blog entry, DSLs: to buy, to build or to sell? – http://www.metacase.com/blogs/stevek/blogView?showComments=true&entry=3396789364
Franco August 22, 2008
Methods & Tools has also just published in its summer issue an interesting article that puts the DSL and UML approaches in perspective: http://www.methodsandtools.com/archive/archive.php?id=71
Sean Walsh September 24, 2008
Johan – Very interesting article. We provide a Domain Specific Modeling Language for the domains of Rich Internet Applications and Web services. It’s clear to us that this is a much more productive way to visualize requirements, build the application – especially if being done offshore – and maintain/update it over time.
Richard Wood June 6, 2009
Hello Johan
Thanks a lot for all the insights you are sharing with us.
They are really helpful to me. My post here is a bit out of date but I thought to stick with the context of UML as DSL.
To be more precise UML+UML profile = DSL as that is how we are using it. Mostly out of the obvious reasons why we use UML at all for modelling. The point I’m making here is another. As soon as there is more than one level of transformation it is better to remove UML after the first level and continue with a "UML free" DSL. The following transformations are UML independent and simpler to write.
Which means there are two types of models. Input and generated.
It also means that the first transformation can be a purely "technical" one where a DSL can be mapped to UML and a UML profile. A model expressed in UML + UML profile can then be transformed into a model expressed in that DSL.
I am currently writing a "meta" transformation which transforms a DSL into a UML profile and hopefully also generates a transformation from a UML model back to a DSL instance. To my knowledge there are no existing tools doing this.
What are your thoughts on this approach?
Cheers
Richard
Johan den Haan June 9, 2009
Hi Richard,
Thanks for sharing your thoughts and practice too!
Your approach sounds interesting. As you may have seen in my posts, I’m not really a fan of using UML for Model Driven Development. However, the approach you take (as far as I understand it from your short description) can be very suitable for two reasons:
1. It bridges existing UML modeling with DSLs. If people are already modeling in UML it can be better to use small steps in the direction of using DSLs instead of just throwing away all UML.
2. Tools. A lot of good UML tools exist. With DSLs you need to build your own tools (especially with graphical DSLs). Building such tools is quite an effort, so sticking to UML and UML profiles can be quite attractive.
You’re writing a meta transformation from a DSL to a UML profile. Does your DSL make use of a standardized metamodel? Or do you write a transformation for each specific DSL?
Richard Wood June 11, 2009
Hello Johan
Well, we have to use UML. Company standard. So no choice there. But there are some plus points, as you mentioned in your articles.
Our DSL is defined by its metamodel (abstract syntax). To be specific we use the Eclipse Ecore format as meta language to define that metamodel. The “meta” transformation can be applied for any metamodel in Ecore format. Besides the metamodel the transformation also requires mapping information from DSL concepts to UML concepts. For example a business object is mapped to a UML class. This results in the generation of a UML profile stereotype with the name “business object” and extending from UML class. Stereotype inheritance and associations can be generated in a similar way. It seems a complicated process which is the reason for not doing it manually.
Regards
Richard