DSL in the context of UML and GPL
With his recent post Microsoft DSLs + UML = ??? Steven Kelly touched on a nice point in the MDA vs. DSL discussion. His blog post is a reaction to Cameron Skinner’s explanation, in his post DSL + UML = Pragmatic Modeling, of Microsoft’s plans to add UML support to their DSL tools.
In my latest article on InfoQ I’ve stated: "I think the proliferation of DSL’s can’t be fully avoided, but it can be made less painful by defining DSL’s as specific instantiations of GPL’s by using meta models. Although they are not real meta models, the use of stereotypes in UML is a good example of a technique for defining a model language for a specific purpose based on a GPL." I’d like to explain this statement a bit more, especially in the context of the discussion between Steven and Cameron. I think these points (UML vs. DSL vs. GPL) are important in designing or using the right model-driven approach (or, if you don’t like the term ‘model-driven’, in raising the language abstraction level ;).
DSLs vs. UML
There has been some speculation about the UML support Microsoft is planning for their Rosario release and how it affects their DSL toolkit. Cameron explains these points a bit more:
"I believe that supporting both [UML and DSL] approaches to modeling gives developers and Architects alike the "right tool for the right job". For those folks who want to analyze and design their architecture using a standard notation that does not imply an implementation decision, use some UML diagrams. UML is great for describing higher level concepts and for defining the initial glossary that can be used to describe the concepts necessary to facilitate broader communication. For those folks who have decided on an implementation strategy, and do not want to be encumbered by the more general nature of the UML to describe that implementation choice, use DSLs."
To summarize, Cameron explains the difference between UML and DSLs:
UML:
- Useful to analyze and design the architecture.
- A standard notation that doesn’t imply an implementation decision.
- Great for describing higher level concepts and the initial glossary.
DSLs:
- Less general than UML.
- Based on an implementation strategy.
Steven Kelly reacts:
"That’s surprising to me. I’d always thought of UML as being more specific about implementation choices than DSLs — e.g. the restriction to an object-oriented language, and the individual classes, fields and methods of an implementation specified directly in the models. DSM [Domain-Specific Modeling] languages are almost invariably on a higher level of abstraction than UML. The only part of UML that is on a high level of abstraction is the Use Case diagram, and that’s only at the total loss of precision."
Cameron adds some more explanation: "In the coming months, you will very likely hear me or others on the team talk about using UML at the "logical" layer and DSLs at the "physical" layer". And from an answer in the comments: "start with a logical class diagram, then transform that into a DSL that is specific to you implementation or domain choice. Said in another way, the logical designers are meant to describe higher level, more general constructs, while the physical more about how to realize those higher level ideas."
To summarize again (combined with the previous statements):
- UML is used at an implementation independent level.
- DSLs are used at an implementation dependent level.
Does this sound familiar? Just compare it to the concepts of the MDA! In essence, Cameron states that UML should be used as the language for defining the PIM (Platform Independent Model) and DSLs should be used as the language for defining the PSM (Platform Specific Model).
Steven doesn’t like this approach: "The DSLs are thus not specific to the problem domain, as they should be, but to the solution domain: they have the implementation concepts of a particular Microsoft framework or library". He adds firmly "Putting UML before DSLs in this way isn’t just putting the cart before the horse: it’s putting the horse firmly into the cart — and pulling it yourself".
One step back: the concepts
Let’s take one step back and look at the concepts discussed above. In principle a software development process can be seen as a mapping between a problem space and a solution space. As visualized in Figure 1, in the problem space domain-specific abstractions are used to create a model, while in the solution space implementation-oriented abstractions are used. The problem space model is often defined using natural language in some sort of requirements document, while the solution space model is defined using a General Purpose Language (GPL) like Java or C#. In that case the mapping is a purely manual process.
Figure 1 – Mapping between problem space and solution space
The Model Driven Architecture (MDA) aims to define automatic mappings between problem and solution space. It therefore introduces a CIM or PIM to model the problem space, and a PSM to represent a model of the solution space. The MDA, defined by the OMG, of course aims at using UML to express the models. Once a PIM has been defined, it can be automatically transformed into a PSM using QVT (a model transformation language defined by the OMG). Sounds great: just define a mapping/transformation using QVT, model the problem space using UML and ready… your PSM is created automatically. However, UML and QVT are quite complex and abstract and require a lot of expertise. I’m a bit skeptical about the MDA approach for two reasons. First, it uses UML as its language, which is big and complex and therefore difficult to learn and use. Second, it focuses on only one modeling dimension (abstract/concrete).
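To make the idea of an automatic PIM-to-PSM mapping a bit more tangible, here is a minimal sketch in Python. It is not QVT and not UML; it only illustrates the principle of a model-to-model transformation followed by code generation. All names (Entity, Attribute, Table, Column, SQL_TYPES, entity_to_table, to_ddl) are hypothetical and chosen for this illustration, and the relational database is just an assumed target platform.

```python
from dataclasses import dataclass

# --- Platform independent model (PIM): no implementation decisions ---
@dataclass(frozen=True)
class Attribute:
    name: str
    type: str  # abstract type: "Text", "Integer" or "Date"

@dataclass(frozen=True)
class Entity:
    name: str
    attributes: tuple

# --- Platform specific model (PSM): assumes a relational database ---
@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str

@dataclass(frozen=True)
class Table:
    name: str
    columns: tuple

# Mapping of abstract types to platform types (an assumption of this sketch).
SQL_TYPES = {"Text": "VARCHAR(255)", "Integer": "INTEGER", "Date": "DATE"}

def entity_to_table(entity):
    """Transform a PIM entity into a PSM table (model-to-model step)."""
    columns = (Column("id", "INTEGER PRIMARY KEY"),) + tuple(
        Column(a.name.lower(), SQL_TYPES[a.type]) for a in entity.attributes
    )
    return Table(entity.name.lower(), columns)

def to_ddl(table):
    """Generate SQL text from the PSM (model-to-text step)."""
    body = ", ".join(f"{c.name} {c.sql_type}" for c in table.columns)
    return f"CREATE TABLE {table.name} ({body});"

policy = Entity("Policy", (Attribute("Holder", "Text"), Attribute("StartDate", "Date")))
print(to_ddl(entity_to_table(policy)))
```

The platform knowledge lives entirely in the transformation (entity_to_table and SQL_TYPES), so the same Policy entity could be mapped to a different platform by swapping that step.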
The second approach mentioned above is that of DSLs. In principle a DSL is an executable specification language that offers a higher abstraction level than GPLs and is restricted to a particular domain. The domain specificity of a language is a matter of degree. Any language has a certain scope of applicability, but some are more focused than others. In general we can distinguish between two types of domains a DSL can be targeted at: (not surprisingly) the problem domain and the solution domain.
DSLs aimed at the solution domain (Steven calls them framework-based DSLs, I prefer ‘System Aspect DSLs’) are usually aimed at a specific system aspect, like the data model, the presentation layer, security, business rules, workflows, etc. They have a much higher level of abstraction than GPLs like Java or C#, but are still applicable to more than one problem domain.
DSLs aimed at the problem domain (I’d like to call them ‘Subject Area DSLs’) are used to directly model the problem space, while they are still directly executable. A DSL for insurance products is an example of such a subject area DSL. The difference between problem and solution space of course depends on your viewpoint.
Steven advocates the use of Subject Area DSLs, because they bring the real rise in abstraction level. I think it depends on the situation. First, because Subject Area DSLs are more specific and therefore applicable in a smaller domain, the costs of designing and implementing the DSL should be weighed against the benefits. Second, although System Aspect DSLs are more generic and make use of solution-oriented abstractions, they are increasingly rising to an abstraction level that domain experts can understand. Examples are a Business Rules DSL, a Workflow DSL, etc. These DSLs are generic and usable for a lot of problem domains, but they still have a much higher level of abstraction than existing programming languages.
And yes… Subject Area DSLs do have a higher level of abstraction, hence you should use them if it is cost effective to design and implement them. But don’t forget: depending on the size of the user community, development of training material, language support, standardization, and maintenance may become serious and time-consuming issues.
Combining the concepts
I definitely agree with Steven (see the citations above) that using UML and DSLs as presented by Cameron isn’t a very good idea. I do think, however, that the worlds of MDE (MDE is broader in scope than MDA: it adds multiple modeling dimensions and a software engineering process) and DSLs aren’t opposites. I think that both DSLs and MDE are necessary assets for model-driven approaches. While multiple DSLs are needed to describe a software artifact (see for example the different architectural aspects of Service-Oriented Business Applications (SOBA)), MDE is needed to provide a framework for connecting the different DSLs. An MDE methodology defines a framework of dimensions and their intersections, thereby defining the different models needed to describe a certain software application. This information also gives us the opportunity to discuss the needed DSLs in a (more or less) formal way. Last but not least, an MDE methodology also describes a software engineering process and a maintenance process, thereby defining the order in which models should be produced, how they are transformed into each other (if applicable) and how to change an existing software system using models.
Another interesting combination is that between GPLs and DSLs. In my latest article on InfoQ I’ve stated: "I think the proliferation of DSL’s can’t be fully avoided, but it can be made less painful by defining DSL’s as specific instantiations of GPL’s by using meta models". This isn’t the same as what Cameron proposes about using UML to model the problem domain and DSLs to model the solution domain. What I suggest is defining DSLs using concepts from an existing GPL, i.e. by piggybacking on or specializing an existing language. In this way the compiler, generator or interpreter for the GPL can be reused for making the DSL executable.
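As a rough illustration of piggybacking on an existing language, the sketch below embeds a tiny DSL for insurance products in Python, so the Python interpreter is reused to make the DSL executable. The vocabulary (product, premium, surcharge, quote) and the pricing rule are invented for this example; they are assumptions, not taken from any real insurance DSL or from the InfoQ article.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    """Hypothetical domain concept: an insurance product with pricing rules."""
    name: str
    base_premium: float = 0.0
    surcharges: list = field(default_factory=list)

    def premium(self, base):
        # DSL word: set the base premium; return self to allow chaining.
        self.base_premium = base
        return self

    def surcharge(self, description, applies, percent):
        # DSL word: add a percentage of the base premium when 'applies' holds.
        self.surcharges.append((description, applies, percent))
        return self

    def quote(self, applicant):
        # Execution is plain Python: no separate parser or generator needed.
        total = self.base_premium
        for _, applies, percent in self.surcharges:
            if applies(applicant):
                total += self.base_premium * percent / 100
        return total

def product(name):
    """Entry point of the embedded DSL."""
    return Product(name)

# A 'program' written in the DSL; it reads close to the domain vocabulary.
car_insurance = (
    product("Car insurance")
    .premium(400.0)
    .surcharge("young driver", lambda a: a["age"] < 25, percent=50)
    .surcharge("city resident", lambda a: a["city"], percent=10)
)

print(car_insurance.quote({"age": 22, "city": True}))   # 640.0
print(car_insurance.quote({"age": 40, "city": False}))  # 400.0
```

The trade-off of this style is that the DSL inherits the syntax and tooling of its host language, which saves implementation effort but limits how far the notation can move toward the domain expert.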
Just to make the idea clear I’ve also stated: "Although they are not real meta models, the use of stereotypes in UML is a good example of a technique for defining a model language for a specific purpose based on a GPL". However, I don’t recommend using UML as the backbone for your language. It won’t make your life easier when it comes to making your DSL executable. A better example is using BPEL and BPMN for defining a domain-specific process modeling language (see the last part of the article). I think that using generic / common standards can help in defining DSLs and making them executable and (maybe) even portable. Not by using all kinds of model transformations, but by using a common meta-(meta-)model defining the generic concepts (i.e. an ontology) to be used in the DSL.