EECE 571F= Domain-Specific Languages

This is a page from yet
another great ECE, UBC subject.

Lectures (1)

Domain-specific languages (DSLs)

Making languages bear-able:

For I am a bear of very little brain and long words confuse me.

Milne 1926

General purpose languages (GPLs) are a Good Thing.

But DSLs can be better:

A natural vocabulary for the concepts that are fundamental to the problem domain.
A faster way to write common concepts.

Languages are written for a purpose, for an audience.

Really, should we expect that they generalize to all purposes and all audiences?
Explanation is a domain-specific, task-specific, user-specific inference [Wick & Thompson 1992].
High-level descriptions that are crystal clear to you can be opaque to me [Clancey 1983].

Given the wrong language, you can't see the solution since the language prevents you from describing it:

Syme, from Orwell's 1984 [Orwell, 1949]: "...the whole aim of Newspeak is to narrow the range of thought. In the end we shall make thoughtcrime literally impossible, because there will be no words in which to express it. ... The revolution will be complete when the language is perfect."
This sort of Newspeak appears not only in fiction. In 1918, National Geographic ran an article "How to be an American" describing how new immigrants were learning the ways of the good-old-USA. To a new immigrant in a strange new country, a missing piece of language in the following picture might be the point of the whole lesson.

The sexist sub-text of the lesson might easily be, in America, the place for women-folk is in the kitchen. Note the convenient political implications of such a message (if you were male). If women could be convinced that their place was only in the kitchen, then that would prevent them challenging the men-folk's rule in the new land. But I can't say anything more on this since I have to rush off and make my wife a cup of tea.

Anyway, forgetting politics and returning to computer science, we can say this: control the language and you control what is easy or convoluted to say.

Different representations of the same concepts can make certain inferences easier.

When you confront a (system) that you know nothing of,... your problem is how to assign interpretations to its symbols in a meaningful way...:

You may make several tentative stabs in the dark before finding a good set of words to associate with the symbols.
It is very similar to attempts to crack a code, or to decipher inscriptions in an unknown language...
When you hit a right choice... all of a sudden things just feel right, and work speeds up enormously.
Pretty soon everything falls into place
[Hofstadter, 1980, p50].

James Martin, Design of Real-time Computer Systems, 1967:

We must develop languages that the scientist, the architect, the teacher, and the layman can use without being computer experts.
The language for each user must be as natural as possible to him.
The statistician must talk to his terminal in the language of statistics.
The civil engineer must use the language of civil engineering.
When a man (sic) learns his profession he must learn the problem-oriented languages to go with that profession.

In planning, for example, the initial approach was to use a general-purpose logical theorem prover to construct plans:

When this proved intractable, the solution was the development of a more special-purpose representation --- the STRIPS representation of actions (of Fikes & Nilsson).
The STRIPS representation was designed to take advantage of the temporal structure of actions, particularly the fact that an action can be viewed as changing only a few aspects of the current situation.
This supported a special-purpose reasoning algorithm --- goal regression --- which was much more efficient than general purpose theorem proving.
Note that, while the resulting representation and reasoning algorithm are far less general than full first-order logic and theorem proving, STRIPS-style planning systems (particularly with some of the subsequent enhancements both to the language and the inference algorithm) are useful in a very wide range of applications.
Daphne Koller: Structured Representations and Intractability

Given the right language, problem solving becomes easier [Hofstadter, 1980, p286]:

Chess experts perceive chess boards at a different level to novices.
Experts don't think deeper than novices; rather, they just don't waste as much time on bad ideas.
An expert's set of concepts differs from a novice's.
These concepts, these feature extractors, act like a "filter": our expert literally does not see bad moves, any more than chess novices see illegal moves.
The result is massive implicit pruning of the search space [Larkin et al., 1980].

High-level DSL nouns and verbs can be based on these expert feature extractors.

Procedural code senses the environment and so implements the feature extractors.
Declarative code describes the logical glue that combines the feature extractors.

E.g.

if Knight at Pos does fork(PieceA,PieceB) and no other current goals then move Knight to Pos.

E.g. Definition of a feature extractor that recognizes an unusual run of temperatures inside a piggery (example adapted from [Menzies et al., 1988]).

    % a pig is too warm if for more than 10 hours
    % the piggery is above 23 degrees (Celsius)
    defProblem( warmShed, temperature, 23, 10, 'shed too hot' ).

With this defined, a DSL for farm management could understand a rule such as

if warmShed and ... then ...
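
One way such a feature extractor might be implemented is sketched below. This is only a guess at the machinery behind defProblem/5: the reading/3 log, the problem/2 predicate and the longest_run/5 helper are all hypothetical names, and the sketch assumes one reading per hour with no gaps.

```prolog
% Hypothetical sensor log, one entry per hour:
%   reading(Measure, Hour, Value).
:- dynamic reading/3.

% The definition from the text.
defProblem(warmShed, temperature, 23, 10, 'shed too hot').

% problem(Name, Message): Measure stayed above Limit for a run
% of more than Hours consecutive (hourly) readings.
problem(Name, Message) :-
    defProblem(Name, Measure, Limit, Hours, Message),
    findall(Hour-Value, reading(Measure, Hour, Value), Log),
    keysort(Log, ByHour),
    longest_run(ByHour, Limit, 0, 0, Longest),
    Longest > Hours.

% longest_run(+Log, +Limit, +CurrentRun, +BestSoFar, -Best)
longest_run([], _, Run, Best0, Best) :-
    Best is max(Run, Best0).
longest_run([_-Value|Rest], Limit, Run0, Best0, Best) :-
    (   Value > Limit
    ->  Run1 is Run0 + 1, Best1 = Best0
    ;   Run1 = 0, Best1 is max(Run0, Best0)
    ),
    longest_run(Rest, Limit, Run1, Best1, Best).
```

With a definition like this loaded, warmShed in a rule condition could simply be mapped onto a call to problem(warmShed, _). The farmer only ever sees the one-line defProblem fact.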

About DSLs

Domain-specific language (DSL):

A domain-specific language (DSL) is a programming language or executable specification language that offers, through appropriate notations and abstractions, expressive power focused on, and usually restricted to, a particular problem domain (from Domain-Specific Languages: An Annotated Bibliography).

A programming language dedicated to a particular domain or problem.

Provides appropriate built-in abstractions and notations;

Usually small (offering only a restricted suite of notations and abstractions).

Usually more declarative than imperative.

Often viewed as a specification language, not just an imperative language (the focus of this subject).

Less expressive than a general-purpose language (GPL).

DSLs have also been called:

little languages [Bentley, 1988]
micro-languages,
application languages,
very high level languages.

Idiom support: some concepts in general purpose languages require idioms, idioms, idioms.

Idioms= Methods imposed by programmers to handle common forms, procedures.
- E.g. Ensure data is saved before the window is closed.
- E.g. Before conducting expensive tests, perform cheap tests that can rule out need for expensive tests.
No automatic support to check the idioms.

Timm's DSL Pre-conditions

The 1 day rule:

Users can get productive with the DSL in 1 day.
Not all users, just some users.
- Just the users who had the DSL created.
- Implies that DSL is not just high-level programming constructs;
- But constructs for an audience (did you understand the above examples? No matter: someone else does).

The elbow test:

Users elbow the analyst out of the way in their haste to get to the screen to change something that is obviously wrong to them.
Implies rapid comprehension of sentences in the DSL.

Example #1: SQL

Domain abstractions and notations include tables and operations on tables (such as join and sort).
Simple interface to running and controlling table updates.
Simple manipulation mechanisms, that cover a lot of common usages (e.g. table creation)

Example #2: UNIX Shell Script

Domain abstractions and notations include streams (such as stdin and stdout) and operations on streams (such as redirection and pipe).
Simple interface to running and controlling processes.
Simple control-flow and string manipulation mechanisms, that cover a lot of common usages.
Even though they are Turing-complete, they differ from GPLs:
- Unsuitable to manipulate, e.g., complex data structures
- In practice, they are used as a "gluing" facility for weaving together other functions; e.g.
  - AWK (or gawk)
  - sort
  - ls

    /home/site> ls -lsa
    total 155
       4 drwxr-xr-x   2 timm  None   4096 Aug 29 00:05 .
       4 drwxr-xr-x   3 timm  None   4096 Aug 28 23:10 ..
       5 -rw-r--r--   1 timm  None   4630 Apr 20  1999 BlueWa2.gif
       4 -rw-r--r--   1 timm  None   3861 Aug 29 07:44 assignments.html
       3 -rw-r--r--   1 timm  None   2919 Jan 11  1997 bgrnd9.gif
       5 -rw-r--r--   1 timm  None   4581 Apr 20  1999 bolts.gif
       4 -rw-r--r--   1 timm  None   3758 Dec 29  1997 comhap.gif
       1 -rw-r--r--   1 timm  None    586 Aug 28 23:23 culled.pl
       6 -rw-r--r--   1 timm  None   6073 May  8  2000 flower.gif
      10 -rw-r--r--   1 timm  None   9333 May  8  2000 flowers.gif
       6 -rw-r--r--   1 timm  None   5262 Dec  2  2000 hacker.gif
       9 -rw-r--r--   1 timm  None   8453 Aug 29 07:44 index.html
       1 -rw-r--r--   1 timm  None    955 Aug 28 15:33 join.gif
       3 -rw-r--r--   1 timm  None   2302 Aug 29 07:44 l2.html
       4 -rw-r--r--   1 timm  None   3110 Aug 29 07:44 l3.html
       3 -rw-r--r--   1 timm  None   2302 Aug 29 07:44 l4.html
      12 -rw-r--r--   1 timm  None  11635 Aug 29 07:44 lectures.html
      23 -rw-r--r--   1 timm  None  22885 Aug 29 07:44 lectures.pl
      22 -rw-r--r--   1 timm  None  21997 Aug 28 23:17 lectures.pl~
       6 -rw-r--r--   1 timm  None   5871 Aug 29 07:44 refs.html
       4 -rw-r--r--   1 timm  None   3251 Aug 29 07:44 resources.html
       3 -rw-r--r--   1 timm  None   2840 Aug 29 07:44 rules.html
      12 -rw-r--r--   1 timm  None  12175 Aug 28 23:08 sql.png
       1 -rw-r--r--   1 timm  None    206 Dec  8  2000 up.gif

Pre-processor.

    /home/site> ls -lsa | gawk '{print $6}'
    4096
    4096
    4630
    3861
    2919
    4581
    3758
    586
    6073
    9333
    5262
    8453
    955
    2302
    3110
    2302
    11635
    22885
    21997
    5871
    3251
    2840
    12175
    206

Histogram generation:

    gawk 'NR > 1 {n[int($1/size)]++}       # count the values falling in each bucket
          END    {for(i in n) {
                     print string(width-n[i]," "),\
                           string(n[i], "*"),\
                           i*size + size/2 }}
          function string(n,c,    i,s) {   # returns n copies of c
              i=n
              while ( i-- ) {s= s c}
              return s }' width=$1 size=$2 $3 |
    sort -k2,2n

And finally...

    /home/site> ls -lsa | gawk '{print $6}' | sh buckets.sh 20 1000
                      *** 500
                     **** 2500
                     **** 3500
                     **** 4500
                       ** 5500
                        * 6500
                        * 8500
                        * 9500
                        * 11500
                        * 12500
                        * 21500
                        * 22500

More examples

See the sample frame and rule-based languages.

DSL Advantages

Adopting a DSL approach to software engineering involves both risks and opportunities. The well-designed DSL manages to find the proper balance between these two. DSLs are:

Less comprehensive than general-purpose languages like C or Java.
Much more expressive in their domain.

As a result, they have properties that are crucial for the software industry:

DSLs allow solutions to be expressed in the idiom and at the level of abstraction of the problem domain. Consequently, domain experts themselves can understand, validate, modify, and often even develop DSL programs.
DSL programs are concise, self-documenting to a large extent, and can be reused for different purposes.
DSLs embody domain knowledge, and thus enable the conservation and reuse of this knowledge.
DSLs allow validation at the domain level. It becomes possible, or much easier, to automate formal proofs of critical properties of the software: security, safety, real time, etc. We only need to test that the language constructs ensure (e.g.) safe operation; then any sentence written in that language is also safe.
DSLs enhance productivity, reliability, maintainability, and portability. Do I have figures on these? Well... (counter-example: the Oak Ridge study, where the RLL team spent so much time fiddling with their language that they never wrote any sentences in that language [Johnson & Jordan, 1988]).
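
To make the "validation at the domain level" point a little more concrete, here is a rough sketch of a checker for the rule notation that appears later in these notes (see "Prolog operators"). It accepts a rule only if it is built from known connectives, known comparison operators and known attributes. The attribute/1 table and the valid/1, valid_condition/1 and valid_action/1 predicates are hypothetical names invented for this illustration; they are not part of dsl1.pl.

```prolog
:- op(999, xfx, if).
:- op(998, xfx, then).
:- op(997, xfy, or).
:- op(996, xfy, and).
:- op(995, fy,  not).
:- op(700, fx,  say).
:- op(700, xfx, [upto,below,equals,over,downto]).

% Hypothetical table of the attributes this domain knows about.
attribute(age).
attribute(name).
attribute(sex).

% valid(+Rule): Rule uses only known constructs and attributes.
valid(_Id if Condition then Action) :-
    valid_condition(Condition),
    valid_action(Action).

valid_condition(V) :- var(V), !, fail.      % reject unbound conditions
valid_condition(A and B) :- !, valid_condition(A), valid_condition(B).
valid_condition(A or B)  :- !, valid_condition(A), valid_condition(B).
valid_condition(not A)   :- !, valid_condition(A).
valid_condition(Test) :-                    % e.g. age below 30
    Test =.. [Op, Attribute, _Value],
    memberchk(Op, [upto,below,equals,over,downto]),
    attribute(Attribute).

valid_action(say _).

% ?- valid(r1 if age below 30 and name equals ying then say hello(ying)).
%    succeeds
% ?- valid(r2 if salary over 100 then say rich).
%    fails: salary is not a known attribute
```

The checker is a dozen lines because the language is small. Doing the same for arbitrary C or Java would be a research project.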

These advantages have drawn the attention of:

Rapidly evolving markets where there is a need for building families of similar software, e.g., product lines
Markets where reactivity or software certification are critical:
1. Internet,
2. cellular phones,
3. smart cards,
4. electronic commerce,
5. embedded systems,
6. bank ATM,
7. ...

Some companies have indeed started to use DSLs in their development process:

ATT,
Lucent Technologies,
Motorola,
Philips,
...

However:

Most approaches and techniques are still ad hoc.
Researchers are actively studying those issues (see the conferences list).

DSL Disadvantages

The disadvantages of the use of a DSL are:

You're going to let end-users code? Are you insane?
1. Not end-end-users. Just the business analysts who, traditionally, would have provided the paper specification of the system.
2. Also, once the DSL is written -> use treatment learners to find key features/flaws with a system.
The costs of designing, implementing and maintaining a DSL.
-> better using logic programming.
The costs of education for DSL users.
-> demand the elbow test.
The limited availability of DSLs.
-> more DSLs if they can be built faster/cheaper.
The potential loss of efficiency when compared with hand-coded software.
-> Include a compiler with your DSL interpreter
The difficulty of finding the proper scope for a DSL.
-> ???
The difficulty of balancing between domain-specificity and general-purpose programming language constructs.
-> ???

Premise of this subject

A general framework for the cost-effective DSL is:

Define languages using Prolog operators.
Write DSL sentences as Prolog facts.
Interpret the Prolog facts using Prolog interpreters.
Optimize the interpreter using compilers.
Conduct what-if queries across DSL using stochastic abduction.
Implement the compilers and stochastic abduction via Prolog meta-interpreters.
Log the results of what-if queries, and use treatment learners to find methods to control the DSL program.

Note: not the only way to build a DSL:

Just (hopefully) the least work to get the most done.

Prolog operators

(NOTE: the following example comes from dsl1.pl.)

How NOT to write a DSL:

    :- dynamic used/1,   % refraction memory
               val/2.    % working memory

    say(X) :- print(X),nl.

    and(A,B) :- A,B.
    or(A,B)  :- A;B.

    in(A, to(B,C)) :- val(A,X), X >= B, X < C.
    upto(A,B)      :- val(A,X), X =< B.
    below(A,B)     :- val(A,X), X < B.
    equals(A,B)    :- val(A,B).
    over(A,B)      :- val(A,X), X > B.
    downto(A,B)    :- val(A,X), X >= B.

    ugly(if(r1,
            then(or(and(below(age, 30), equals(name, ying)),
                    not(and(equals(sex, male), downto(age, 40)))),
                 say(hello(ying))))).

    yuck :-
        ugly(if(Id,then(Condition,Action))),
        not used(Id),
        Condition, !,
        print(firing(Id)), nl,
        assert(used(Id)),
        Action.

This works:

    ?- retractall(used(_)),yuck.
    firing(r1)
    hello(ying)

    Yes
    ?-

But it is so darn UGLY!

So, let's tell the Prolog compiler about some infix and prefix operators:

    :- op(999, xfx, if).
    :- op(998, xfx, then).
    :- op(997, xfy, or).
    :- op(996, xfy, and).
    :- op(995, fy,  not).
    :- op(700, fx,  say).
    :- op(700, xfx, [upto,below,equals,over,downto,in]).
    :- op(1,   xfx, to ).

Now we can write:

    r1 if   age below 30 and name equals 'ying'
            or not (sex equals male and age downto 40 )
       then say hello(ying).

Which, by the way, has the same internal form as ugly/1 (shown above):

    ?- X if Y then Z, write_canonical(X if Y then Z).
    if(r1, then(or(and(below(age, 30), equals(name, ying)), not(and(equals(sex, male), downto(age, 40)))), say(hello(ying))))

    X = r1
    Y = age below 30 and name equals ying or not (sex equals male and age downto 40)
    Z = say hello(ying)

    Yes
    ?-
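
The rule above never exercises the in and to operators, so here is a small sketch of them at work. The operator declarations and the in/2 clause are copied from the listings above; the val(age, 25) fact is a made-up working-memory entry used only for this illustration.

```prolog
:- op(700, xfx, in).
:- op(1,   xfx, to).

:- dynamic val/2.              % working memory, as in the text

% Range test from the text: B =< (value of A) < C.
in(A, to(B,C)) :- val(A,X), X >= B, X < C.

val(age, 25).                  % hypothetical working-memory fact

% ?- age in 20 to 30.          % succeeds
% ?- age in 30 to 40.          % fails
% ?- (age in 20 to 30) == in(age, to(20,30)).
%    succeeds: the operators change the syntax, not the term
```

Because to has priority 1, it binds tighter than everything else, so "20 to 30" is read as a single argument of in.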

Prolog facts

So now, all our DSL "programs" become Prolog facts expressed using our pretty operators.
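
A small sketch of what that buys us: since rules and working memory are both ordinary facts, ordinary queries can inspect them. The val/2 values and the rule_ids/1 helper below are hypothetical, added only for illustration; the operators and the rule come from the text.

```prolog
:- op(999, xfx, if).
:- op(998, xfx, then).
:- op(997, xfy, or).
:- op(996, xfy, and).
:- op(995, fy,  not).
:- op(700, fx,  say).
:- op(700, xfx, [upto,below,equals,over,downto,in]).

:- dynamic val/2.

% Working memory (made-up values).
val(age,  25).
val(name, ying).
val(sex,  female).

% The rule from the text. Prolog stores it as a fact for if/2.
r1 if   age below 30 and name equals ying
        or not (sex equals male and age downto 40)
   then say hello(ying).

% Rules are data: collect the identifier of every stored rule.
rule_ids(Ids) :- findall(Id, Id if _ then _, Ids).

% ?- rule_ids(Ids).            % Ids = [r1]
% ?- r1 if _ then Action.      % Action = say hello(ying)
```

Nothing here runs the rule. That is the interpreter's job, described next.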

Prolog Interpreters

    % better forward chaining interpreter
    fChain :- reset, steps.

    reset :- retractall(used(_)).

    steps :- step, !, steps.
    steps. % terminate if no satisfied rule can be found

    step :-
        X if Y then Z,   % find a rule
        step1(X,Y,Z).    % try to use it.

    % what we have before in yuck/0: but buried away
    % beneath several layers of convenience.
    step1(Id,Condition,Action) :-
        not used(Id),Condition,
        print(firing(Id)), nl,
        assert(used(Id)), !,
        Action.

Does it produce the same result as the above? Well, heck yes:

    ?- fChain.
    firing(r1)
    hello(ying)

    Yes
    ?-

Stochastic abduction for What-if

A topic for another time.

Prolog Meta-Interpreters

Once the interpreter is written, we can use Prolog's meta-query facility to quickly build a compiler.

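    fastChain :- reset, runs.

    runs :- run, !, runs.
    runs. % terminate if no satisfied rule can be found

    % at load time, turn each rule into a clause of run/0
    % whose body is copied from the interpreter's step1/3
    term_expansion(X if Y then Z,(run :- Body)) :-
        clause(step1(X,Y,Z),Body).

    % unfold calls to user-defined predicates into their bodies
    goal_expansion(X, Y) :-
        not predicate_property(X,built_in),
        clause(X,Y).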

Note that:

This compiler is called once at load time, and never at runtime.
This compiler was built using bits and pieces from the interpreter. So, listen up: build the interpreter BEFORE the compiler.

Anyway, now anytime Prolog sees a rule like this...

r2 if age below 30 and name equals 'eliza' or not (sex equals male and age downto 40 ) then say hello(eliza).

...it automatically converts it into:

    ?- listing(run).

    run :-
        not used(r2),
        (   val(age, A),
            A<30,
            val(name, eliza)
        ;   not(( val(sex, male),
                  val(age, B),
                  B>=40
                ))
        ),
        print(firing(r2)),
        nl,
        assert(used(r2)),
        !,
        print(hello(eliza)),
        nl.

    Yes

Which, by the way, runs just fine:

    ?- fastChain.
    firing(r2)
    hello(eliza)

    Yes
    ?-
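
For readers who want to try the load-time trick in isolation, here is a cut-down sketch. It is not the code from dsl1.pl: the clause body is written out by hand instead of being fetched from step1/3 with clause/2, there is no goal_expansion step, and rule r3 is invented for the example. It assumes a Prolog with a term_expansion/2 hook (SWI-Prolog, say), that the file is loaded into the user module, and that the term_expansion/2 clause appears before the rules it should rewrite.

```prolog
:- op(999, xfx, if).
:- op(998, xfx, then).
:- op(700, fx,  say).

:- dynamic used/1.             % refraction memory, as in the text

say(X) :- print(X), nl.

% Load-time hook: every rule read from the file is replaced
% by a clause of run/0. Nothing is stored as if/2.
term_expansion((Id if Condition then Action),
               (run :- \+ used(Id),
                       Condition,
                       assertz(used(Id)),
                       Action)).

% Hypothetical rule, rewritten into a run/0 clause as it is loaded.
r3 if 1 < 2 then say hello(world).

% ?- run.                      % prints hello(world)
% ?- run.                      % fails: r3 has already been used
```

Calling listing(run) after loading shows the generated clause, which is a quick way to check what the "compiler" actually produced.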

Treatment learners

A topic for another time.

Not © Tim Menzies, 2001
Share and enjoy- information wants to be free.
But if you take anything from this site,
please credit tim@menzies.com.