EECE 571F= Domain-Specific Languages  
  
This is a page from yet
another great ECE, UBC subject.
[ Home | Assignments | Lectures |
DSL rules | Articles | Resources ]

Lectures (3)

Lectures:
Old: 1 | 2 | 3 | 4 | 5
New: DCGnums | Parsing | Meta-prolog | Faster1 | Faster2 | Rand-nums
Not done: Abduction | Stochastic abs

TCL- a procedural DSL builder

Writing DSLs is common. Can't we factor our the common bits, implement them once, then speed up our DSL generation?
  • Strong business case for such an approach. In "N" implementations are really just variations on 1 theme, then don't need to relearn all things from scratch for each new DSL.
  • This subject: common areas= logical.
  • Lex, Yacc: common areas= parsing text strings
  • TCL (tool command language): common areas= writing the right calls to lower-level procedures.
from Shyamalan Pather's notes on TCL:
  • Tcl was originally intended to be a reusable command language.
  • Its developers had been creating a number of interactive tools, each requiring its own command language.
  • Since they were more interested in the tools themselves than the command languages they would employ, these command languages were constructed quickly, without regard to proper design.
  • After implementing several such "quick-and-dirty" command languages and experiencing problems with each one, they decided to concentrate on implementing a general-purpose, robust command language that could easily be integrated into new applications.
  • Thus Tcl (Tool Command Language) was born.

    
        set month 2    
        set day 3    
        set year 97    
        set date "::"    
        puts 
    

    
    proc sum_proc {a b} {
        return [expr $a + $b]
    }
    
    proc magnitude {num} {
        if {$num > 0} {
            return $num
        } 
    
        set num [expr $num * (-1)]
        return $num
    }
    
    set num1 12
    set num2 14
    set sum [sum_proc $num1 $num2]
    
    puts "The sum is $sum"
    puts "The magnitude of 3 is [magnitude 3]"
    puts "The magnitude of -2 is [magnitude -2]"
    
          
    Output: 
    
    The sum is 26
    
    The magnitude of 3 is 3
    
    The magnitude of -2 is 2
    
    

    (puts displays the string)

  • The main difference between Tcl and languages such as C, is that Tcl is an interpreted rather than a compiled language.
  • Tcl programs are simply scripts consisting of Tcl commands that are processed by a Tcl interpreter at run time.
  • One advantage that this offers is that Tcl programs can themselves generate Tcl scripts that can be evaluated at a later time. This can be useful, for example, when creating a graphical user interface with a command button that needs to perform different actions at different times.

    
        set foo "puts hi"    
        eval $foo
    

  • Since that time, Tcl has been widely used as a scripting language.
  • In most cases, Tcl is used in combination with the Tk ("Tool Kit") library, a set of commands and procedures that make it relatively easy to program graphical user interfaces in Tcl.
  • One of Tcl's most useful features is its extensibility.
  • If an application requires some functionality not offered by standard Tcl, new Tcl commands can be implemented using the C language, a nd integrated fairly easily.
  • Since Tcl is so easy to extend, many people have written extension packages for some common tasks, and made these freely available on the internet.
TCL can be interpreted directly by its own tools (but that ain't the point).

TCL can be compiled into your source code.

TCL can call your program using a set of registered API.

TCL can be extended into any number of DSLs. From Tcl Developer Xchange: Extensions:


A Logic-Based DSL Methodology

What everyone else does:
  • Define a syntax
  • Get the syntax executing using:
    • Macros (e.g. the LISP approach)
    • Prototype an interpreter for that syntax using a high-level language; e.g. AWK, Perl, Python, ...
This subject:
  1. Structures
  2. BNF- = Partial BNF
  3. Define operators
  4. Static load
  5. Dependencies
  6. Interpreter
  7. Compiler
  8. Treatment learning

Structures

What are the main structures the "programmer" will manipulate in this DSL? The answer to this question is the art of DSLs. The good DSL designer seeks the smallest, simplest set.

For RULER:, there are:

  • Preference items.

    
    warnOnTypeClash == no.
    

  • Procedures (procs)

    
    say(X) means
        print(X),
        nl.
    

  • Types

    
    p = person(age,name,sex,status).
    

  • Rules

    
    r2
    if    sex equals male
          and age below 16
    then  status := inJail.   
    

BNF-

  • BNF= Backus-Naur Form
  • Named after its authors: John Backus and Peter Naur.
  • A syntactic description of a language.
  • It contains a set of re-write rules from a left-hand-side expressions to possible right-hand-side expressions.
  • Rules form a directed graph with a single root called "start", terminals (that have no rewrite rules) and non-terminals that aren't "start" or terminals.

E.g. expressions can be, among other things, identifiers or string constants. In BNF, this is written as follows:


expression ::== identifier | string;

("|" denotes "or"). A statement can be a print or an input statement:


statement ::==
   ("print" expression
   | "input" identifier)
   end_stmt.

end_stmt ::== ";".

Now, a program can be expressed as being a list of statements in the following way:


program :== vars types statements
statements ::== statement
           |   statement statements

Which says that a program contains one or more statements AFTER we define the vars and types.

So, the language we've defined in BNF includes the statements:


print a;
print "Hello";
input name;
    

Not legal is:


input "Hello";

because we've defined the input statement so it can only take an identifier, not a string constant.

A useful subset of all of BNF is BNF-:

W ::== X Y |Z
he structure W contains either the structures X followed by Y or the structure Z.
(X)*
Zero or more repeats of X.
(X)+
One or more repeats of X.
[X]
X is optional.
Terminals
Start with lower case, or are quoted.
X!
Something that satisfies a Prolog predicate X/1
E.g., for RULER:


  1. start ::==  CONFIG* PROC* TYPE* RULE+

Our system might have configuration information, procedures defined, types defined, but must have at least one rule.

Configuration information is very simple:


  2. CONFIG ::== SYMBOL "==" VALUE DOT
  3. DOT    ::== "."
  4. SYMBOL ::== atomic!
  5. VALUE  ::== atomic!

BNF- is a partial BNF language; i.e. when we don't explore some structure and just leave it to Prolog, we just mark it as a functor. Hence:


  6. PROC   ::== functor! "means" functor! DOT

Types aren't too hard either.


  7. TYPE       ::== SHORTHAND "=" TYPENAME
  8.                "(" ATTRIBUTE ATTRIBUTES ")" DOT
  9. TYPENAME   ::== atomic!
 10. ATTRIBUTE  ::== atomic!
 11. ATTRIBUTES ::= ("," ATTRIBUTE)+
 12. SHORTHAND  ::== char!            

 13. char(X) :- X @>= a, X @<= z.

Which just leaves rules:


 15. RULE  ::== ID if CONDITIONS then ACTIONS DOT
 16. ID    ::== atomic!
 17. CONDITIONS ::== CONDITIONS BINOP CONDITIONS
 18.             |   not CONDITIONS
 19.             |   CONDITION
 20. BINOP      ::== and | or
 21. CONDITION  ::== PROC
 22. ACTIONS    ::== ACTION and ACTIONS
 23.             |   ACTION
 24. ACTION     ::=  PROC

Define operators

From the SWI Prolog manual:
op(Precedence, Type, Name)
Declare Name to be an operator of type Type with precedence Precedence.

Name can also be a list of names, in which case all elements of the list are declared to be identical operators.

Precedence is an integer between 0 and 1200. Precedence 0 removes the declaration.

Type is one of: xf, yf, xfx, xfy, yfx, yfy, fy or fx. The `f' indicates the position of the functor, while x and y indicate the position of the arguments. `y' should be interpreted as ``on this position a term with precedence lower or equal to the precedence of the functor should occur''. For `x' the precedence of the argument must be strictly lower. The precedence of a term is 0, unless its principal functor is an operator, in which case the precedence is the precedence of this operator. A term enclosed in brackets (...) has precedence 0.

The predefined operators are shown in the following table. Note that all operators can be redefined by the user.


     ______________________________________________________________
     | 1200 |xfx  |-->, :-                                        |
     | 1200 | fx  |:-, ?-                                         |
     | 1150 | fx  |dynamic, multifile, module_transparent, discon-|
     |      |     |tiguous, volatile, initialization              |
     | 1100 |xfy  |;,                                             |
     | 1050 |xfy  |->                                             |
     | 1000 |xfy  |,                                              |
     |  954 |xfy  |\                                              |
     |  900 | fy  |\+                                             |
     |  900 | fx  |~                                              |
     |  700 |xfx  |<, =, =.., =@=,  =:=, =<, ==, =\=,  >, >=, @<, |
     |      |     |@=<, @>, @>=, \=, \==, is                      |
     |  600 |xfy  |:                                              |
     |  500 | yfx |+, -, /\, \/, xor                              |
     |  500 | fx  |+, -, ?, \                                    |
     |  400 | yfx |*, /, //, <<, >>, mod, rem                    |
     |  200 |xfx  |**                                            |
     |__200_|xfy__|^_____________________________________________|

Operator X is inside Operator Y if:
  • A BNF rule containing Y can be rewritten to a rule containing X.
  • AND only one DOT is found in the parse; i.e. all those rules are required to parse one term.

Operators inside another operator should have a lower precedence. For example:

  • and is inside if
  • means is not inside if since it is asserted as a separate Prolog term.
  • if and then seem to be outside of most things, so we'll give them a high precedence:

    
    :- op(999, xfx, if).  
    :- op(998, xfx, then).
    

    We use 999 as our top precedence since, it any higher, it would be inconvenient to reference X if Y then Z inside Prolog code (we'd need to add brackets to avoid our operators dominating "," which has a precedence of 1000).

  • Operators that can be inside themselves need an associativity of 'y' (e.g. and, or
  • Binary procedures can be easily handled via making their functor an operator. For example, after defining:

    
    :- op(700, xfx, in). 
    :- op(1,   xfx, to ).
    

    then we can call the procedure

    
    in(A, to(B,C)) means
        val(A,X),
        X >= B,
        X < C. 
    

    as follows:

    
    A in B to C
    

  • Binary operators like "and, or" must have a type of "xfx", "xfy", "yfx", or "yfy".
  • Prefix unary operators like "not" must have a type of "fx", "fy".
  • Postfix unary operators like "++" must have a type of "xf", "yf".
A common set of operator definitions is:


:- op(997, xfy, or).
:- op(996, xfy, and). 
:- op(995,  fy, not).

i.e. or is seen by the compiler before and and not. This definition supports the correct parsing of


a and b or c and not d

which is


or -- and -- a
|      |
|      |---- b
|
|--- and --- c
       |
       | --- not -- d

All of RULER's operators are:


:- op(1200, xfx, means).
:- op(999, xfx, if).  
:- op(998, xfx, then). 
:- op(997, xfy, or).
:- op(996, xfy, and). 
:- op(995,  fy, not).
:- op(700,  fx, say).
:- op(700, xfx, [upto,below,equals,over,downto,in, := ]). 
:- op(1,   xfx, to ).

Static load

To check if your ops work, code them up in Prolog, and see if your structures load without error.

Dependencies

Another test of your DSL language is whether or not you can explore all your structures and build a network linking them all.

Interpreter, Compiler,Treatment learning

Next lecture.

Some Semi-random thoughts

The DSL mindeset

Traditional software development:
  • Write a solution to problem X

DSL-based software development:

  • Build reuse into the lifecycle.
  • Write a language that lets you solve problems like X1
  • Test that language on X1
  • Reuse that language for X2,X3,X4,...
Can a working DSL for X be delivered in the same time as building a one-off solution to X?

DSL paradigms

Others folks-
  • DSL are a shorthand for macro calls to an imperative langauge.
  • E.g. TCL- the super procedural DSL (see below).

This subject-

  • DSLs implemented using a logic-based approach.
  • Interested in the cognitive and social implications of language choice.
  • Cognitive: some things are faster to say in language A than language B.
  • Social: control the language, control the community. E.g.
    • SUN vs Microsoft- JAVA vs C#
    • Within an organization- standards, hiring practices, methodologies are often language based.

      Change the language, exclude a community of expertise from a project.

      E.g. your coding standards for C++ don't apply to my DSL.

By the way, how to defend your DSL against the invasion of the standards committees:
  • Standard practice is to collect internal product attributes (e.g. SLOC, variables per method,
  • After this course, you will be able to collect external quality attributes - what decisions within a design effect quality goals?.
Exact connection from internal to external attributes open to debate:
  • General methods exist for determining that relationship for different domains- see http://www.ece.ubc.ca/~elec415/lect4.html and look for "Experiments with Internal/External Attributes".
  • But I think we can do better than those methods.

What is not a DSL?

By the way, what is not DSL? So, as you can see, our DSLs are part of a broader field. We restrict our scope only to make for 1 subject, not 5.

DSLs: an historical perspective

Over time, the following solutions have been tried to increase productivity:
  • Sub-routine libraries
  • Object-oriented frameworks and component frameworks:
  • High-level languages

Subroutine libraries:

  • Contain subroutines that perform related tasks in well-defined domains
  • e.g. differential equations, graphics, user-interfaces and databases.
  • Subroutine library = classical method for packaging reusable domain-knowledge.

Object-oriented frameworks and component frameworks:

  • Continue the idea of subroutine libraries.
  • Classical libraries have a flat structure, and the application invokes the library.
  • If invocations follow some standard pattern, then we can add that pattern to the library.
  • Framework is in controller.
  • Invokes methods provided by the application-specific code.

DSL-as-OO offers a natural bridge to current practice:

  • Collect scenarios from the domain.
  • Identify the nouns and verbs.
  • Cull and extend using noun analysis:
    • Cull the silly ones
      • E.G. there will be no data structure called "GUI".
    • Add the necessary ones
      • E.G. all the stuff programmers know are needed to run the show, but never appear in user-speak; e.g. "Database", "Matrix", "Tree".
    • For more on this process, see "Noun Analysis" in http://www.ece.ubc.ca/~elec415/lect2.html
  • Code up nouns are classes and verbs as methods.
  • Code up common patterns of usage (perhaps as separate classes or as class methods)
    • E.g. "Prompter" is a class that collects together a one-line text editor, buttons for ok,cancel,browse, help,close; drop-down menu list, scroll-bars; etc
    • But all that is buried away. We might just call it as:

      
      s := Executer prompt:
              'Type the name of the program, folder, document, or
           Internet resource, and Windows will open it for you. 
            ' default: Executer history.  
      

  • Concept of the spider-web:
    • Classes are libraries of services scattered around some repository.
    • Application is the smallest spider web that connects these services.
    • DSLs are the "head" built on the "shoulders" and "body" of some library of services.
    • Easy to quickly build and change.
    • Reality check: harder than it sounds.
Take that approach 100 steps further on:
  • Offer the "programmer"a language where all they see are their native idioms.
  • Remove from the language stuff the "programmer" will never use.
  • A domain-specific language (DSL) is a small, usually declarative, language that offers expressive power focused on a particular problem domain.
  • In many cases, DSL programs are translated to calls to a common subroutine library and the DSL can be viewed as a means to hide the details of that library.

Not © Tim Menzies, 2001
Share and enjoy- information wants to be free.
But if you take anything from this site,
please credit tim@menzies.com.