home
table of contents
Computer Science
January 2003
email

Yet Another Scripting Language - Part 4

This article is a continuation of the discussion of "yet another scripting language", a scripting language that I am proposing to create. The ongoing presentation is broken up into sections, each section discussing some aspect of the proposed language.

The articles are not definitive specifications; rather, they are working documents exploring possibilities.

There are three prior threads in comp.lang.misc that are devoted to this projected language. The thread titles are:

Yet Another Scripting Language - Syntax thoughts
Yet another scripting language - arithmetic
Yet another scripting language - formalizing flow control
There are two web pages that summarize my thoughts on said topics at the end of said threads. These pages are:
http://richardhartersworld.com/cri/2001/newlang01.html
http://richardhartersworld.com/cri/2001/newlang02.html
There are a number of changes from the earlier pages. In particular the flow control constructs have been simplified, and the "morph" and the "iterator" concepts as presented here are more mature.

The current interim name for this language is YASL (Yet Another Scripting Language - pronounced yazzle). Suggestions for a better name are solicited.

4.1 Some matters of syntax

I followed Joachim Durchholz's implied request that braces {} not be prominent (because of keyboard variations) and have switched the roles of <> and {}. Thus

<> delimits invocations, e.g. <cos x>
[] delimits iterators/indices, e.g., [0...n]
() delimits grouping, e.g., a * (b + c)
{} delimits text substitution, e.g., homepage_{name}

I have also replaced <- with := in assignment statements.

As a general note I have tried to avoid overloading special characters. Thus, parentheses are only used for grouping and braces are only used for text substitution.

4.2 Infix operators

Functions with two arguments which yield a boolean (true/false) value can be converted to infix operators by adding a terminating question mark. (This is a change from the previous posting.) For example:

  
  x lt? y   is an expression with value true or false
Functions used as infix operators include:

The numeric comparison functions:

        lt      arg 1 numerically less than arg 2
        gt      arg 1 numerically greater than arg 2
        le      arg 1 numerically less than or equal to arg 2
        ge      arg 1 numerically greater than or equal to arg 2
        eq      arg 1 numerically equal to arg 2
        ne      arg 1 numerically unequal to arg 2
The lexical comparison functions:
        slt     arg 1 lexically less than arg 2
        sgt     arg 1 lexically greater than arg 2
        sle     arg 1 lexically less than or equal to arg 2
        sge     arg 1 lexically greater than or equal to arg 2
        seq     arg 1 lexically equal to arg 2
        sne     arg 1 lexically unequal to arg 2
The membership functions:
        in      arg 1 is a value in the range of iterator arg 2      
        nin     arg 1 is not a value in the range of iterator arg 2
White space around infix operators is obligatory.

The precedence order for infix operators is * / - + ? where ? denotes operators formed by affixing a question mark. This is not cast in stone; I may change to * / having equal precedence and + - having equal precedence.

4.3 Syntax of conditionals

=> is the condition/action signifier. Thus we have

        x lt? y => <foo>
        x gt? y => <bar>
        else    => <bazfaz>
This is the equivalent of
        if   (x < y) foo()
        elif (x > y) bar()
        else         bazfaz()
i.e., in a sequence of condition/action clauses the action corresponding to the first satisfied condition is executed. "Else" is the default case (always true); it automatically terminates a condition/action sequence.

The select construct consists of

        select (one or two arguments)
                sequence of condition/action clauses
                end select
In the conditionals the first (or first two) argument(s) are omitted and are understood to be filled by the argument(s) for the select statement. Thus the example above could be written
        select x y
                lt?  => <foo>
                gt?  => <bar>
                else => <bazfaz>
                end select
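
For readers more at home in a conventional language, the first-match rule for condition/action clauses can be sketched in Python. This is only an illustration of the semantics described above, not part of YASL; the helper name `select` and the string results standing in for <foo>, <bar>, and <bazfaz> are assumptions made for the example.

```python
import operator

def select(x, y, clauses, default=None):
    """Hypothetical helper: run the action of the first clause whose
    two-argument condition holds; otherwise run the default (the "else")."""
    for predicate, action in clauses:
        if predicate(x, y):
            return action()
    return default() if default is not None else None

def classify(x, y):
    # Mirrors:  select x y
    #               lt?  => <foo>
    #               gt?  => <bar>
    #               else => <bazfaz>
    return select(
        x, y,
        [(operator.lt, lambda: "foo"),
         (operator.gt, lambda: "bar")],
        default=lambda: "bazfaz",
    )
```

Only one action ever runs per call: the clauses are tried in order and the search stops at the first satisfied condition.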

4.4 The morph concept

"Morph" is a neologism, a term I've invented to describe an approach I haven't seen elsewhere. A morph is an entity tied to an identifier that can have any of several types (and values) depending on the context. These different context dependent values are called aspects. As an illustration let foo be a morph. Then

  print foo             // Prints the string aspect of foo
  foo := x + y          // The number aspect of foo is set to x+y
  x   := <foo y>        // The function aspect of foo is invoked
All morphs are guaranteed to have a default value aspect and an executable function aspect. Aspects need not have any particular relationship to each other except for the following rule:
When the function aspect is invoked the default value aspect is set to the return value of the function aspect. The execution of the function aspect updates the entire state of the morph.
In addition to aspects a morph can also have attributes. These attributes are in turn morphs; they are designated by qualified names separated by periods. Since the attributes are morphs they in turn can have attributes that are morphs. Here is an example:
  circle                // Identifier for a morph representing
                        // a circle.
  circle.radius         // Attribute of circle - its radius                        
  circle.center         // Attribute of circle - its center
  circle.center.x       // The center's x coordinate
  circle.center.y       // The center's y coordinate
Although morphs can be thought of as executable objects with state they are not objects as the term is generally understood in the object oriented programming paradigm. In particular:

(a) Morphs are necessarily bound to identifiers; each identifier is tied to a unique morph and vice versa. An object, on the other hand, may have more than one identifier pointing to it.

(b) Objects themselves do not (usually) have values in their own right nor are they functions in their own right. Morphs always do and always are.

(c) Objects (in most paradigms) have a fixed structure, being instances of defined classes (templates being a partial exception). Morphs do not have a fixed structure; they are protean; their structure can be modified at any time.

Special syntax is used to specify a morph aspect as such. Thus in YASL identifiers may not begin with the $ character. The fixed aspects may be referenced with an "attribute" that does begin with a $ character. Thus if foo is a morph then

        foo.$s  is the current value of the string aspect
        foo.$r  is the current value of the number aspect
        foo.$b  is the current value of the boolean aspect
        foo.$f  is the current value of the function aspect
In YASL (although not necessarily in other languages using the morph concept, if there should ever be such) all values except functions are strings. That is, the boolean aspect is either the string "true" or the string "false". Likewise, the numeric aspect is a string that represents a valid number. When either the numeric aspect or the boolean aspect is set, the string aspect is set to the newly assigned value.

Note: Naturally an implementation will only make these conversions on an as needed basis.
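
The relationship between the aspects can be modelled roughly in Python. This is a toy sketch under stated assumptions, not a YASL implementation: the class name `Morph`, the use of Python's `str` and `float` for the string/number conversions, and the treatment of the string aspect as the default value aspect are all choices made for this illustration.

```python
class Morph:
    """Toy model of a morph: one stored string plus a function aspect."""

    def __init__(self, value="", func=None):
        self.s = str(value)                   # string aspect ($s), the only stored value
        self.f = func or (lambda *args: "")   # function aspect ($f)

    @property
    def r(self):
        # Number aspect ($r): the string read as a number.
        # Raises ValueError if the string is not a valid number.
        return float(self.s)

    @r.setter
    def r(self, number):
        self.s = str(number)                  # setting $r rewrites the string aspect

    @property
    def b(self):
        # Boolean aspect ($b): the string "true" or "false".
        return self.s == "true"

    @b.setter
    def b(self, flag):
        self.s = "true" if flag else "false"

    def __call__(self, *args):
        # Invoking the function aspect sets the value to the return value.
        result = self.f(*args)
        self.s = str(result)
        return result
```

In this model `foo.r = 3 + 4` leaves the string "7" in `foo.s`, and calling `foo(5)` with a doubling function leaves "10" there, which mirrors the rule that invoking the function aspect updates the default value.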

4.5 Assignment

YASL has two forms of assignment - assignment of value and morph copying. The assignment of value lexeme is := ; it sets a specific aspect. Here are some examples:

        x := a + b      // The number aspect of x is set to a+b
        x := foo.$f     // The function aspect of x is set to the
                        // function aspect of foo.
The right arrow is a specialized convenience. It is used when we want to assign a value to a list of identifiers. For example:
        0 -> a, b, c.x, c.y
sets the value (number aspect) of each of a, b, c.x, and c.y to 0. The right arrow is also used in scatter assignment of lists; see the section below on iterators.

The equals sign (=) along with the "set" verb is used for morph copying. For example:

        set     x = y   // x is now an exact copy of y
        set foo.a = bar // foo.a is now an exact copy of bar
Morph copying can be used to emulate inheritance (or implement it if that is your taste in language). Thus we could do something like this: We define class_1, class_2, etc. as templates for objects. Morph copying serves as a "new" operator. Substructures can be "inherited" by one class from another by copying substructures. For example:
        set class_1.print = class_2.print  // inheritance
        set object_1      = class_1        // instantiation
This capability does not mean that YASL is an OOL; the emulations are not constrained nor are they obligatory. It does mean, however, that some concepts from OOP may be convenient to use.
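
The emulation can be sketched in Python with a deep copy standing in for the "set" verb. This is a hedged illustration only: the `Morph` class, its `attrs` dictionary, the `set_copy` helper, and the `radius` attribute are hypothetical, and the sketch assumes that "exact copy" means an independent copy that shares nothing with the original.

```python
import copy

class Morph:
    """Toy morph: a value plus a dictionary of named attribute morphs."""

    def __init__(self, value=""):
        self.value = value
        self.attrs = {}

def set_copy(source):
    """Stand-in for 'set x = y': return an independent exact copy of source."""
    return copy.deepcopy(source)

# Two "class" templates.
class_2 = Morph()
class_2.attrs["print"] = Morph("print via class_2")

class_1 = Morph()
class_1.attrs["radius"] = Morph("1")

# set class_1.print = class_2.print   (inheritance by copying a substructure)
class_1.attrs["print"] = set_copy(class_2.attrs["print"])

# set object_1 = class_1              (instantiation by copying the template)
object_1 = set_copy(class_1)

# Changing the instance leaves the template untouched.
object_1.attrs["radius"].value = "5"
```

Because every copy is independent, later changes to class_2 do not propagate to class_1 or object_1; that is one way in which this emulation differs from inheritance in most object oriented languages.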

4.6 Iterators

In this context an iterator is a mini-program in the form of an expression that generates a well defined, finite sequence of values. Syntactically, iterators are delimited by a pair of brackets []. As defined here an iterator has a left part (optional) and a right part, separated by a colon.

The right part is a sequence specification, which may be the composition of any of the following:

        an enumerated list, e.g.        x, y, z
        a numeric sequence, e.g.        m...n
        a stepped numeric sequence      m..2..n
        a table index expression        foo[]
        a table fields expression       bar.[]
(For table indices and field expressions see section 4.7 below, the structure of variables.) For example, the following is a sequence specification:
        1..2..n, x, y, foo[], foo.[]
The left part is a series of transformation operators, separated by semicolons. Transformation operators apply a transformation to each element of the sequence specified in the right part. For example $fun is an operator with one argument, the name of a function. The named function is applied to each element in the sequence specification to produce the final sequence. For example, suppose that sq is a function that squares its argument. Then the iterator
        [$fun sq: 1...5]
generates the sequence 1, 4, 9, 16, 25. It should be understood that an iterator does not represent a list as such; it is an expression that will generate a sequence upon demand. How that is done is entirely up to the internal implementation.

Iterator expressions are not bound to variables; brackets tied to an identifier signify indexing.
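
The generate-on-demand behaviour is close to that of a Python generator, which gives a convenient way to sketch the idea. The names `numeric` and `iterator` below are hypothetical, and two details are assumptions rather than anything stated above: that the stepped form m..2..n includes n when the step lands on it, and that several transformation operators are applied in left-to-right order.

```python
from itertools import chain

def numeric(m, n, step=1):
    """m...n or m..step..n, with both end points included (assumed for the stepped form)."""
    return range(m, n + 1, step)

def iterator(spec_parts, transforms=()):
    """Lazily yield each element of the composed sequence specification,
    passing it through every transformation in turn."""
    for value in chain.from_iterable(spec_parts):
        for fn in transforms:
            value = fn(value)
        yield value

def sq(x):
    return x * x

# Roughly [$fun sq: 1...5]; nothing is computed until the values are requested.
squares = iterator([numeric(1, 5)], [sq])
```

Calling `list(squares)` forces the sequence and yields 1, 4, 9, 16, 25, matching the example above.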

4.7 The structure of variables

In YASL an identifier always represents a table of variables (morphs). The columns of this table are the sub-fields of the identifier (variable). In addition to the named columns there is one anonymous column; it holds the indexed principal morphs. The rows are indexed numerically; row zero holds the names of the sub-fields (conventionally the anonymous column has name $v); rows 1 to n contain the n instances of the sub-fields. The table is guaranteed to have at least one column (the $v column) and one row, 0. The special fields, $nrow and $ncol, hold the number of rows (not counting row 0) and the number of columns.

References into the table are specializations of the form variable_name[index].[field_name]. Here are the possibilities for table foobar:

foobar                  refers to foobar[1].$v
foobar.$v               refers to foobar[1].$v
foobar.a                refers to foobar[1].a
foobar[i]               refers to foobar[i].$v
foobar[i].a             refers to foobar[i].a
foobar[]                is an iterator representing col $v
foobar[].a              is an iterator representing col 'a'
foobar.[]               is an iterator rep. all columns except $v
foobar[i].[]            is an iterator holding values from row i

foobar[].[]             is syntactically illegal
Tables are resized dynamically. Thus in the following code
        x[2] := poodle
the table for x is expanded to have a row 2 if it does not have one already. When the iterator expressions are on the left hand side of an assignment the table is resized to reflect the new dimensions. For example
        foobar[] := [a, b, c]
simultaneously sets the $v column to be a, b, and c, and sets the number of (data containing) rows to be 3.
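
A rough Python model of the table may help make the resizing rules concrete. It is a sketch under assumptions, not a description of any implementation: the class name `Table`, its method names, and the choice of the empty string for cells created by a resize are all hypothetical.

```python
class Table:
    """Toy model of a YASL variable: named columns over rows 1..n.
    Row 0 (the column names) is represented by the dictionary keys."""

    def __init__(self):
        self.columns = {"$v": []}          # the anonymous column always exists

    @property
    def nrow(self):                        # $nrow: data rows, not counting row 0
        return len(self.columns["$v"])

    def get(self, row=1, field="$v"):
        # foobar is foobar[1].$v, foobar[i] is foobar[i].$v, and so on.
        return self.columns[field][row - 1]

    def set(self, row, value, field="$v"):
        # x[2] := poodle grows the table to two rows if necessary.
        column = self.columns.setdefault(field, [""] * self.nrow)
        if row > self.nrow:
            for cells in self.columns.values():
                cells.extend([""] * (row - len(cells)))
        column[row - 1] = value

    def set_column(self, values, field="$v"):
        # foobar[] := [a, b, c] replaces the column and fixes the row count.
        values = list(values)
        for cells in self.columns.values():
            del cells[len(values):]
            cells.extend([""] * (len(values) - len(cells)))
        self.columns[field] = values
```

With this model, `x.set(2, "poodle")` on an empty table leaves two rows, and `x.set_column(["a", "b", "c"])` afterwards leaves three.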

4.8 Sample code - qsort

To give a better sense of the flavor of the language, here is some sample code, an implementation of quicksort. The line numbers are there for annotation.

Note: This is not the "real" quicksort because it does not alter the array in place; rather it generates two sub-lists that are sorted in turn.

01   function qsort vec
02       vec.$nrow le? 0 => $[] := []
03       else            => begin
04           pivot := vec[1]
05           loop val = vec[] ; select val pivot
06               lt?   => small[] := [small[], val]
07               gt?   => big[]   := [big[],   val]
08               else  => same[]  := [same[],  val]
09               end loop
10           $[] := [<qsort small>[], same[], <qsort big>[]]
11           end
12       end qsort
01: Functions, procedures, coroutines, and executable files are all the same in that the same data passing conventions are used and that they are all invokable in the same way. Arguments are passed call by copy.

02: The terminal point of the recursion is when the array to be sorted is empty. The code is divided into two cases, the terminal case being handled in line 02 and the recursion being handled in lines 03-11.

Line 02 is a condition/action pair on a single line. The condition is that the array to be sorted (vec) is empty, i.e., its length is zero. The condition could have been written <le vec.$nrow 0> instead. The action takes a bit of explanation.

Each occurrence of qsort in the code is a separate morph. In the calling function the morph qsort has a state that is altered when it is called. The code in qsort is responsible for ensuring that the altered state of the calling morph is as it should be.

The issue here is that the sorted vector must be returned as part of the structure of the calling qsort morph, i.e., in the calling routine qsort[] will hold the sorted array. The question then is how to refer to the function state within the function body. The tactic used is to use $ to stand for the morph. The action clause sets the anonymous column to be empty. An alternative would be to set the number of rows to be 0, i.e., $.$nrow := 0.

03: "Else" is a reserved word having the value of true. Note that indentation is strictly enforced. In YASL indentation must agree with scope delimiters.

04: Tables and arrays use 1 based indexing for the actual content.

05: A loop statement may have more than one clause. The 'for' clause has the form: var = iterator; the var ranges sequentially over the values produced by the iterator. The 'select' clause asserts that the body of the loop is entirely occupied by the select block. This is a general rule, i.e.,

        loop clause_1; ... clause_n
                body
                end loop
is equivalent to
        loop clause_1
                loop/select clause_2
                        ...
                        body
                        end loop
                end loop
06-08: The array is sorted into three sub-arrays called small, same, and big. The idiom, foo[] := [foo[], x], says that x is appended to foo[].

09: The end statement is indented to the body of the block. Upon reflection I opine that that will be a fixed rule - it eliminates the "indentation style wars" before they happen.

10: The expression, <foo ...>[], refers to foo[] after the invocation is complete. This is a general rule; the function expression is syntactically equivalent to the morph. As in line 02, $[] is the altered content of qsort in the calling routine.
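
For comparison, the same three-way partitioning scheme can be written in Python. This is a loose translation for illustration rather than a statement of how YASL would execute the code: the result is returned as an ordinary list instead of being stored in the state of the calling morph, and Python's 0-based indexing replaces vec[1].

```python
def qsort(vec):
    """Sort by partitioning around the first element; not in place."""
    if len(vec) <= 0:                # line 02: an empty array is the terminal case
        return []
    pivot = vec[0]                   # line 04: vec[1] in YASL's 1-based indexing
    small, same, big = [], [], []
    for val in vec:                  # lines 05-09: split into three sub-lists
        if val < pivot:
            small.append(val)
        elif val > pivot:
            big.append(val)
        else:
            same.append(val)
    # line 10: sorted small values, then the pivot's equals, then sorted big values
    return qsort(small) + same + qsort(big)
```

The recursion terminates because the pivot always lands in the "same" list, so "small" and "big" are each strictly shorter than the input.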


This page was created January 21, 2003. This page was last updated July 13, 2003.
