table of contents
Computer Science
September 2001

Thoughts on (yet another) scripting language

This is the first article in a projected series consisting of design notes for a scripting language tentatively entitled RSL (Richard's Scripting Language). The general theory is that it will be a language that I want to program in; however, it will be put into the public domain. Comments and suggestions are welcome, of course.

The format that I will use is a series of numbered notes covering a variety of topics and considerations. As a general remark, I have a certain amount of experience in using and creating scripting languages. The discussion will cover both implementation considerations and language design. There will be references to a language that I developed called Lakota (this language is proprietary and I cannot release it; however, I am drawing on lessons learned from usage of that language). Some notes will be ideas I am playing with.

Section I - General requirements

This section has observations about the general requirements for the interpreter considered as a program.

1.1 Invocation

The interpreter will be invokable in three separate modes: (a) by executing files using the standard hack, (b) as an interactive shell, and (c) as a callable routine within another program. When called as a routine it can be passed a string that is executed as program text.
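
A minimal sketch of how the three modes might share one evaluator. Every name here (rsl_eval_string, rsl_run_file, rsl_shell) is hypothetical, the evaluator is a stub that only counts non-blank lines, and skipping a leading "#!" line is my assumption about what the "standard hack" means.

```c
#include <stdio.h>
#include <string.h>

/* Mode (c): execute a string passed in by a host program.
   Hypothetical stub: a real evaluator would parse and run the text;
   this one only counts the non-blank lines it was given. */
int rsl_eval_string(const char *text)
{
    int count = 0, blank = 1;
    for (; *text != '\0'; text++) {
        if (*text == '\n') {
            if (!blank) count++;
            blank = 1;
        } else if (*text != ' ' && *text != '\t' && *text != '\r') {
            blank = 0;
        }
    }
    if (!blank) count++;          /* last line without a newline */
    return count;
}

/* Mode (a): execute a script file line by line. A first line starting
   with "#!" is skipped (assumption). Lines are assumed to be shorter
   than the buffer. Returns -1 if the file cannot be opened. */
int rsl_run_file(const char *path)
{
    char line[1024];
    int first = 1, total = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL) return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (first && strncmp(line, "#!", 2) == 0) { first = 0; continue; }
        first = 0;
        total += rsl_eval_string(line);
    }
    fclose(fp);
    return total;
}

/* Mode (b): interactive shell: prompt, read a line, execute it,
   until end of input. */
int rsl_shell(FILE *in, FILE *out)
{
    char line[1024];
    int total = 0;
    for (;;) {
        fputs("rsl> ", out);
        fflush(out);
        if (fgets(line, sizeof line, in) == NULL) break;
        total += rsl_eval_string(line);
    }
    return total;
}
```

A main program would pick rsl_run_file when given a file name and rsl_shell(stdin, stdout) otherwise; a host program would call rsl_eval_string directly.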

1.2 History & editing

In interactive shell mode there should be a simple editor available. It should be possible to edit the prior input text or to edit a block of new text. The editor should be either a simplified vi or a simplified emacs or both (other suggestions?). I definitely don't want to get into the fancy editor game. The kinds of things that should be editable include:

(a) files
(b) labelled blocks of text
(c) an unlabelled block
(d) resident procedures and functions
(e) current history (pulled in as a block)
(f) the current command line (and prior command lines, using the usual mechanisms)

File editing naturally can be done using the editor of one's choice. Likewise command line editing can be done using a standard package. The issue is whether there should be an embedded simple editor. A more detailed discussion is given below.

1.3 Threading

It should be possible to create interpreter threads within a process. The threads should be able to suspend themselves and be resumed by the outside process.
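
One way to get this in portable C, sketched under my own assumptions: keep all of a thread's state in a structure, so that "suspend" is just returning to the caller and "resume" is calling back in. The names (rsl_thread, rsl_thread_resume) and the two toy op codes are hypothetical; these are cooperative interpreter threads, not OS threads.

```c
#include <stddef.h>

typedef enum { RSL_READY, RSL_SUSPENDED, RSL_DONE } rsl_status;

/* Toy instruction set standing in for real interpreter work. */
enum { OP_INC = 1, OP_YIELD = 2 };

typedef struct {
    const int *script;   /* the thread's program (toy op codes)      */
    size_t     length;   /* number of op codes in script             */
    size_t     pc;       /* saved position: where resume continues   */
    long       acc;      /* per-thread data, private to this thread  */
    rsl_status status;
} rsl_thread;

void rsl_thread_init(rsl_thread *t, const int *script, size_t length)
{
    t->script = script;
    t->length = length;
    t->pc     = 0;
    t->acc    = 0;
    t->status = RSL_READY;
}

/* Run the thread until it suspends itself (OP_YIELD) or finishes.
   The outside program decides when, or whether, to call this again. */
rsl_status rsl_thread_resume(rsl_thread *t)
{
    if (t->status == RSL_DONE) return RSL_DONE;
    while (t->pc < t->length) {
        int op = t->script[t->pc++];
        if (op == OP_INC) {
            t->acc++;
        } else if (op == OP_YIELD) {
            t->status = RSL_SUSPENDED;   /* state stays in *t */
            return t->status;
        }
    }
    t->status = RSL_DONE;
    return t->status;
}
```

Because nothing lives on the C stack between calls, any number of these threads can coexist in one process and be resumed in any order.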

1.4 Platforms

At a minimum it should run on M$ Windows and Linux. FreeBSD should also be supported. The implementation will include code for other platforms - Unix variants, VMS, and MVS (with POSIX) - since I know how to do this.

1.5 Implementation language

The source code will be in portable C. Machine and OS dependencies will be isolated and will be matrix factored.

1.6 Library

The interpreter will consist of a main program to serve as a shell and a library which can be linked into C programs. Note: other languages may be supported depending on OS and language.

1.7 Extensions

The interpreter will be extensible. There are several possibilities here. First of all there are syntax extensions, which alter the language syntax. This is an open and unexamined can of worms at the moment. Then there are full functional extensions, which add new functions; presumably these involve a boilerplate interface. Finally there are raw calls into C. This will require some minimal interface.
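
As an illustration of what a boilerplate interface for functional extensions might look like, here is a sketch of a name-to-function registry. The names (rsl_register, rsl_call, rsl_native_fn), the fixed table size, and the choice of doubles as the only argument type are all assumptions made for brevity.

```c
#include <string.h>

/* Signature every extension function must have (hypothetical). */
typedef double (*rsl_native_fn)(int argc, const double *argv);

#define RSL_MAX_EXT 64

typedef struct {
    const char   *name;   /* caller must keep this string alive */
    rsl_native_fn fn;
} rsl_ext_entry;

static rsl_ext_entry rsl_ext_table[RSL_MAX_EXT];
static int rsl_ext_count = 0;

/* Register fn under name; re-registering a name replaces the old
   function. Returns 0 on success, -1 on bad input or a full table. */
int rsl_register(const char *name, rsl_native_fn fn)
{
    int i;
    if (name == NULL || fn == NULL) return -1;
    for (i = 0; i < rsl_ext_count; i++) {
        if (strcmp(rsl_ext_table[i].name, name) == 0) {
            rsl_ext_table[i].fn = fn;
            return 0;
        }
    }
    if (rsl_ext_count >= RSL_MAX_EXT) return -1;
    rsl_ext_table[rsl_ext_count].name = name;
    rsl_ext_table[rsl_ext_count].fn   = fn;
    rsl_ext_count++;
    return 0;
}

/* Look up name and call it. Returns 0 and stores the value in *result,
   or -1 if no such extension is registered. */
int rsl_call(const char *name, int argc, const double *argv, double *result)
{
    int i;
    for (i = 0; i < rsl_ext_count; i++) {
        if (strcmp(rsl_ext_table[i].name, name) == 0) {
            *result = rsl_ext_table[i].fn(argc, argv);
            return 0;
        }
    }
    return -1;
}

/* Example extension: sum of its arguments. */
double ext_sum(int argc, const double *argv)
{
    double total = 0.0;
    int i;
    for (i = 0; i < argc; i++) total += argv[i];
    return total;
}
```

An extension author would call rsl_register("sum", ext_sum) once at start-up; the interpreter would go through rsl_call whenever a script names a function it does not know natively.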

1.8 Compilation and byte codes

My thought here is that as the interpreter parses input it "compiles" it into byte code (virtual machine code) and that byte code files can be executed by the interpreter. There are a number of issues here, e.g., selection of the byte code language. In a sense this is a separate language design issue. Another thought here is that the engine to process the byte code should be separate from the rest of the interpreter, i.e., the byte code language and the byte code engine can be changed without impacting the rest of the interpreter.
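
To make the separation concrete, here is a sketch of a byte code engine that knows nothing about the parser: it sees only an array of integers. The four op codes and the name vm_run are invented for illustration and are not a proposal for the actual byte code language.

```c
#include <stddef.h>

/* Hypothetical byte code: BC_PUSH is followed by one operand. */
enum { BC_PUSH, BC_ADD, BC_MUL, BC_HALT };

#define VM_STACK_MAX 64

/* Execute code[0..len-1] on a small operand stack.
   Returns 0 and stores the final value in *result on success,
   -1 on malformed code (stack underflow/overflow, unknown op code,
   missing operand, or no BC_HALT). */
int vm_run(const int *code, size_t len, long *result)
{
    long stack[VM_STACK_MAX];
    size_t sp = 0;   /* number of values on the stack */
    size_t pc = 0;   /* index of the next op code     */

    while (pc < len) {
        int op = code[pc++];
        switch (op) {
        case BC_PUSH:
            if (pc >= len || sp >= VM_STACK_MAX) return -1;
            stack[sp++] = code[pc++];
            break;
        case BC_ADD:
        case BC_MUL: {
            long a, b;
            if (sp < 2) return -1;
            b = stack[--sp];
            a = stack[--sp];
            stack[sp++] = (op == BC_ADD) ? a + b : a * b;
            break;
        }
        case BC_HALT:
            if (sp != 1) return -1;
            *result = stack[0];
            return 0;
        default:
            return -1;
        }
    }
    return -1;   /* ran off the end without BC_HALT */
}
```

The parser's job would then be to turn source text such as (2 + 3) * 4 into the array PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT; that array could equally well be loaded from a byte code file, and the engine could be swapped out without touching the parser.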

Embedded editor analysis

The issue at hand is the transfer of blocks of text from the shell to some random editor (keeping in mind that we are talking about the interactive shell and not the interpreter engine). The question is how to do this in a manner that is consistent across operating systems and editors. Some of the solutions that occur to me include:

(a) Write a temp file, invoke the editor of choice (presumably specified in a user config file), and read the temp file when the editor terminates. This solution is general but rather clumsy.

(b) Write a temp file. Invoke an editor (which may have been done previously - I usually keep an editor permanently open in a separate window) separately and edit the temp file. Read the temp file into the shell. This works in pretty much any windowed environment but not on a bare console screen environment.

(c) Make text selectable for cut and paste. Copy the selected text into an edit window and paste it. This is OS and editor dependent.

(d) Run the interpreter engine under the kitchen sink editor of one's choice and interface it to said editor. This obviously is possible (and anybody who wants to do it is welcome to do so) but isn't of particular interest to me.
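
Solution (a) is simple enough to sketch in ISO C. The function name edit_block is hypothetical, and both the editor command and the temp file path are taken as parameters on the assumption that they come from the user config file. A real implementation would want a safely created, unique temp file, and would need to quote the path if it can contain spaces.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write text to tmp_path, run "<editor_cmd> <tmp_path>", wait for the
   editor to terminate, and return the file's new contents in a
   malloc'd string (the caller frees it). Returns NULL on any failure,
   including a non-zero exit status from the editor. */
char *edit_block(const char *text, const char *editor_cmd, const char *tmp_path)
{
    char cmd[512];
    char *buf, *grown;
    size_t cap = 256, len = 0;
    int c, n;
    FILE *fp;

    /* 1. Write the block to the temp file. */
    fp = fopen(tmp_path, "w");
    if (fp == NULL) return NULL;
    fputs(text, fp);
    if (fclose(fp) != 0) { remove(tmp_path); return NULL; }

    /* 2. Invoke the editor; system() blocks until it exits. */
    n = snprintf(cmd, sizeof cmd, "%s %s", editor_cmd, tmp_path);
    if (n < 0 || n >= (int)sizeof cmd || system(cmd) != 0) {
        remove(tmp_path);
        return NULL;
    }

    /* 3. Read the (possibly modified) file back, growing as needed. */
    fp = fopen(tmp_path, "r");
    if (fp == NULL) { remove(tmp_path); return NULL; }
    buf = malloc(cap);
    while (buf != NULL && (c = fgetc(fp)) != EOF) {
        if (len + 1 >= cap) {
            grown = realloc(buf, cap * 2);
            if (grown == NULL) { free(buf); buf = NULL; break; }
            buf = grown;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (buf != NULL) buf[len] = '\0';
    fclose(fp);
    remove(tmp_path);
    return buf;
}
```

The shell would call something like edit_block(history_text, "vi", path) and feed the returned string back to the interpreter. The clumsiness noted above shows up in step 2: the shell is blocked for as long as the editor is open.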

Note that the entire issue only arises for transient text. Anything that is relatively permanent is (I expect) going to be in a file. Typical examples that occur to me are:

(a) You've entered a sequence of commands with a typo and you want to reissue the commands with the typo corrected. This goes beyond simple command line editing.

(b) You want to alter the contents of a data set as an experiment but don't want to alter the file from which the data set was extracted.

(c) Similarly you want to alter the text of a function as an experiment without altering the file in which it is defined.

Of these (a) is probably the one most likely to be used.

Matrix factoring of dependencies

This is something that the people at Bell Labs worked out some time ago. The general idea is that your conditional code has dimensions, in particular features and platforms, and you build up a database of feature/platform variations. An individual C file may look something like this:

#define NEED_TIME_OF_DAY 1
#include "dependencies.h"

In dependencies.h there is preprocessor code that looks something like

#elif SYSTEM_WIN95

There are three levels: a top level, which is the platform; an intermediate level, which holds common groupings of platforms, e.g. BSD, SYSV, LINUX, VMS, WIN16BIT, WIN32BIT, et cetera; and finally the specific features. You compile with the platform as a preprocessor argument. After that everything falls into place: the chained defines and conditionals pick up all of the bits and pieces, so that each file gets what it needs without having large amounts of preprocessor code in the individual files.
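
The three levels can be sketched in a single file, with the contents of dependencies.h inlined so that it compiles on its own. Only NEED_TIME_OF_DAY and SYSTEM_WIN95 come from the fragments above; SYSTEM_LINUX, SYSTEM_FREEBSD, the GROUP_ prefix, the generic fallback, and dep_time_of_day are my own illustrative names, not the Bell Labs scheme itself.

```c
/* ---- client file: say what you need, then include the header ---- */
#define NEED_TIME_OF_DAY 1
/* #include "dependencies.h"   (inlined below for this sketch) */

/* ---- dependencies.h, level 1: the platform ----
   Normally given on the command line, e.g. cc -DSYSTEM_LINUX ... */

/* ---- level 2: map the platform onto a common grouping ---- */
#if defined(SYSTEM_LINUX)
#define GROUP_LINUX 1
#elif defined(SYSTEM_FREEBSD)
#define GROUP_BSD 1
#elif defined(SYSTEM_WIN95)
#define GROUP_WIN32BIT 1
#else
#define GROUP_GENERIC 1   /* no platform given: ISO C only (assumption) */
#endif

/* ---- level 3: features, keyed on NEED_ macros and groupings ---- */
#ifdef NEED_TIME_OF_DAY

#if defined(GROUP_LINUX) || defined(GROUP_BSD)
#include <stddef.h>
#include <sys/time.h>
/* Seconds since the Unix epoch, microsecond resolution. */
double dep_time_of_day(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

#elif defined(GROUP_WIN32BIT)
#include <windows.h>
/* FILETIME counts 100 ns ticks since 1601; convert to Unix seconds. */
double dep_time_of_day(void)
{
    FILETIME ft;
    GetSystemTimeAsFileTime(&ft);
    return ((double)ft.dwHighDateTime * 4294967296.0
            + (double)ft.dwLowDateTime) / 1e7 - 11644473600.0;
}

#else
#include <time.h>
/* Fallback: whole seconds only, on systems where time_t is in seconds. */
double dep_time_of_day(void)
{
    return (double)time(NULL);
}
#endif

#endif /* NEED_TIME_OF_DAY */
```

The client file contains no platform conditionals at all: it defines NEED_TIME_OF_DAY, includes the header, and calls dep_time_of_day(). Adding a platform means touching level 2 (and level 3 only if the platform fits no existing grouping).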

This page was last updated September 9, 2001.
