The Shen OS Kernel Manual

copyright Mark Tarver 2016
all rights reserved

1. Introduction  

Note: In what follows we include with each section heading, a list of references including links. TBoS means The Book of Shen 3rd edition, SD is Shendoc 19.2.

Shen is a functional programming language developed by Mark Tarver and introduced in September 2011. It has been implemented in many languages including Common Lisp, Emacs Lisp, Scheme, Python, Haskell, Ruby, Clojure, Java, Javascript, and the JVM. It is notable for its extreme portability based on its reduced Kλ instruction set and the power of its type system. Unlike languages in the Hindley-Milner category, Shen incorporates a Turing-equivalent notation based on sequent calculus that compiles down into a dialect of Prolog. Shen Prolog is part of the system and was used to implement the type system.

In addition Shen contains its own compiler-compiler, Shen-YACC, which was used to build the reader and compiler for the language. Shen includes most of the features common to modern functional programming languages such as pattern-matching, currying, partial applications, guards and (optional) static type checking.

2. License  

The Shen kernel went open source in February 2015 when it transitioned to a 3-clause BSD license under version 19. Currently commercial development is underway on a cloud based version of Shen (Shen Professional) as a paid monthly service.

3. History TBoS p 14-17, p 388-394 SD, SD, SD

Shen owes its origins to work done in the 90s by Mark Tarver on applying high performance automated reasoning techniques to the problem of type checking programs. The first version of the work SEQUEL (SEQUEnt processing Language) premiered at the 1993 International Joint Conference on Artificial Intelligence under the title 'A Language for Implementing Arbitrary Logics'. Subsequently the language went through a cycle of development before emerging in 2005 as Qi. As with SEQUEL, Qi was written in Common Lisp. Qi was presented in 2008 under the title 'Lisp for the C21 Century' and at the European Conference on Lisp in 2009. In the latter address Tarver suggested that Qi could be implemented with an instruction set much smaller than the 118 Common Lisp system functions used to implement the language. The resulting language could be ported over a much wider range of platforms than Common Lisp.

In 2010 Vasil Diadov and other sponsors put up the capital to make this experiment happen. The result was the Shen kernel; delivered in 2011 and implemented in 43 (later 46) primitive functions under Common Lisp that defined a miniature Lisp called Kλ. Kλ was defined in what became the first standard for the new language; Shendoc 1.7. The latest standard is Shendoc 19.2.

Following the introduction of Shen in 2011, the language was rapidly ported to Scheme, Python, Clojure, Javascript, the JVM, Haskell and Emacs Lisp. The first book on Shen was published in 2012 and went through two revisions issuing in The Book of Shen 3rd edition in January 2015. This text is generally known as TBoS. The Shen kernel also went open source in February 2015.

In September 2015, another group of investors supported Tarver in forking the kernel to form the basis of a more extensive system with commercial applications. This version, Shen Professional (SP), added concurrency, graphics, better compilation, web connection and extended libraries to an upgraded kernel. Shen thus runs on the freemium model; with the kernel being OS and the enhanced SP available by subscription from 2017 onwards.

We strongly recommend anybody who masters the material in this manual and who wants to learn more, studies the references in The Book of Shen 3rd edition.

4. The Core Language TBoS p 20-125

4.1 Base Types
TBoS p 21, SD

The base types of Shen are symbols, strings, numbers and booleans. All these objects are self-evaluating; they evaluate to themselves when entered (note numbers may be subject to rounding and e numbers may be printed as e numbers or conventional numbers).

1. Introduction

2. License

3. History

4. The Core Language

4.1 Base Types
4.1.1 Symbols
4.1.2 Strings
4.1.3 Numbers
4.2 Function Applications
4.3 The Top Level
4.4 Arithmetic Operations and Relations
4.5 Comments
4.6 Sequences

4.6.1 Lists
4.6.2 Tuples
4.6.3 Vectors

4.7 Lambda Expressions and Local Assignments
4.8 Global Assignments
4.9 Higher Order Functions
4.10 Lazy Evaluation
4.11 I/O
4.12 Loading Files
4.13 Streams
4.14 Exceptions
4.15 Hashing
4.16 Property Lists
4.17 Eval

5 Defining Functions

5.1 Partial Functions and Tracking
5.2 List Handling Functions
5.3 String Handling Functions
5.4 Tuple Handling Functions
5.5 Vector Handling Functions
5.6 Guards
5.7 Backtracking
5.8 Writing in Kλ
5.9 Macros

6. Packages

7. Shen-YACC

7.1 Shen-YACC as a Recognisor Generator
7.2 Semantic Actions in Shen-YACC

8. Shen Prolog

9. Types

9.1 Basic Types and Constructors
9.2 Functions and Types
9.3 Synonyms

10 Sequent Calculus

10.1 Recursive Types
10.2 Exotic Types
10.2.1 Dependent Types
10.2.2 Negative Types
10.2.3 Subtypes
10.2.4 The Type of All Sets

11 Glossary of System Functions

12 The Syntax of Shen

4.1.1 Symbols TBoS p 21, SD

A symbol is series of alphanumeric characters plus the characters + ! $ % & * _ ' / . A symbol cannot begin with number. The exact syntax for symbols is given here.

Symbols are self-evaluating in Shen; the symbol a evaluates to aabc to abc. Symbols can be derived from interning suitable strings.

(1-) abc
abc

(2-) abc'
abc'

(3-) (intern "cheese")
cheese
4.1.2 Strings TBoS p 21,61-66 SD

Strings begin and end with double quotes. Strings can be created from atoms (strings, symbols and numbers) by the str function. A unit string is a string consisting of a single character; "w", "e", "4" etc.

The fast 2-place concatenation operator for strings is cn@s is a polyadic concatenation operator. (pos s n)selects the nth unit string from s starting from zero. tlstr returns all but the first element of a string.

"hello world"
"hello world"

(pos "hello world" 6)
"w"

(@s "1" "2" "3")
"123"

(tlstr "123")
"23"

The function make-string is polyadic and permits embedded directives in its first argument. A directive is either ~A, ~S, ~R, ~%.

  1. ~A is a directive to insert the corresponding argument as part of the string ignoring any double quotes.
  2. ~S is a directive to insert the corresponding argument as part of the string, not ignoring double quotes.
  3. ~R is a directive to insert the corresponding argument as part of the string, printing list structures with round brackets.
  4. ~% is a directive to format a new line.

Here is a script.

(make-string "a string~%")
"a string

"
(make-string "hello world~%")
"hello world

"

(make-string "~A says, hello world~%" "Fred")
"Fred says, hello world

"
(make-string "~S says, hello world~%" "Fred")
""Fred" says, hello world

"

(make-string "~S say, hello world~%" [Bill and Ben])
"[Bill and Ben] say, hello world

"

(make-string "~R say, hello world~%" [Bill and Ben])
"(Bill and Ben) say, hello world

"

c#n; where n is a natural number, embeds a unit string corresponding to code point n into a string. Shen supports code points 0-127 (the ASCII set) and may depending on the platform, support others. The functions n->string and string->n map a code point to a unit string and a unit string to a code point.

(n->string 56)
"8"

"
(string->n "y")

"121"

"hello worldc#33;"
"hello world!"

Note ($ 123) is read by the reader as ["1" "2" "3"]. $ is not a function however. The function explode which produces the same result is a function.

4.1.3 Numbers TBoS p 34-36, SD

Numbers in Shen are positive, negative or zero and either whole or floating point. The negative number -3 is just written '-3'. Cancellation is used so that --3 is read as 3. Integer numbers are treated as a subclass of floating points so that 1 = 1.0 is true. After Shen version 2.0  e notation was introduced 6.0221418e23 is equivalent to 6.02214181023

In Shen (- 3) is actually an application of two place subtraction to 3

and produces, not a number, but a closure. The exact syntax for numbers is given in the SD link to this section.

4.1.4 Booleans TBoS p 23-24, SD

The booleans in Shen are true and false. Basic boolean operations are or, and, if, not and cases. (if P Q R) evaluates Q if P is true, else it evaluates R if P is false and returns an error in all other cases. (cases P Q R S T U) is short for (if P Q (if R X (if T U error!))) where error! is triggered by all cases failing. and and or are polyadic - they accept n arguments and, where n >= 2, returns either true or false or an error if a non-boolean argument is evaluated. (or P Q R ...) returns true if one of P Q R ... returns true and (and P Q R ...) returns true if all P Q R return true.

(and true false)
false

(or true false)
true

(if (= 0 1) a b)
b

(cases (= 1 0) a
       (= 2 0) b
       true c)
c
4.2 Function Applications TBoS p 21-22

Applications in Shen are written in prefix form in the manner of Lisp. Partial applications are supported for nearly all functions and evaluate within Shen to closures. The mode of evaluation is applicative order evaluation and is strict*.

*Except in the case of the following; ifandorcasesfreezelet, /., lambda

(+ (* 7 8) 2)
58

(* 7)
#<FUNCTION :LAMBDA (#:Y18379) (multiply #:Y18378 #:Y18379)>

Certain functions that obey the associativity law, i.e. (f (f x y) z) = (f x (f y z)), are also polyadic. They accept n arguments; these include +, *, or, and and append. Thus (+ 1 2 3) is a legal expression.

4.3 The Top Level TBoS p 22-23

The Shen top level is a read-evaluate-print loop as in all functional languages. When you start it up, you get something this (depending on release and platform).

Shen, copyright (C) 2010-2015 Mark Tarver
www.shenlanguage.org, Shen 19
running under Common Lisp, implementation: SBCL
port 1.9 ported by Mark Tarver


(0-)
(0-) hello
hello

Each input is numbered starting with 0.

The top level has a feature borrowed from Unix of allowing past inputs to be repeated. !! repeats and evaluates the last input. !n repeats and evaluates the nth input. !s where s is a symbol repeats the last input whose leading function has s as a prefix. %n prints the nth input w.o. evaluation. %s prints the last expression entered whose prefix is s. % prints all expressions entered since the session began.

(0-) hello
hello

(1-) (* 7 8)
56

(2-) !1
(* 7 8)
56

(3-) !*
(* 7 8)
56

(4-) %*
1. (* 7 8)
2. (* 7 8)
3. (* 7 8)
4.4 Arithmetic Operations and Relations TBoS p 34-36, SD, SD, SD, SD

+, *, - and / are the basic arithmetic operations. + and * are polyadic. / produces a float if the divisor is not a factor; thus (/ 5 6) produces a float. =, >, < >= and <= are the usual relations. Notice = is polymorphically typed. In the absence of type checking, = will compare any two objects

4.5 Comments TBoS p 29-30, SD

Shen follows the convention of starting comments with \* and ending with *\. Multiline comments can include comments and new lines. Single line comments begin with \\ and are terminated with a new line.9

(* 7 \* This is a multiline
comment which is ignored *\6)

\\ This is a single line comment

4.6 Sequences TBoS p 49-52, TBoS p 190, TBoS p 85-98

Apart from strings, Shen contains several constructors for building sequences.

  1. List constructors.
  2. Tuple constructors.
  3. Vector constructors.
4.6.1 Lists TBoS p 49-52, SD

Lists in Shen are produced by placing the items in square brackets. The list of the numbers 1, 2 and 3 is just [1 2 3]. head and tail take the head and tail of a list.

[1 (+ 1 1) (+ 1 2)]
[1 2 3]

(head [1 2 3])
1

(tail [1 2 3])
[2 3]

cons adds an element to the front of a list. Shen includes an infix | which works in the manner of Prolog; [1 2 |[3]] conses 1 and 2 to the list [3]. If the second argument to cons is not a list then the result is a dotted pair in Lisp jargon.

(cons 1 [2 3])
[1 2 3]

[1 2 | [3]]
[1 2 3]

[1 | (+ 1 1)]
[1 | 2]
4.6.2 Tuples TBoS p 190, SD

Shen supports tuples; the function (@p a b) returns the pair of a and b@p is polyadic and associates to the right; (@p a b c) is shorthand for (@p a (@p b c))fst and snd select the first and second elements of a pair. The function tuple? recognises all and only tuples.

(@p (+ 1 1) a)
(@p 2 a)

(fst (@p a b))
a

(snd (@p a b))
b
4.6.3 Vectors TBoS p 85-98, SD, SD

(vector n) creates a (standard) vector with n elements numbered from 1 to n. For standard vectors the zeroth address holds an integer that gives the size of the vector. Certain special vectors called print vectors can be created that use this zeroth index for other purposes (see SD reference).The shortest standard vector is created by expression (vector 0) which creates a standard vector whose zeroth address contains the object zero. This called the empty vector and Shen permits the user to write <> as shorthand for the empty vector.

Vectors are printed out as <...> where ... are the contents of the vector.

(<-vector v n) accesses the nth (n >= 1) element of vector v. Note that this will raise an error if no object has been placed in the nth index.

The function limit accesses the 0th element of a vector v and thus gives its size.

Shen vectors are mutable. (vector-> v n x) destructively modifies v by placing x in the nth address of v.

A 2-dimensional array is simply a vector of vectors.

The non-destructive operation (@v x v) creates a new vector v' whose tail is the same as v and whose head is x@v is polyadic (@v A B ...N V) adds n elements in order to a vector V, copying it and creating a new vector.

(set *myvector* (@v 1 <>))
<1> 

(limit (value *myvector*))
1

(set *myvector* (@v 0 (value *myvector*)))
<0 1>

(limit (value *myvector*))
2

(@v -1 (value *myvector*))
<-1 0 1>

(limit (value *myvector*))  \\@v is non-destructive 
2

(<-vector (value *myvector*) 2)
1

(vector-> (value *myvector*) 2 a)
<0 a>

(value *myvector*) \\ vector-> is destructive, the global is changed
<0 a>
4.7 Lambda Expressions and Local Assignments TBoS p 56-57

(λ x (* x x)) is written as (/. X (* X X)) in Shen. It is customary to use capital letters for variables. (/. X Y Z (+ X Y Z)) is an acceptable abbreviation for (/. X (/. Y (/. Z (+ X Y Z)). You can also write (lambda X (+ X X)) though this form does not sustain abbreviations, (lambda X Y (+ X Y)) is not legal (it should be (lambda X (lambda Y (+ X Y))) ).

Local assignments are made using let. let takes a variable v, an expression e1 and an expression e2 and binds v to the normal form of e1 in the evaluation of e2.

let is polyadic; (let X1 a1 .... Xn an) is short for (let X1 a1 .... (let Xn an)...).

(let X 6 
     Y 5 
     (* X Y))
30
4.8 Global Assignments TBoS p 81-84, SD

Global assignments are made using set and the values of the variables recovered using value.

(set dozen 6)
6

dozen
dozen

(value dozen)
6

(bound? dozen)
true
4.9 Higher Order Functions TBoS p 70-80, SD

Higher order functions are supported. When passing a symbol denoting a function as an argument to a higher order function it is customary to use function to indicate that the symbol is denoting a function.

(define foldl
       F Z [] -> Z
       F Z [X | Xs] -> (foldl F (F Z X) Xs))
foldl

(foldl (function +) 0 [1 2 3])
6

(map (+ 1) [1 2 3 4])
[2 3 4 5]

(map (/. X (- X 1)) [1 2 3 4])
[0 1 2 3]
4.10 Lazy Evaluation TBoS p 123-126

Shen provides on-demand type secure lazy evaluation; the function freeze creates an object b; from its argument a without evaluating a. The function thaw takes b and evaluates it to the normal form of a. These functions satisfy the equation (thaw (freeze a)) = a.

(freeze (factorial 8))
#<FUNCTION :LAMBDA NIL (factorial 8)>

(thaw (freeze (factorial 8)))
40320
4.11 I/O TBoS p 99-108, SD

The basic read and write operations in Shen are constructed on top of two primitives of Kλ called read-byte and write-byte which take a stream as an argument. If no stream is given then these functions take the standard input and the standard output as default arguments. All other I/O system functions are built from these two primitives. These include

Type Functions Comments
Input read-byte, read, lineread, input, input+ (note all of the preceding take a stream as an optional argument which if not specified becomes the standard input)
Input read-file, read-file-as-bytelist, read-file-as-string  
Output write-byte, pr (note all of the preceding take a stream as an optional argument which if not specified becomes the standard output)
Output nl, output, print, write-to-file  

The functions are covered in the glossary of functions. In brief

function description
   
input takes the user input and evaluates it
input+ takes a type and a stream as an input and type checks the user input according to the type. If a type error is found, an exception is raised otherwise behaves as input
lineread reads the user input and when the input is terminated by a new line, returns the list of parsable tokens of that input
nl takes a non-negative integer n and prints n new lines. If no n is given prins a single new line.
output takes the same input as make-string and returns the same output but as a side-effect prints the resulting string
pr takes a string and a stream and prints the string to the stream
print prints any object returning that object
read takes the user input and parses it without evaluating the result
read-byte takes a stream and reads the first byte off it. If the stream is empty returns -1.
read-file Takes a file name (string) and returns the list of parsable tokens in it
read-file-as-bytelist Takes a file name (string) and returns the list of bytes in it.
read-file-as-string Takes a file name (string) and returns the contents of it as a string.
write-to-file Takes a file name f (string) and an expression e and writes the normal form of e to f overwriting any contents. If e is a string, then the contetns of e are written to f without the outer quotes.

In the examples below, the user input is in red.

(lineread)
1 2 3
[1 2 3]

(pr "hello world")
hello world"hello world"

(input)
(* 7 8)
56

(read)
(* 7 8)
[* 7 8]

(input+ number)
a
a is not of type number


(read-file "factorial.shen")
[define factorial 0 -> 1 X -> [* X [factorial [- X 1]]]]

4.12 Loading Files TBoS p 30- 31, SD

The function load loads a file containing a Shen program using the pathname as a string.

All files are opened relative to the value for the home directory. The value of this global is changed by the
cd function.

(load "factorial.shen")
factorial
loaded

(factorial 3)
6

(cd "My Workspace")
"My workspace/"

(load "factorial.shen")
OPEN: File "My Workspace/factorial.shen" does not exist

(cd "") {resets the home directory to default}
""

(load "factorial.shen")
factorial
loaded
4.13 Streams TBoS p 103-106, SD

The functions (stinput) and (stoutput) return the standard input and output streams. In Shen, streams are created to read from or to files. The function to create a stream is (open string direction). string is a pathname to the file and direction is in for source and out for sink.

The primitive read and write functions for streams are read-byte and write-byte. These read a byte from a source stream or print a string to a sink stream. close closes a stream.

(set *mystream* (open "gate-of-consciousness.ico" in)) {read a binary file}
#<INPUT BUFFERED FILE-STREAM (UNSIGNED-BYTE 8) #P"gate-of-consciousness.ico">

(read-byte (value *mystream*))
0

(read-byte (value *mystream*))
0

(read-byte (value *mystream*))
1

(close (value *mystream*))
[]
4.14 Exceptions TBoS p 121-123, SD

error has the same formatting features as output and uses a string to raise an error as an exception.

trap-error takes an expression e as its first argument and a function f as its second and evaluates e returning the normal form. But if an exception is raised from e, the exception is sent as an argument to f.

error-to-string converts any exception to the string from which it was derived.

(error "this is an error message, followed by a new line~%")
this is an error message, followed by a new line


(trap-error (error "this is an error message, followed by a new line~%") (/. E "I trapped the error."))
"I trapped the error."

(trap-error (error "this is an error message, followed by a new line~%") (/. E (error-to-string E)))
"this is an error message, followed by a new line

"
4.15 Hashing TBoS p 91-92, SD

(hash a n), where is any expression and n is a natural number, will give the hash value of a within the interval 0 to n.

(hash abc 10)
4

(hash abc 100)
94

(hash abc 1000)
294
4.16 Property Lists TBoS p 93-95, SD

Properly speaking property lists in Shen are property vectors, since vectors are used to store the information. However usage from Common Lisp has made the term ‘property list’ part of the common currency of computing.

The function (put object pointer value) creates a pointer from the object to the value which can be retrieved by (get object pointer)put is a destructive operation that uses hashing into a vector. Both put and get accept, as an optional final argument, a vector into which all pointers are created.

(put mark sex male)
male

(get mark sex)
male

(get mark height)
value not found
4.17 Eval TBoS p 93-95, SD

The eval function receives an expression e and returns the normal form e* of the expression eb, where eb results from e by replacing all square brackets by round ones.

(eval [+ 1 2])
3

(eval [defun car [cons X Y] -> X])
car

(car [1 2])
1
5. Defining Functions TBoS p 34-134

Shen has the usual destructuring capabilities found in pattern-directed languages like ML and Haskell. Pattern-matching capability is provided for all the main sequence types of the language; lists, strings, vectors and tuples. Variables are symbols beginning in uppercase. A wildcard _ is a variable used to signify an input whose identity is of no interest. A function definition without types has the form

(define <name> <rule1> ...<rulen>)

A function name is a symbol beginning in lowercase. Rules have the form

for k >= 0.

<pattern1> ...<patternk> -> <result> OR <pattern1> ...<patternk> -> <result> where <guard>

Alternatively

<pattern1> ...<patternk> <- <result> OR <pattern1> ...<patternk> <- <result> where <guard>

The backarrow <- has a special significance in backtracking. A pattern can be any atom (string, symbol, number, boolean) or sequences of patterns built by cons, @s, @v and @p. For convenience Shen treats [a b] and (cons a (cons b [])) as interchangable and | is used in patterns.

Variables are symbols beginning in uppercase. _ is a wildcard.

Patterns can be non-left linear. Repeated variables must be bound to the same value.

5.1. Partial Functions and Tracking TBoS p 26

In cases where no rule applies Shen will raise an error and offer to track the offending function.

(define likes
       tom dick -> yes
       dick harry -> yes
       harry tom -> yes)
likes

(likes tom dick)
yes

(likes dick fred)
partial function likes
track likes? (y/n) n

(track f) will trace a function f and (untrack f) will untrace it.

5.2 List Handling Functions TBoS p 49-60

The following function tests if one list is a prefix of another.

(define prefix?
  [] _ -> true
  [X | Y] [X | Z] -> (prefix? Y Z)
  _ _ -> false)
prefix?

(prefix? [1 2 3] [1 2 3 4 5 6])
true

(prefix? [] 8)
true
5.3 String Handling Functions TBoS p 66-69, SD

Eliminate "Julius Caesar" in favour of "Mark Anthony".

(define rep
       \\ replace all occurrences of "Julius Caesar" in a string by "Mark Anthony" 
       "" -> ""
       (@s "Julius Caesar" Ss) -> (@s "Mark Anthony" (rep Ss))
       (@s S Ss) -> (@s S (rep Ss)))
rep

(rep "Julius Caesar invaded Britain")
"Mark Anthony invaded Britain"

 
5.4 Tuple Handling Functions TBoS p 190, SD

Convert any tuple into a list.

(define tuple->list
\\ recurse through a tuple converting into a list 
  (@p X Y) -> [X | (tuple->list Y)]
   X -> [X])
tuple->list

(tuple->list (@p 1 2 3))
[1 2 3]
5.5 Vector Handling Functions TBoS p 88, SD

Pattern-matchng over vectors works when the vector is fully populated. Trying to access an element in an index which has not been assigned a value raises an error. Note vector destructuring in this way is slow and the facility is not often used. Faster means using for exist in Shen Professional.

(define vector-double
\\ non-destructively double every element in a vector 
   <> -> <>
   (@v X V) -> (@v (+ X X) (vector-double V)))
vector-double

(vector-double (@v 1 2 3 <>))
<2 4 6>
5.6 Guards TBoS p 43

Guards are also present in the language

(define greater-or-equal
  X Y -> X where (> X Y)
  _ Y -> Y)
greater-or-equal

(greater-or-equal 4 3)
true
5.7 Backtracking TBoS p 127-139

Backtracking is invoked in a Shen function f by using <- in place of ->. The effect is that the expression after the <- is returned only if it does not evaluate to the failure object (fail). If (fail) is returned; then the next rule in  f is applied.

(define foo
        X <- (if (integer? X) 0 (fail))
        X -> X)
foo

(foo 5)
0

(foo a)
a
5.8 Writing in Kλ TBoS p 170-181, SD, SD, SD, SD

Shen compiles into a miniature Lisp called Kλ. It is possible to communicate directly to Kλ through the Lisp top level. Kλ is very much an orthodox Lisp in the manner of Scheme, except symbols are not quoted (as in Shen). There is no significance associated with uppercase letters because pattern-matching does not exist in Kλ.

(defun rev (x)
        (if (empty? x) x 
            (append (rev x) (cons (hd x) ()))))
rev

(rev [1 2 3])
[3 2 1]
5.9 Macros TBoS p 108-120, SD

Shen macros allow the user to program the Shen reader. Programs are read into the reader as list structures, just as in Lisp. By manipulating these list structures, we can program in our own syntax. A Shen macro has the same syntax as a Shen function except that

  1. defmacro is used instead of define.
  2. The macro must define a 1-place function.
  3. The default rule X -> X is inserted by Shen when compiling the macro.

All macros are applied to a fixpoint to every token read by the reader. i.e. let m1,...mn be the list of macros existing in the system and let f be the composition of m1,...mn. Then every token t read by the reader is transformed to fix(f,t) where fix is defined as

fix(f,x) = x if f(x) = x
fix(f,x) = fix(f, f(x)) if ~(f(x) = x)

For example, suppose we have created a 2-place function log and wish to create a version of log that takes the second argument as optional. If the optional argument is omitted, we want the log function to work to the base 10.

To create the illusion of polyadicity, we use our two place log function and write a macro to insert the second argument if it is missing.

(log 100)
#<FUNCTION :LAMBDA (#:Y191) (log #:Y190 #:Y191)> {the second argument is missing, so the result is a partial application}

(1-) (defmacro logmacro
       [log N] -> [log N 10])
macro
logmacro

(log 100)
2

Note a macro has to be read in and compiled before it is used. It cannot be used to macroexpand expressions within the file it is defined in.

6 Packages TBoS p 113-120, SD

The polyadic function package has the form (package S L E1 ... En) where

  1. S is a symbol beginning in lowercase which is the name of a package; (e.g mypackage).
  2. L is an expression evaluating to a list (possibly empty) of non-variable symbols external to the package.
  3. E1 ... En are a series of Shen expressions.

The Shen reader prepends the package symbol followed by dot before all the lowercase user symbols when evaluating E1 ... En except names of Shen system functions and those listed in L

(package my-stuff [main] a b c append main)
my-stuff.a
my-stuff.b
my-stuff.c
append
main

Packages are used to enclose programs to avoid name clashes and in macros to generate programs. The function external applied to a package name gves the list of symbols external to that package.

7 Shen-YACC TBoS p 140-154, SD

Shen-YACC is a compiler-compiler based on earlier work by Dr Gyorgy Lajos on his METALISP Ph.D. project which was in turn based on even earlier work by Alexander Birman on TDPL parsing. The syntax of YACC is based on the BNF notation of Backus. Used at its most basic level, YACC is a generator for recognisors based on grammars. More usefully, YACC can be used to develop efficient compilers for programming languages and transducers for natural language processing.

7.1 YACC as a Recognisor Generator TBoS p 141

Consider the following grammar.

<sent> := <np> <vp>
<np> := <det> <n> | <name>
<det> := the | a
<n> := cat | dog
<name> := Bill | Ben
<vp> := <vtrans> <np>
<vtrans> := likes | chases

In Shen-YACC, this grammar would be represented as:

(defcc <sent>
  <np> <vp>;)
warning: <np> <vp> has no semantics.
<sent>

(defcc <det>
the; a;)
warning: the has no semantics.
warning: a has no semantics.
<det>

(defcc <np>
  <det> <n>;<name>;)
warning: <det> <n> has no semantics.
warning: <name> has no semantics.
<np>

(defcc <n>
  cat; dog;)
warning: cat has no semantics.
warning: dog has no semantics.
<n>

(defcc <name>
  bill; ben;)
warning: bill has no semantics.
warning: ben has no semantics.

(defcc <vp>
 <vtrans> <np>;)
warning: <vtrans> <np> has no semantics.
<vp>

(defcc <vtrans>
   likes; chases;)
warning: likes has no semantics.
warning: chases has no semantics.
<vtrans>

If semantic actions (i.e instructions on how to process the input) are not included, Shen-YACC warns the user and inserts a default semantic action (semantic completion). This default action appends non-terminals and conses terminals to form an output. The spacing is left to the judgement of the programmer, but ;s separate rules. When one of these definitions (e.g. for <sent>) is entered to the top level, it is compiled into Common Lisp by YACC with the message warning; no semantics in <np> <vp>.

The compiler generated acts as a recogniser for sentences of the language characterised by this grammar. If it is not a sentence of the language, then an exception is returned. If the input to the compiler belongs to this language, the program behaves as an identity function. The compiler is invoked by the higher-order function compile, that receives the name of a YACC function and parses its second input by that function.

(compile (function <sent>) [the cat likes the dog]) 
[the cat likes the dog]

(compile (function <sent>)[the cat likes the canary])
parse error

(compile (function <vp>) [chases the cat])
[chases the cat]

Note that names of YACC functions should always be enclosed in angles. YACC is sensitive to left-recursion which will force an infinite regress. YACC code is not type checked, but the code can be tracked just like regular code. Lists to the right of := are constructed in YACC using […] or cons or any of the conventional methods. Unlike Shen, the constructor | cannot be used in the syntax of an expansion (i.e. to the left of :=), though it can be used to the right (in a semantic action) to perform consing. However […] can be used to the left of :=Variables and wildcards are allowed to pattern match under Shen-YACC as in Shen and lists can be embedded in the input.

<bcs>, below, recognises the inputs belonging to [bm][cn]. Variables are uppercase as in Shen.

(defcc <bcs>
  <bs> <cs>;)
warning: [cons <bs> []] [cons <cs> []] has no semantics.
<bcs>

(defcc <bs>
  b <bs>;
  b;)
warning: b <bs> has no semantics.
warning: b has no semantics.
bs>

(defcc <cs>
  c <cs>;
  c;)
warning: c <cs> has no semantics.
warning: c has no semantics.
<cs>

(compile (function <bcs>)[b b b c c])
[b b b c c]
7.2 Semantic Actions in YACC TBoS p 142-143

Semantic actions are attached to grammar rules by following each rule by a :=. This YACC definition receives a list of as and changes them to bs.

(defcc <as>
  a <as> := [b | <as>];
  a := [b];)
<as>

(compile (function <as>) [a a a a a])
[b b b b b]

The first rule says that any input of the form a <as> is to be translated into an output consisting of b consed to the translation of <as>. The syntax of <as> requires that the input be a non-empty list composed of occurrences of a. So (compile (function <as>) [a a a]) gives [b b b]. The second rule is the base case.

As in Shen, round brackets signify function applications and square ones form lists. The following reformulation is an example:

(defcc <sent>
  <np> <vp> := (question <np> <vp>);)
<sent>
         
(define question
  NP VP -> (append [(protect Is) it true that] NP VP [?]))
question

(compile (function <sent>) [the cat likes the dog])
[Is it true that the cat likes the dog ?]

<!> and <e> are both reserved non-terminals. <e> always succeeds consuming none of the input and under semantic completion returns the empty list. <!> always succeeds and consumes all of the input and under semantic completion returns that remaining input.

8 Shen Prolog TBoS p 291-314, SD

Here are the memberreverse and append functions in Shen Prolog.

(defprolog member
   X [X | _] <--;
   X [_ | Y] <-- (member X Y);)
member

(defprolog rev
       [] [] <--;
       [X | Y] Z <-- (rev Y W) (conc W [X] Z);)
rev

(defprolog conc
  [] X X <--;
  [X | Y] Z [X | W] <-- (conc Y Z W);)
conc

(prolog? (member 1 [1 2]))
true

(prolog? (member 0 [1 2]))
false

(prolog? (member X [1 2]))
true

(prolog? (member X [1 2]) (return X))
1

(prolog? (rev [1 2] X) (return X))
[2 1]
9 Types TBoS p 187-369

 

9.1 Basic Types and Constructors TBoS p 187-200

The basic datatypes and types of Shen are

Typing (tc +) to the REPL activates type checking. Here are a few examples.

a
a : symbol

"hello world"
"hello world" : string

(= 4 5)
false : boolean

(* 2.3 2)
4.6 : number

[1 2 3]
[1 2 3] : (list number)

(@v 1 2 3 <>)
<1 2 3> : (vector number)

(@p 1 2 a)
(@p 1 (@p 2 a)) : (number * (number * symbol))

(@s "10" " green" " bottles")
"10 green bottles" : string

(freeze (* 7 8))
#<FUNCTION :LAMBDA NIL (multiply 7 8)> : (lazy number)

(10 +) (* 7)
#<FUNCTION :LAMBDA (#:Y18379) (multiply #:Y18378 #:Y18379)> : (number --> number)
9.2 Functions and Types TBoS p 192 onwards

Shen is an explicitly typed polymorphic language in the manner of Hope; it requires that functions be entered with their types. A --> B --> C is shorthand for A --> (B --> C). Signatures are entered in {...} immediately after the name of the function.

(define member
  {A --> (list A) --> boolean}
  _ [] -> false
  X [X | _] -> true
  X [_ | Y] -> (member X Y))
member : (A --> ((list A) --> boolean))

(define square
  {number --> number}
  X -> (* X X))
square : (number --> number)

(define swap
  {(A * B) --> (B * A)}
  (@p X Y) -> (@p Y X))
swap : ((A * B) --> (B * A))

(define unit-vector?
 {(vector A) --> boolean}
 (@v _ <>) -> true
      _ -> false)
unit-vector? : ((vector A) --> boolean)

(define unit-string?
  {string --> boolean}
  (@s _ "") -> true
  _ -> false)
unit-string? : (string --> boolean)

(member 1 [1 2 3])
true : boolean

(square 4)
16 : number

(swap (@p 1 2))
(@p 2 1) : (number * number)

(unit-vector? (@v 1 <>))
true : boolean

(unit-string? "a")
true : boolean
9.3 Synonyms TBoS p 223

Synonyms allow the use of shorthands for types. All types are normalised to their definiens.

(synonyms coor (number * number))
synonyms : symbol

(@p 1 2) : coor
(@p 1 2) : (number * number)
10 Sequent Calculus TBoS p 201-233

In Shen, datatypes are formalised in a series of (single conclusion) sequent calculus rules. If we want to introduce a new type t, then we have to write down a series of deduction rules describing the conditions under which an object x can be proved to be an inhabitant of t. For clarity, these rules are organised into datatypes usually named after the type defined. For instance, we want to create a type colour in which red, yellow and green are colours. In sequent format, we write:

_________
red : colour

____________
yellow : colour

____________
green : colour

In Shen ...

 (datatype colour

    ____________
    yellow : colour;

    __________
    red : colour;

    ___________
    green : colour;)
  type#colour

  red
  red : symbol

  red : colour
  red : colour

  blue : colour
  type error

The term red is now overloaded - it is both a symbol and a colour. Shen plumps for the base type first when overloading is present.

The use of 3 deduction rules is otiose - only one is needed if a side condition is placed before the rule. A side condition is signalled by the use of if  followed by some boolean expression, or let followed by a variable and an expression.

(datatype colour

   if (element? X [red yellow green])
   _________________________________
   X : colour;)
type#colour

Let’s suppose we were writing a card game and we want to use lists like [ace spades] [10 hearts] [5 diamonds] [jack clubs] as cards. If we were to enter [5 diamonds] to Shen it would come back with a type error. So we want to define a type card which is the type of all cards. A card is a 2-element list; the first element being a rank and the second a suit.

(datatype rank

   if (element? X [ace 2 3 4 5 6 7 8 9 10 jack queen king])
   ________________________________________________________
   X : rank;)
type#rank

(datatype suit

   if (element? Suit [spades hearts diamonds clubs])
   ________________________________________________
   Suit : suit;)
type#suit

(datatype card

  Rank : rank; Suit : suit;
  _________________________
  [Rank Suit] : card;

  Rank : rank, Suit : suit >> P;
  _________________________________
  [Rank Suit] : card >> P;)
type#card

The first rule says that a two-element list can be proved to be of the type card provided the first and second elements can be proved to be a rank and a suit respectively. The second rule says that given any proof in which it is assumed a two element list is a card, we can replace this assumption by the assumptions that the first and second elements are a rank and a suit. We need both rules to complete the identification of cards with pairs of ranks and suits f we do not use synonyms). Note that semi-colons separate individual goals to be proved; >> is the Shen turnstile and commas are used to separate individual formulae in the list of assumptions to the left of >>.

Shen permits a shorthand for expressing this type;

(datatype card

   Rank : rank; Suit : suit;
   =========================
   [Rank Suit] : card;)
type#card

Here are some sample inputs.

[5 spades]
[5 spades] : card

[king hearts]
[king hearts] : (list symbol)

[king hearts] : card
[king hearts] : card

(define get-suit
  {card --> suit}
  [Rank Suit] -> Suit)
get-suit : (card --> suit)
10.1 Recursive Types TBoS p 225-231

Now consider a simple recursive type; the type natnum of all natural numbers in successor notation:

0
[succ 0]
[succ [succ 0]] ...

(datatype natnum

   ___________
   0 : natnum;

   N : natnum;
   ==================
   [succ N] : natnum;)
type#natnum

0
0 : number

0 : natnum
0 : natnum

[succ 0] 
[succ 0] : natnum

A more complex type; the type of all fully parenthesised arithmetic expressions in list form.

(datatype arithexpr

 N : number;
 ___________
 N : arithexpr;
     
 if (element? Op [+ / - *])
 M : arithexpr; N : arithexpr;
 =============================
 [M Op N] : arithexpr;)
     
[9 + 7]
[9 + 7] : arithexpr
10.2 Exotic Types  

We'll now look at types not readily formalised in some of the Hindley-Milner languages.

10.2.1 Dependent Types  

In mainstream functional languages, terms to the left of : and types to the right of : are disjoint. In dependent types, types either incorporate terms or terms types. Here's an example; I want a dependently typed function myprog that takes a type as an input and enforces that type on the input. myprog does nothing special in this example; it simply returns the input.

(datatype my-prover-types

  P : Type;
  _______________________
  (myprog Type X) : Type;)
type#my-prover-types

(define myprog
  Type P -> P)

(myprover symbol p)
p : symbol

(myprog number p)
type error
10.2.2 Negative Types  

This requires advanced features discussed in TBoS. Something belongs to the type (not number) if it cannot be proved to be of the type number.

(datatype not

   X : A; !; fail!;
   _____________________
   X : (mode (not A) -);   
   
   _____________________
   X : (mode (not A) -);)
  
type#my-prover-types

f : (not number)
f : (not number)

f : (not symbol)
type error

[4 r] : (list (not string))
[4 r] : (list (not string)) 
10.2.3 Subtypes TBoS p 241-244, SD

A is a subtype of B if every inhabitant of A is an inhabitant of B. Here we define subtype and also the type int of all integers. We declare that int is a subtype of number, that anything that passes the integer? test is an int and that +, - and * can be used to construct ints. We can now verify the type of +int.

(datatype subtype
  
   (subtype A B); X : A;
   _____________________
   X : B;)
type#subtype
   
(datatype int

  if (integer? N)
  ________________
  N : int;
  
  ____________________
  (subtype int number);
  
  if (element? f [+ - *])
  __________________________
  f : (int --> int --> int);)   
type#int
  
(define +int
  {int --> number --> number}
   M N -> (+ M N))  
 +int : (int --> (number --> number))   
10.2.4 The Type of All Sets  

set is a type operator that applies to any list without duplicated elements. This code again uses some advanced material from TBoS.

(datatype set

   _____________
   [] : (set A);   
   
   X : A; Y : ((set A) & (without X));
   ===================================
   [X | Y] : (mode (set A) -);
   
   X : (set A);
   _____________
   X : (list A);
   
   X : P; X : Q;
   ====================
   X : (mode (P & Q) -);   

   ________________
   [] : (without X);

   if (not (= X Z))
   Y : (without Z);
   ______________________
   [X | Y] : (without Z);  
   
   ______________________________________________
   (not (element? X Y)) : verified >> Y : (without X);) 
    

We can show this works and define set union,

[1 2 3] : (set number)
[1 2 3] : (set number)

[1 2 3 1] : (set number)
type error
   
(define un
  {(set A) --> (set A) --> (set A)}
  [] S -> S
  [X | Y] S -> [X | (un Y S)]     where (not (element? X (un Y S)))
  [_ | Y] S -> (un Y S))
un : ((set A) --> ((set A) --> (set A)))
    
11 Glossary of System Functions TBoS p 370-385

You can find a glossary of system functions here.

 

12 The Syntax of Shen TBoS p 386-387

The syntax of Shen is presented as a context-free grammar in BNF notation annotated where necessary to explain certain context-sensitive restrictions. Terminals which represent parts of the Shen language are bolded, and in particular the bar | in Shen is bolded to distinguish it from the | used in the BNF to separate alternatives. For all X, the expansion <X> ::= ε | … indicates that <X> may be expanded to an empty series of expressions.

Syntax Rules for Shen Definitions

<Shen def> ::= (define < lowercase > {<types>} <rules>) | (define <lowercase> <rules>) | (defmacro <lowercase> <rules>)
<lowercase> ::= any <symbol> not beginning in uppercase
<rules> ::= <rule> | <rule> <rules>
<rule> ::= <patterns> -> <item> | <patterns> <- <item> | <patterns> -> <item> where <item>| <patterns> <- <item> where <item>
<patterns> ::= ε | <pattern> <patterns>
<pattern> ::= <base> (except -> and <-) | [<pattern> <patterns>
| <pattern>] | [<patterns>] | (cons <pattern> <pattern>) | (@p <pattern> <pattern> <patterns>) | (@s <pattern> <pattern> <patterns>) | (@v <pattern> <pattern> <patterns>)
<item> ::= <base> | [<items>
| <item>] | [<items>] | <application> | <abstraction>
<items> ::= <item> | <item> <items>
<base> ::= <symbol> (except
|) | <string> | <boolean> | <number> | ( ) | [ ]
<application> ::= (<items>)
<abstraction> ::= (/. <variables> <item>)
<variables> ::= <variable> | <variable> <variables>
<variable> ::= any <symbol> beginning in uppercase
<types> ::= ε | (<types>) | <types> --> <types> | <symbol>
<symbol> := <alpha> <symbolchars> | <alpha>
<symbolchars> := <alpha> <symbolchars> | <digit> <symbolchars> | <alpha>| <digit>
<alpha> ::= a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | x | y | z | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z | . | = | - | * | / | + | _ | ? | $ | ! | @ | ~ | > | < | & | % | ' | #| ` | ; | : | { | }
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<number> ::= <signs> <float> | <signs> <integer> | <signs> <e-number>
<signs> ::= ε | + <signs> | - <signs>
<float> ::= <digits> . <digits> | . <digits>
<integer> ::= <digits>
<digits> ::= <digit> | <digit> <digits>
<e-number> ::= <integer> e - <integer> | <integer> e <integer> | <float> e - <integer> | <float> e < integer>

Syntax Rules for Shen Datatype Definitions

<datatype_definition> ::= (datatype <lowercase> <datatype-rules>)
<datatype-rules> ::= <datatype-rule> | <datatype-rule> <datatype-rules>
<datatype-rule> :: = <side-conditions> <schemes> <underline> <scheme>; | <side-conditions> <simple schemes> <double underline> <formula>;
<side-conditions> ::= ε; | <side-condition> | <side-condition> <side-conditions>
<side-condition> | if <item> | let <variable> <item>
<underline> ::= _ | one or more concatenations of the underscore _
<double underline> ::= = | one or more concatenations of =
<simple schemes> ::= <formula> ; | <formula> ; <simple schemes>
<formula> := <item> : <item> | <item>
<schemes> ::= ε | <scheme> ; <schemes>
<scheme> ::= <assumptions> >> <formula> | <formula>
<assumptions> ::= <formula> | <formula>, <assumptions>

Syntax Rules for Shen Prolog Definitions

<prolog_definition> ::= (defprolog <lowercase> <clauses>)
<clauses> ::= <clause> | <clause> <clauses>
<clause> ::= <head> <-- <tail>;
<head> ::= <prolog-patterns>
<prolog-patterns> ::= ε; | <prolog-pattern> <prolog-patterns>
<prolog-pattern> ::= <base>| [<prolog-pattern> <prolog-patterns>
| <prolog-pattern>] | [<prolog-patterns>] | (cons <prolog-pattern> <prolog-pattern>)
<tail> := ε; | <application> <tail>