An Enchilada program or expression is written in postfix syntax, with the following advantages:
Postfix syntax (together with term rewriting and immutability) enables powerful equational semantics. In turn, this also allows rewriting rules to be applied syntactically.
For example, a simple rule that can always be applied is the double negation rule:
X - - Y <==> X Y
The most general rule is the equational rule that holds for all A, B and X that are Enchilada expressions:
A X == B X <=> A == B
Unlike mainstream programming languages, Enchilada uses a term(=expression) rewriting engine as its computational model.
Expressions can be potentially rewritten when they contain operators, next to valid operands. There are two types of operators: unary and binary.
An unary operator rewrites one operand to its left while a binary operator rewrites two operands from its left. Postfix expressions can even be rewritten in parallel by using the following algorithm:
1 2 3 4 + 5 6 - 7 8
1 2 3 4 + 56 - 7 8
1 23 4 + 5 6 - 7 8
1 23 4 + 56 - 7 8
1 2 7 5 _6 7 8
An expression is built from four types of atoms: sequences, operators, symbols and strings.
Every operand in Enchilada is of one type: the sequence. A sequence is defined to be a list of expression pairs. Depending on the order and the value of pairs, sequences can be interpreted as numbers, (multi)sets or (multi)maps.
Syntactically, a sequence is a list of n pairs enclosed by square brackets.
Each pair in a sequence holds two expressions - left and right - which are syntactically separated by an equals(=) symbol. The left is defined as the key of a pair, and the right as the value of a pair:
[k(1)=v(1);k(2)=v(2);... ;k(n-1)=v(n-1);k(n)=v(n)]
Because each pair is built from expressions, sequences and expressions can be nested at alternating levels:
[1 2 +=[[3=5]=5;=6 7 *;4=[[[2]]=+]]]
A value or key is allowed to be an empty expression. A sequence is interpreted as a number if all its keys and values are empty.
As an result, addition is equal to the concatenation of 'empty-spaced' sequences:
[=;=;=;=] [=;=] + == 4 2 + [=;=;=;=;=;=] == 6
The equals symbol is dropped when a pair has an empty value:
[1=2;3=;4=;=5;=6;7=8] == [1=2;3;4;=5;=6;7=8]
Sequences carry sign. A sequence carries negative sign when it is prefixed with an underscore.
The sign of a sequence can be turned with the negate(-) operator:
10 3 4 + - + == 10 7 - + == 10 _7 + == 3
All the arithmetical operations work as expected on signed numbers and signed sequences:
_[1;2;3;4;5] 3 + == _[1;2]
Other operators (such as combine) use sign to overload their behaviour.
Rather than considering only one item at each processing step, it can be much more succinct or efficient to (concurrently) process a sequence of items as if it were one single unit.
The Enchilada core is especially designed to provide the basic functionality to manipulate sequences of complex data in numerous ways.
Moreover, because of Enchilada's vector-like operators, simple algorithms like sum or product can be implemented without looping or recursion.
The following paragraphs describe each operator in more detail:
The binary multiply(*) operator is Enchilada's most versatile operator: it can be used as a basis to implement map, fold and other purely functional vector operations.
The base case of multiplication is when two sequences - holding one pair each - are multiplied. In this case, keys and values of both sequences will be concatenated independently as follows:
[1=2] [3=4] * == [1 3=2 4]
In the case that either sequence holds more than one pair, the Cartesian product is applied:
[1;2;3] [4;5] * == [1 4;1 5;2 4;2 5;3 4;3 5]
Interestingly, the Cartesian product of two numbers also yields a number:
[=;=;=;=] [=;=;=] * == 4 3 * [=;=;=;=;=;=;=;=;=;=;=;=] == 12
The base case of the binary maximum(|) operator is the same as multiplication, i.e. keys and values are concatenated independently when both operands hold one pair:
[1=2] [3=4] | == [1 3=2 4]
In the case that either operand holds more than one pair, pairs are drawn from both operands and are combined on equal index and for all indices up to a maximum (the maximum size of either operand):
[1;2;3;4;5;6] [7;8;9] | == [1 7;2 8;3 9;4;5;6]
Similar to multiplication and concatenation, the maximum of two numbers also yields a number:
[=;=;=;=] [=;=] | == 4 2 | [=;=;=;=] == 4
The binary minimum(&) operator compares pairs, drawn from each sequence, for all indices up to a minimum.
For each index, the greatest pair is chosen:
[3;5;7;9;11;13;15] [0;3;6;9;12;15] & == [3;5;7;11;13;15]
The binary modulus(%) operator trims the first or second operand to a length that is equal to the modulus of their sizes.
The modulus operation yields zero if either operand is zero:
If both operands have positive or negative sign, the first operand is trimmed. Otherwise the second operand is trimmed.
[1;2;3;4;5;6;7;8] 5 % == [1;2;3]
_[1;2;3;4;5;6;7;8] _5 % == _[1;2;3]
_3 [1;2;3;4;5;6;7] % == [1;2;3;4]
3 _[1;2;3;4;5;6;7] % == _[1;2;3;4]
The divide(/) operator divides the (size) of the first operand, given the greatest common divisor (gcd) between both operand sizes.
Visually, pairs in the first operand are indexed via a 2-dimensional line algorithm, with the gcd as the width:
[ 1; 2; 3; 4; 5; 6; 7; 8; 9; 10;11;12; 13;14;15; 16;17;18] 3 / == [1;4;8;11;15;18]
With divide, a sequence can be interpreted as a matrix and transposed as such:
4 [ 1; 2; 3; 4; 5; 6; 7; 8; 9;10;11;12] * 4 / == [1; 5; 9; 2; 6;10; 3; 7;11; 4; 8;12]
A sequence can be reversed with the unary reverse(`) operator. Reversing a sequence twice yields the same sequence:
[1;2;3;4;5] ` ` == [5;4;3;2;1] ` == [1;2;3;4;5]
The unary iota operator(~) replaces every key in a sequence with its index:
[a=x;b=y;c=z] ~ == [0=x;1=y;2=z]
The iota of number x, yields a sequence with numbers 0 to (x-1):
10 ~ == [0;1;2;3;4;5;6;7;8;9]
A sequence is defined to be a multiset when all its pairs are sorted. A multiset may hold duplicate pairs:
[a=1;b=2;c=3;c=3;e=5;e=5;f=6]
The 'is-sorted' invariant is cached for all (sub)sequences. As a result, nearly sorted sequences are sorted very efficiently by Enchilada.
The binary combine (<) operator takes two sequences, sorts them, and then applies a multiset operation. Depending on sign, a union, difference or intersection is performed.
A union is performed when both sequences have positive sign:
[1;2;2;3;4;5] [1;2;4;4;4;5] < == [1;1;2;2;2;3;4;4;4;4;5;5]
A left-difference is taken when the second sequence has negative sign. The right-difference is taken when the first sequence has negative sign:
[1;2;2;2;3;3;4;5] _[1;2;2;3;3;3;5] < == [2;4]
An intersection is performed when both sequences have negative sign:
_[1;1;2;3;4;4;5;5] _[2;3;4;4;4;5] < == [2;3;4;4;5]
The unary turn(:) operator swaps key and value for each pair in a sequence. Turning a sequence twice yield the same sequence:
[a=1;b=2;c=3] : : == [1=a;2=b;3=c] : == [a=1;b=2;c=3]
The unary wipe(#) operator replaces every key in a sequence with an empty expression. Wiping a sequence multiple times will yield the same sequence as when it is wiped once.
Combined with turn, wipe can be used to wipe keys, values or both:
[a=1;b=2;c=3] # : # == [=1;=2;=3] : # == [1;2;3] # == 3
A multiset can be accessed as a multimap via the binary match(>) operator. The match operation indexes keys to produce associated values.
For each matching key, associated values are concatenated into a sequence. Non matching keys will yield the 0 sequence:
[a=1;b=2;b=3;c=4;d=5] [a;c;a;b;f;d] > == [[1];[4];[1];[2;3];0;[5]]
When the unary unique operator(') is applied on a multimap, pairs that hold the same key will be collapsed into one. Applying unique multiple times will yield the same result as when is applied once:
[a=1;b=2;b=3;c=4] ' ' == [a=1;b=2 3;c=4] ' == [a=1;b=2 3;c=4]
Two sequences can be tested on equality with the binary equals(?) operator. Equals yields one if two sequences are equal; otherwise, it yields zero:
0 [a;b;c] [a;b;c] ? ? == 0 1 ? == 0
Key expressions in a sequence can be chopped into single atoms with the unary chop(\) operator.
First, all the keys of a sequence are concatenated into one expression and then each atom of that expression is turned into a key. Values are destroyed by chop:
[1 2 3=4 5 6;7 8 9=10 11 12] \ == [1;2;3;7;8;9]
The meta operator rewrite(@) takes two operands. The size of second operand determines the number of subsequent rewrites of all the (expression)pairs in the first operand:
[1 2 + 3 4 + +=5 6 + 7 8 + + -] 2 @ == [10;26 -]
Most sequences in Enchilada are lazy. For example, reversing(`) a sequence yields a lazily reversed sequence.
The unary force(!) operator forces a sequence to be strict.
There is no syntactic difference between a lazy and strict sequence, except that a strict sequence may require less memory internally and, potentially, less computation.
However, forcing a big lazy sequence to be strict may require extreme computational resources.
Only the lazy parts of a sequence will be optimally forced to be strict: every (sub)sequence that is already strict yields the same (sub)sequence and adds no extra runtime cost:
[8;7;6] [1;2;3;4;5] ` + ! == [8;7;6] [5;4;3;2;1] + ! == [8;7;6;5;4;3;2;1] ! == [8;7;6;5;4;3;2;1]
A sequence is de-solved(.) into a expression by (syntactically) removing its square brackets, semi-colons and equals symbols:
[1=2;3 4;5 6=7] . ==1 2 3 4 5 6 7
A symbol is a string of alpha-numeric and underscore characters, that must be postfixed with a letter:
This_is_a_symbol one1_two2_three3 etc etc
A symbol is a passive placeholder which cannot be rewritten by normal operators. Instead, lambdas or the replace operator must be used to replace symbols with values.
Symbols may appear anywhere, even at the top level expression.
A lambda is a unary operator that replaces a symbol in its lexical scope when it is rewritten.
Syntactically, a lambda is a list of symbols + an expression, separated by an equals symbol and enclosed by curly brackets. Everything between the curly brackets is defined to be the lambda's scope:
{symbol-list = expression}
When a lambda is rewritten, a symbol is popped from the symbol list. Then all symbols that match the popped symbol are replaced by the sequence sitting left to the lambda:
1 {a=a a} == 1 1
1 2 {a b=b a} == 1 {a=2 a} == 2 1
1 2 {b={a=b a}} == 1 {a=2 a} == 1 2
1 2 {a b=[a;[b]]} == {a=[a;[2]]} == [1;[2]]
[c] [d] {a b=[a=b]} == [c] {a=[a=[d]]} == [[c]=[d]]
Alternatively, symbols can be replaced with expressions via eager lambdas. Eager lambdas are denoted with two equals symbols(==) instead of one.
An eager lambda replaces each symbol in scope with a de-solved sequence.
[+] {p=={x y=x y p}} == {x y=x y +}
[1=2;3=4;+] {d=={x=x d x}} == {x=x 1 2 3 4 + x}
Symbols can also be replaced with expressions via the binary replace(^) operator.
The replace operator takes two sequences: the first sequence holds (nested) symbols that need to be replaced; the second sequence holds a multimap that maps keys(symbols) to values(expressions):
[4 5 add 6 sub] [add=+;sub=- +] ^ == [4 5 + 6 - +]
Replacing symbols is optimally efficient: s (nested) symbols in a sequence of size n are replaced in O(s*log(n)) time.
An Enchilada string is syntactically enclosed by double quotes:
"Hello world!"
What makes strings different from other atoms is that consecutive strings are syntactically concatenated into one:
"Hel" "lo" " w" "o" "r" "ld!" == "Hello world!"
Internally, a string of length n is an expression with n consecutive character elements.
Strings support unicode and allow special characters such as tabs and carriage returns.
"test: " " tab tab" " and some utf-8: Γ Δ Ε Ζ Η Θ"
During runtime, symbols can be dynamically manipulated via sequences.
What makes symbols so powerful is that sequences with symbols can be re-used during various compilation stages.
For example, 'holes' in a template can be replaced with concrete content via symbols. Here is an example of a XML template that shows the replacing of two holes:
["<div><" t ">" c "</" t ">" "<div>"] [c="Jim";t="p"] ^ == ["<div><p>Jim</p><div>"]
Another example shows the 'magic' manipulation of two sequences that hold symbols:
[p] {u==[a;b] [c;d;e] * [c=u] ^} == [a p;a d;a e;b p;b d;b e]
Every Enchilada operator defines a total function. As such there are no runtime exceptions.
The REPL provides a global memory slot where temporary results can be stored.
Depending on the storage back-end, the global slot can be persisted in memory, on a filesystem or in a Distributed Hash Table (DHT).
The global slot can be set by prefixing the top-level expression with an equals symbol:
=1 2 3
Subsequently, the global slot can be referenced via the global reference, which is syntactically denoted by ():
() + + == 1 2 3 + + == 1 5 + == 6
The global slot can be persisted on back-end storage by using the save(.s) command.
.s
After the global slot has been persisted, the runtime returns a unique reference to what is held in storage. This reference can be re-used in other expressions, or by other Enchilada nodes that share the same storage back-end.
(P1asW6BED++fOl4o86DsE5DVYdU=) == 1 2 3
A reference is replaced with its associated expression during parsing: it is a parsing exception if a reference cannot be resolved.
(AAAAAAAAAAAAAAAAAAAAAAAAAAA=)