Language Specification
This page defines the Astreum language syntax, expression model, evaluation semantics, operator set, actor model, metering, and module system as implemented in the machine package and CLI module loader.
1. Lexical elements
The tokenizer produces a flat list of string tokens from source text.
- Whitespace: Spaces, tabs, and newlines delimit tokens and are otherwise ignored.
- Parentheses: ( and ) are standalone tokens that delimit list expressions.
- Quote token: ' is emitted as a standalone token.
- Integer literals: Decimal integers such as 1, 255, and -128 parse to Expr.Int.
- Float literals: Values such as 3.14 and -2.5 parse to Expr.Float.
- String literals: Double-quoted text such as "hello world" parses to Expr.String.
- Hex bytes: 0x1f and 0Xab parse to Expr.Bytes.
- Symbols: Any other contiguous non-whitespace, non-parenthesis string. Examples: sum, def, math.sum.
- Line comments: ; starts a comment that runs to end-of-line.
- Block comments: #; skips the next complete expression, including nested lists.
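The rules above can be sketched as a small hand-written scanner. This is an illustrative sketch, not the machine package's tokenizer: it assumes string tokens keep their surrounding quotes, it does not handle escape sequences inside strings, and it leaves #; as an ordinary token for a later stage to interpret.

```python
from typing import List


def tokenize(source: str) -> List[str]:
    """Split source text into a flat list of string tokens (illustrative only)."""
    tokens: List[str] = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1  # whitespace only separates tokens
        elif ch == ";":
            while i < n and source[i] != "\n":
                i += 1  # line comment: skip to end-of-line
        elif ch in "()'":
            tokens.append(ch)  # parentheses and quote are standalone tokens
            i += 1
        elif ch == '"':
            j = i + 1
            while j < n and source[j] != '"':
                j += 1
            if j >= n:
                raise ValueError("unterminated string literal")
            tokens.append(source[i:j + 1])  # ASSUMPTION: quotes stay in the token
            i = j + 1
        else:
            j = i
            while j < n and not source[j].isspace() and source[j] not in "()":
                j += 1
            tokens.append(source[i:j])  # integer, float, hex, or symbol text
            i = j
    return tokens
```

For example, tokenize('(1 2 +) ; add') yields the five tokens (, 1, 2, +, and ).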
2. Expression data model
All runtime values are instances of Expr. The implementation exposes six variants:
Expr.Link
- A pair of optional Expr references: head and tail.
- Can also carry unresolved hash pointers (head_hash / tail_hash) for lazy DAG traversal.
- Link(None, None) is the NIL sentinel.
- Hash domain tag: \x00.
- Serialization tag: 0x00.
Expr.Symbol
- Wraps a UTF-8 string name such as "sum", "def", or "my.var".
- Hash domain tag: \x01.
- Serialization tag: 0x01.
Expr.Bytes
- Wraps raw byte data such as b"\x01" or b"\x00\x80".
- Used for byte literals and hex literals.
- Hash domain tag: \x02.
- Serialization tag: 0x02.
Expr.Int
- Arbitrary-precision signed integer.
- Hash domain tag: \x03.
- Serialization tag: 0x03.
Expr.Float
- IEEE 754 double-precision float.
- Hash domain tag: \x04.
- Serialization tag: 0x04.
Expr.String
- UTF-8 string content.
- Hash domain tag: \x05.
- Serialization tag: 0x05.
Content-addressed hashing
- Every Expr has a cached 32-byte Blake3 hash computed lazily.
- Expr.to_bytes(expr) uses a tag byte followed by variant-specific payload bytes.
- Expr.from_bytes(data) deserializes back to an Expr tree and raises ValueError on invalid data.
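As a rough illustration of the lazy, domain-tagged hash and the tag-prefixed byte format, here is a sketch for a single variant. Everything beyond the tag values is an assumption: the Symbol class below is a hypothetical stand-in, the payload is assumed to be the UTF-8 name, and BLAKE2b from the standard library stands in for Blake3, so the digests will not match real Astreum hashes.

```python
import hashlib
from typing import Optional


class Symbol:
    """Hypothetical stand-in for Expr.Symbol, not the real class."""

    DOMAIN_TAG = b"\x01"  # hash domain tag listed above for Expr.Symbol

    def __init__(self, name: str) -> None:
        self.name = name
        self._hash: Optional[bytes] = None  # computed on first access

    def to_bytes(self) -> bytes:
        # Serialization tag byte followed by a variant-specific payload.
        # ASSUMPTION: the payload is simply the UTF-8 encoded name.
        return b"\x01" + self.name.encode("utf-8")

    @property
    def hash(self) -> bytes:
        if self._hash is None:
            # ASSUMPTION: BLAKE2b replaces Blake3 here (Blake3 is not in the
            # standard library), and the digest input is tag + UTF-8 name.
            h = hashlib.blake2b(digest_size=32)
            h.update(self.DOMAIN_TAG + self.name.encode("utf-8"))
            self._hash = h.digest()
        return self._hash


def symbol_from_bytes(data: bytes) -> Symbol:
    """Inverse of Symbol.to_bytes; raises ValueError on invalid data."""
    if not data or data[0] != 0x01:
        raise ValueError("invalid data: expected Symbol tag 0x01")
    try:
        return Symbol(data[1:].decode("utf-8"))
    except UnicodeDecodeError as exc:
        raise ValueError("invalid data: payload is not UTF-8") from exc
```

Prefixing the digest input with a per-variant tag keeps, for instance, a Symbol and a String with the same text from sharing a hash.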
3. Parsing
- tokenize(source: str) -> List[str] produces tokens.
- parse(tokens: List[str]) -> (Expr, List[str]) consumes one expression and returns it with the remaining tokens.
- ( opens a list. Items are parsed recursively until ). Empty list () produces Link(None, None).
- Decimal integers parse to Expr.Int.
- Float tokens parse to Expr.Float.
- Double-quoted strings parse to Expr.String.
- 0x or 0X prefixed hex tokens parse to Expr.Bytes.
- All other tokens become Expr.Symbol.
- ParseError is raised on unexpected end-of-input or unmatched ).
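A recursive-descent parser following these rules might look like the sketch below. It is not the real parser: expressions are modelled as tagged tuples such as ("Int", 1) rather than Expr instances, an empty ("List", []) stands in for Link(None, None), and the exact literal grammar (the regular expressions, quotes kept in string tokens, only even-length hex accepted) is an assumption.

```python
import re
from typing import Any, List, Tuple

INT_RE = re.compile(r"-?\d+")
FLOAT_RE = re.compile(r"-?\d+\.\d+")
HEX_RE = re.compile(r"0[xX]((?:[0-9a-fA-F]{2})+)")  # ASSUMPTION: whole bytes only


class ParseError(Exception):
    """Raised on unexpected end-of-input or an unmatched ')'."""


def parse_atom(tok: str) -> Tuple[str, Any]:
    if INT_RE.fullmatch(tok):
        return ("Int", int(tok))
    if FLOAT_RE.fullmatch(tok):
        return ("Float", float(tok))
    match = HEX_RE.fullmatch(tok)
    if match:
        return ("Bytes", bytes.fromhex(match.group(1)))
    if len(tok) >= 2 and tok[0] == tok[-1] == '"':
        return ("String", tok[1:-1])
    return ("Symbol", tok)  # everything else is a symbol


def parse(tokens: List[str]) -> Tuple[Tuple[str, Any], List[str]]:
    """Consume one expression; return it together with the remaining tokens."""
    if not tokens:
        raise ParseError("unexpected end-of-input")
    tok, rest = tokens[0], tokens[1:]
    if tok == ")":
        raise ParseError("unmatched ')'")
    if tok == "(":
        items: List[Tuple[str, Any]] = []
        while True:
            if not rest:
                raise ParseError("unexpected end-of-input")
            if rest[0] == ")":
                return ("List", items), rest[1:]
            item, rest = parse(rest)  # recurse for nested lists and atoms
            items.append(item)
    return parse_atom(tok), rest
```

Returning the unconsumed tokens lets a caller parse a file of several top-level expressions by calling parse repeatedly until the list is empty.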
4. Evaluation model
evaluation(machine, expr, stack, env) -> List[Expr] is the core recursive evaluator.
Symbol dispatch
- Operator: If the symbol is in OPERATOR_LIST, the handler is called. The operator pops arguments from the stack and pushes results.
- Variable: Otherwise, the symbol is looked up via env.get(value). If bound, the value is pushed. If unbound, NIL is pushed.
- Meter charges: bound lookups cost symbol_size + value_size; unbound lookups cost symbol_size + 1.
Atom evaluation
- Bytes, Int, Float, and String values push themselves onto the stack.
- Charges are size-based and depend on the concrete value.
Link evaluation
- Quote: If the list head is quote or ', the tail is pushed unevaluated. (quote) with no tail pushes NIL.
- Normal: Evaluate head, then evaluate tail recursively. This is how postfix dispatch works.
Result
- Machine.run(expr, env) calls evaluation and returns the top of stack, or NIL if the stack is empty.
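The head-then-tail walk is easiest to see in a toy evaluator. The sketch below is heavily simplified and is not the real evaluation function: it drops the machine and meter arguments, models links as Python (head, tail) tuples with None for NIL, symbols as plain str, the environment as a dict, and registers only a single + operator. The lst helper is hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

NIL = None  # stand-in for Link(None, None)


def lst(*items: Any) -> Any:
    """Hypothetical helper: build a right-nested (head, tail) chain."""
    node = NIL
    for item in reversed(items):
        node = (item, node)
    return node


def op_add(stack: List[Any]) -> None:
    a = stack.pop()
    b = stack.pop()
    stack.append(b + a)


OPERATORS: Dict[str, Callable[[List[Any]], None]] = {"+": op_add}


def evaluation(expr: Any, stack: List[Any], env: Dict[str, Any]) -> List[Any]:
    if expr is NIL:
        return stack  # ASSUMPTION: the end of a chain contributes nothing
    if isinstance(expr, str):  # symbol dispatch
        if expr in OPERATORS:
            OPERATORS[expr](stack)  # operator pops arguments, pushes result
        else:
            stack.append(env.get(expr, NIL))  # unbound symbols push NIL
        return stack
    if isinstance(expr, tuple):  # link evaluation
        head, tail = expr
        if head in ("quote", "'"):
            stack.append(tail)  # tail is pushed unevaluated
            return stack
        evaluation(head, stack, env)  # head first...
        evaluation(tail, stack, env)  # ...then the rest of the chain
        return stack
    stack.append(expr)  # atoms push themselves
    return stack


def run(expr: Any, env: Optional[Dict[str, Any]] = None) -> Any:
    stack = evaluation(expr, [], env or {})
    return stack[-1] if stack else NIL
```

With this model, run(lst(1, 2, "+")) pushes 1, pushes 2, and then dispatches +, which is why operators appear last in a list.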
5. Operators
Stack notation below uses (before -> after).
5.1 Arithmetic
- +: (b a -> sum). Int + Int returns Int. Float + Float returns Float. Mixed Int/Float promotes to Float.
- -: (b a -> diff). Same type rules as +.
- *: (b a -> product). Same type rules as +.
- /: (b a -> quotient). Int / Int uses integer division. Float / Float uses float division. Mixed Int/Float promotes to Float.
- %: (b a -> remainder). Int only.
- sqrt: (a -> sqrt(a)). Float only.
Example: (1 2 +) -> 3. (1.5 2.5 +) -> 4.0.
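The promotion rules happen to match Python's own numeric behaviour, which makes them easy to sketch. The functions below are illustrative only. Two points are assumptions that this page does not settle: that b is the left operand (so (7 2 /) divides 7 by 2), and that integer division rounds toward negative infinity as Python's // does.

```python
from typing import Union

Number = Union[int, float]


def add(b: Number, a: Number) -> Number:
    # int + int stays int; any float operand makes the result a float.
    return b + a


def divide(b: Number, a: Number) -> Number:
    if isinstance(b, int) and isinstance(a, int):
        # ASSUMPTION: floor division; the rounding rule for negative
        # operands is not specified on this page.
        return b // a
    return float(b) / float(a)  # Float / Float and mixed cases
```

Under these assumptions divide(7, 2) gives the Int 3, while divide(7, 2.0) gives the Float 3.5.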
5.2 Bitwise
- &: (b a -> a & b). Bytes only.
- |: (b a -> a | b). Bytes only.
- ^: (b a -> a ^ b). Bytes only.
- ~: (a -> ~a). Bytes only.
5.3 Shifts and rotates
- <<: logical left shift on Bytes.
- >>>: logical right shift on Bytes.
- >>: arithmetic right shift on Bytes.
- rol: rotate left on Bytes.
- ror: rotate right on Bytes.
5.4 Stack operations
- dip: temporarily removes one value, evaluates the next expression, then restores the saved value.
- drop: discard the top stack value.
- dup: duplicate the top stack value.
- swap: swap the top two stack values.
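These four operators can be pictured on a plain Python list whose last element is the top of the stack. This is only a sketch: the run_next callback is a hypothetical stand-in for "evaluate the next expression", and behaviour on an underfull stack is not covered here.

```python
from typing import Any, Callable, List

Stack = List[Any]


def dup(stack: Stack) -> None:
    stack.append(stack[-1])  # duplicate the top value


def drop(stack: Stack) -> None:
    stack.pop()  # discard the top value


def swap(stack: Stack) -> None:
    stack[-1], stack[-2] = stack[-2], stack[-1]  # exchange the top two values


def dip(stack: Stack, run_next: Callable[[Stack], None]) -> None:
    saved = stack.pop()  # set the top value aside
    run_next(stack)      # run the next step on what remains
    stack.append(saved)  # put the saved value back on top
```

Starting from [2, 1], dip with dup as the next step leaves [2, 2, 1]: the 1 is set aside, 2 is duplicated, and 1 is restored.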
5.5 Expression construction
- link: (head tail -> Link(head, tail)).
- head: (link -> head). Pushes NIL if the head is missing.
- tail: (link -> tail). Pushes NIL if the tail is missing.
- is_atom: (expr -> 1|0). Returns 1 for non-Link values, including Bytes, Int, Float, String, and Symbol.
- is_eq: (b a -> 1|0). Structural equality.
- eval: (expr -> evaluated). Re-enters the evaluator on the value.
- quote: (a -> (' a)). Stack operator that wraps a value in a quotation.
- symbol: (a -> symbol|NIL). Converts Bytes, String, Int, or Float to Expr.Symbol.
- str: (a -> string|NIL). Converts any atom to Expr.String.
- float: (a -> float|NIL). Converts Int, Bytes (exactly 8 bytes), String, or Symbol to Expr.Float.
- int: (a -> int|NIL). Converts Bytes, String, Symbol, or Float to Expr.Int.
- bytes: (a -> bytes|NIL). Converts Int, Float, String, or Symbol to Expr.Bytes.
- ref: (hash -> expr|NIL). Resolves a 32-byte hash to a stored expression.
- load: (hash -> full_expr|NIL). Deep-resolves a 32-byte hash recursively.
5.6 Control flow
- fn: pops a body and parameter list, then binds arguments in a child environment with lexical parentage.
- lambda: same as fn but with no parent environment and no def_target.
- if: (cond then else -> result). The condition is evaluated first. Truthiness is non-zero Bytes, non-zero Int, non-zero Float, or non-NIL Link.
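The truthiness rule used by if can be written as a single predicate. This sketch uses Python values in place of Expr variants (bytes, int, float, a (head, tail) tuple for a link, None for NIL). It assumes that "non-zero Bytes" means at least one non-zero byte, and it treats every value the rule does not mention, such as strings and symbols, as false, which this page does not confirm.

```python
from typing import Any

NIL = None  # stand-in for Link(None, None)


def is_truthy(value: Any) -> bool:
    if isinstance(value, bytes):
        return any(value)  # ASSUMPTION: true if any byte is non-zero
    if isinstance(value, (int, float)):
        return value != 0  # non-zero Int or Float
    if isinstance(value, tuple):
        return True        # a non-NIL link
    return False           # NIL; ASSUMPTION: also anything not listed


def choose(cond: Any, then: Any, otherwise: Any) -> Any:
    """Pick a branch the way (cond then else -> result) suggests."""
    return then if is_truthy(cond) else otherwise
```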
5.7 Definition
- def: (name value -> ). Stores value under name in env.def_target, or in env if def_target is None. Write-once per target environment.
Example: (10 x def) binds x to Int(10).
5.8 Actor model
- spawn: (body name -> name|NIL). Spawns a new actor thread running body in a child environment. name must be a Symbol.
- send: (target msg -> ). Sends msg to the mailbox of actor target.
- receive: (target -> msg|NIL). Blocks until a message arrives in the mailbox of actor target.
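The mailbox pattern behind these three operators can be approximated with standard-library threads and queues. This is a sketch of the general idea only, not the machine package's actor code: actor bodies are plain Python callables instead of expressions, mailboxes are created on first use, and child environments and NIL results are left out.

```python
import queue
import threading
from typing import Any, Callable, Dict

_mailboxes: Dict[str, "queue.Queue[Any]"] = {}
_lock = threading.Lock()


def _mailbox(name: str) -> "queue.Queue[Any]":
    # ASSUMPTION: a mailbox is created the first time its name is used.
    with _lock:
        return _mailboxes.setdefault(name, queue.Queue())


def spawn(body: Callable[[], None], name: str) -> str:
    _mailbox(name)  # make sure the actor can receive immediately
    threading.Thread(target=body, daemon=True).start()
    return name


def send(target: str, msg: Any) -> None:
    _mailbox(target).put(msg)


def receive(target: str) -> Any:
    return _mailbox(target).get()  # blocks until a message arrives
```

A worker spawned under the name "worker" can call receive("worker") to wait for input and send("main", result) to reply, while the caller blocks on receive("main").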
6. Environment and scoping
- Env(data, parent, def_target) stores local bindings, an optional lexical parent, and the environment that receives def writes.
- get(key) checks local data first, then walks parent environments.
- put(key, value) binds in the local environment.
- def writes to env.def_target.
- fn sets def_target=global_env, so def inside a function writes globally while lookups still resolve lexically.
- lambda creates an environment with no parent and no def target.
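A compact model of these scoping rules is sketched below. The class mirrors the Env(data, parent, def_target) shape, but it is an illustration rather than the real implementation. The define helper is hypothetical, and the way a repeated definition is handled (ignored, with False returned) is an assumption, since this page only says that def is write-once.

```python
from typing import Any, Dict, Optional


class Env:
    def __init__(
        self,
        data: Optional[Dict[str, Any]] = None,
        parent: Optional["Env"] = None,
        def_target: Optional["Env"] = None,
    ) -> None:
        self.data: Dict[str, Any] = dict(data or {})
        self.parent = parent
        self.def_target = def_target

    def get(self, key: str) -> Any:
        env: Optional[Env] = self
        while env is not None:  # local data first, then each parent
            if key in env.data:
                return env.data[key]
            env = env.parent
        return None  # unbound

    def put(self, key: str, value: Any) -> None:
        self.data[key] = value  # always binds locally


def define(env: Env, name: str, value: Any) -> bool:
    """Hypothetical helper modelling def: write to def_target, else to env."""
    target = env.def_target if env.def_target is not None else env
    if name in target.data:
        return False  # ASSUMPTION: a second write is silently ignored
    target.put(name, value)
    return True
```

An environment built as Env(parent=global_env, def_target=global_env) behaves like the one fn creates: define lands in global_env, and get still finds the binding by walking the parent chain.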
7. Machine and metering
Machine class
- Machine(node, mode="dynamic", meter_enabled=True, meter_limit=None) orchestrates evaluation.
- In "dynamic" mode, all operators execute normally.
- In "deterministic" mode, spawn, send, receive, and eval push NIL instead of executing.
- run(expr, env=None) evaluates an expression and returns the top of the stack or NIL.
Meter (gas)
- Meter(enabled, limit) tracks byte-level computation cost.
- charge_bytes(n) is a no-op when disabled. If the limit would be exceeded, it raises MeterExceededError.
- Operator charges are size-based; arithmetic does not use width-squared charging in the current implementation.
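The meter's contract is small enough to sketch in a few lines. The version below follows the behaviour listed above, but the used attribute, the treatment of limit=None as unlimited, and the decision to leave the counter unchanged when a charge is rejected are assumptions made for illustration.

```python
from typing import Optional


class MeterExceededError(Exception):
    """Raised when a charge would push usage past the limit."""


class Meter:
    def __init__(self, enabled: bool = True, limit: Optional[int] = None) -> None:
        self.enabled = enabled
        self.limit = limit
        self.used = 0  # hypothetical running total of charged bytes

    def charge_bytes(self, n: int) -> None:
        if not self.enabled:
            return  # no-op when metering is disabled
        if self.limit is not None and self.used + n > self.limit:
            # ASSUMPTION: the rejected charge is not added to the total.
            raise MeterExceededError(
                f"charge of {n} bytes exceeds limit {self.limit}"
            )
        self.used += n
```

A bound symbol lookup would then be charged as meter.charge_bytes(symbol_size + value_size), in line with the costs given in section 4.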
8. Module system
The module system is implemented by the CLI tool, not the machine evaluator. It processes .aex files into environments at load time.
Module file structure
- A module file is a sequence of top-level S-expressions, each parsed independently.
- Each expression must be a 3-element form: (value name_or_prefix terminator).
- The terminator must be def or import.
Definitions
- (value name def) stores value under name with no runtime evaluation at load time.
- Names are UTF-8 symbols. Example: (1 version def).
Path imports
- (prefix "path/to/module.aex" import) or (prefix path/to/module.aex import) loads another module file.
- Paths may be absolute or relative to the importing module's directory.
- All definitions in the loaded module are prefixed with the given prefix followed by a dot.
Reference imports
- (prefix (0x... ref) import) loads a module expression stored in atom storage by content hash.
Symbol rewriting
- When a module is loaded under a prefix, all symbol references within its definitions are rewritten to fully qualified names. Example: sum -> math.sum.
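The rewriting step can be sketched as a recursive walk over each definition. This is a guess at the mechanism, not the CLI loader's code: definitions are modelled as nested Python lists, Symbol is a hypothetical stand-in class, and the sketch assumes that only names defined in the same module are qualified, so operators such as + or dup are left untouched.

```python
from dataclasses import dataclass
from typing import Any, Dict, Set


@dataclass(frozen=True)
class Symbol:
    """Hypothetical stand-in for Expr.Symbol."""
    name: str


def qualify(expr: Any, prefix: str, local_names: Set[str]) -> Any:
    if isinstance(expr, Symbol) and expr.name in local_names:
        return Symbol(f"{prefix}.{expr.name}")  # sum -> math.sum
    if isinstance(expr, list):
        return [qualify(item, prefix, local_names) for item in expr]
    return expr  # other atoms and non-local symbols are unchanged


def load_under_prefix(defs: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    # ASSUMPTION: only names defined in this module are rewritten.
    names = set(defs)
    return {
        f"{prefix}.{name}": qualify(value, prefix, names)
        for name, value in defs.items()
    }
```

Loading a module that defines sum and a double that refers to sum under the prefix math would produce math.sum and a math.double whose body refers to math.sum.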
Circular import protection
- The loader maintains an active_stack set. If a module is encountered while already on the stack, ValueError is raised.
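The guard can be illustrated with a recursive loader that tracks the modules currently being loaded. This is a sketch under stated assumptions: imports_of is a hypothetical callback that returns the paths a module imports, the return value is simply the visit order, and file reading, prefixes, and caching are omitted.

```python
from typing import Callable, List, Optional, Set


def load_module(
    path: str,
    imports_of: Callable[[str], List[str]],
    active_stack: Optional[Set[str]] = None,
) -> List[str]:
    """Return the order in which modules are visited, starting at path."""
    active_stack = set() if active_stack is None else active_stack
    if path in active_stack:
        raise ValueError(f"circular import: {path}")
    active_stack.add(path)  # mark as in progress
    try:
        visited = [path]
        for dep in imports_of(path):
            visited += load_module(dep, imports_of, active_stack)
        return visited
    finally:
        active_stack.discard(path)  # finished: no longer on the stack
```

Because a path is removed once its load completes, a module imported twice through different branches is not reported as a cycle. Only a module that imports itself, directly or transitively, triggers the ValueError.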
