Ironwall

Ironwall Language Specification

Ironwall Lexical Specification

This document defines Ironwall's lexical boundary. The goal is to keep atomic shapes, syntax sugar, and name-closure rules closed and explicit, so that ambiguity is not deferred into later syntax and semantic stages.

1. Design Principles

  • Lexical rules must be closed, predictable, and easy to diagnose statically
  • The lexical stage accepts only a finite and explicit set of atomic shapes; it does not perform loose parsing that "guesses meaning from context"
  • Composite names related to the module system are closed at the lexical stage itself; ~ and @ are not left for later character-by-character recombination
  • Chain forms such as a.b.c are only surface syntax sugar, not an independent operator category

2. Allowed Characters

  • The set of non-whitespace characters allowed by the lexer is: ASCII letters, decimal digits, _, ., $, ^, ~, @, and the four bracket kinds
  • Whitespace serves only as a separator and carries no semantic meaning
  • Any character outside this set must be rejected directly at the lexical stage

3. Bracket Kinds

Ironwall distinguishes four kinds of brackets, and the lexer must preserve the bracket kind:

  • Parentheses ( )
  • Square brackets [ ]
  • Braces { }
  • Angle brackets < >

The four bracket kinds are not interchangeable containers. Each bracket kind corresponds to a different syntax domain.

4. Identifier Categories

4.1 ordinary identifiers

  • Regex: [a-zA-Z_][a-zA-Z0-9_]*
  • Examples: x, foo, my_var, _tmp

4.2 package path

  • Regex: seg (~ seg)+
  • Here seg must be an ordinary identifier
  • Examples: a~b, std~time, test~fixtures~parser_structures

4.3 package-qualified-name

  • Shape: <package-path>@<name>
  • The left side of @ must be a complete package path
  • The right side of @ must be a single ordinary identifier
  • Examples: app~cli@main, std~time@timestamp
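
The three name categories in 4.1 to 4.3 can be sketched as regular expressions. This is a minimal Python sketch, not the reference lexer. The names IDENT, PACKAGE_PATH, QUALIFIED_NAME, and classify_name are hypothetical. It reads the package-path rule literally (at least one ~), so a single-segment form such as app@main is rejected here; whether that is intended is an open assumption.

```python
import re

# Assumption: literal transcription of sections 4.1-4.3.
IDENT = r"[a-zA-Z_][a-zA-Z0-9_]*"
PACKAGE_PATH = rf"{IDENT}(?:~{IDENT})+"      # seg (~ seg)+ with no whitespace
QUALIFIED_NAME = rf"{PACKAGE_PATH}@{IDENT}"  # <package-path>@<name>


def classify_name(atom: str) -> str:
    """Return the name category of an atom, or 'rejected' if none matches."""
    if re.fullmatch(IDENT, atom):
        return "identifier"
    if re.fullmatch(PACKAGE_PATH, atom):
        return "package-path"
    if re.fullmatch(QUALIFIED_NAME, atom):
        return "package-qualified-name"
    return "rejected"
```

Because each pattern must match the whole atom, shapes such as a~~b, a~@main, and a~b@c@d fall through to "rejected" without any character-by-character recombination.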

4.4 typed atom

Ironwall accepts only postfix type spelling for typed atoms: $payload^type.

  • payload comes first and type comes after
  • type must be an ordinary identifier
  • If payload has identifier shape, it denotes a typed database reference
  • If payload has numeric shape, it denotes a typed numeric literal
  • Examples: $hello^s3, $line_break^c4, $42^i5, $3p14^f5

4.5 package-qualified typed database reference

The canonical shape of a package-qualified database reference is: <package-path>$<reference-id>^<ty>.

  • The left side must be a complete package path and may not use @
  • <reference-id> must be an ordinary identifier
  • The package-qualified shape is used only for database references, not for numeric literals
  • Therefore a~b~d$name^s3 is legal, while a~b~d$3p14^f5 must be rejected directly at the lexical stage
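
A minimal Python sketch of the package-qualified shape, assuming the identifier regex from 4.1. The names PKG_DB_REF and is_package_db_reference are hypothetical. Because the reference id must have identifier shape, a numeric payload after $ is rejected by the pattern itself.

```python
import re

IDENT = r"[a-zA-Z_][a-zA-Z0-9_]*"
# <package-path>$<reference-id>^<ty>, with no @ allowed on the left side.
PKG_DB_REF = re.compile(rf"{IDENT}(?:~{IDENT})+\${IDENT}\^{IDENT}")


def is_package_db_reference(atom: str) -> bool:
    """True if the atom has the package-qualified database-reference shape."""
    return PKG_DB_REF.fullmatch(atom) is not None
```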

5. Closure Rules for $payload^type

5.1 typed database reference

When payload has ordinary-identifier shape, and the whole atom does not form a legal typed numeric literal, the atom is treated as a typed database reference.

  • Example: $hello_world^s3
  • Example: $answer_main^i5
  • Example: a~b~d$banner_title^s3

5.2 typed numeric literal

The numeric type names are:

  • Signed integers: i5, i6, i7
  • Unsigned integers: u5, u6, u7
  • Floating point: f5, f6, f7
  • Complex numbers: z5, z6, z7

The digit payload rules are as follows.

5.2.1 integer payload

Legal integer payload shapes:

  • 0
  • Decimal positive integers, such as 42
  • Hexadecimal integers, such as 0x2A
  • Negative-integer encoding, such as 0neg332

Constraints:

  • Decimal positive integers may not have meaningless leading zeros; except for 0, they must start with 1-9
  • 0x must be followed by at least one hexadecimal digit
  • Negative hexadecimal spellings such as 0neg0x2A are not supported; if a negative number needs to be represented, decimal negative payload must be used
  • The role of hexadecimal payload is "a literal shape aligned with a bit-level representation", not merely an alternative decimal spelling sugar
  • That is, the intent of 0x2A is to express an integer in a bit-pattern-oriented way, not to encourage treating hexadecimal and decimal as fully equivalent surface notations that can be swapped freely
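
The integer payload shapes and constraints above can be sketched as one pattern. This is a hedged Python sketch; INT_PAYLOAD and is_int_payload are hypothetical names. Two points are assumptions not settled by the text: lowercase hexadecimal digits are accepted alongside uppercase ones, and the digits after 0neg must start with 1-9.

```python
import re

INT_PAYLOAD = re.compile(
    r"0"                  # the single digit zero
    r"|[1-9][0-9]*"       # decimal positive integer, no leading zeros
    r"|0x[0-9a-fA-F]+"    # hexadecimal, at least one digit (case is an assumption)
    r"|0neg[1-9][0-9]*"   # negative decimal integer (nonzero start is an assumption)
)


def is_int_payload(payload: str) -> bool:
    """True if the payload is a legal integer payload under this reading."""
    return INT_PAYLOAD.fullmatch(payload) is not None
```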

5.2.2 floating-point payload

Legal floating-point payload shapes:

  • Use p in place of the decimal point, for example 3p14
  • Support finite negative floats, for example 0neg3p14
  • Scientific notation uses ep / en, for example 3p14ep23, 3p14en20
  • Support finite negative scientific notation, for example 0neg3p14en20
  • Shapes with only an exponent and no fractional part, for example 5ep10
  • Special values: inf, 0neginf, nan

Constraints:

  • The fractional part after p may not be empty; 3p is illegal and must be written as 3p0
  • The exponent digits after ep / en must form a non-negative decimal integer; the sign of the exponent is carried by ep (positive) or en (negative)
  • Finite negative floats use the 0neg prefix
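
The floating-point shapes above can be sketched in the same style. This is a hedged Python sketch with hypothetical names FLOAT_PAYLOAD and is_float_payload. It assumes that the integer part follows the no-leading-zeros rule of integer payloads and that a finite payload needs a fractional part, an exponent, or both; the text does not say whether a bare integer such as 3 is a legal float payload.

```python
import re

_UINT = r"(?:0|[1-9][0-9]*)"
FLOAT_PAYLOAD = re.compile(
    # optional 0neg, integer part, then "p<frac>" with optional exponent,
    # or an exponent alone (for example 5ep10)
    rf"(?:0neg)?{_UINT}(?:p[0-9]+(?:e[pn][0-9]+)?|e[pn][0-9]+)"
    r"|inf|0neginf|nan"   # special values
)


def is_float_payload(payload: str) -> bool:
    """True if the payload is a legal floating-point payload under this reading."""
    return FLOAT_PAYLOAD.fullmatch(payload) is not None
```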

5.2.3 complex payload

At the spec layer, z5, z6, and z7 complex literals are explicitly supported.

Their strict shape is:

0real<RealPart>img<ImagPart>

Where:

  • The payload must begin with 0real
  • img must appear exactly once
  • RealPart may not be omitted
  • ImagPart may not be omitted
  • Both RealPart and ImagPart must be legal real-number payloads
  • Legal real-number payloads include: integers, negative integers, floating point, negative floating point, scientific notation, negative scientific notation, inf, 0neginf, and nan
  • Traditional complex spellings that mix +, -, ., e, or i into the payload are not allowed

Examples:

  • $0real0neg42p32img0neg3p22^z5
  • $0real3p14img2p0^z6
  • $0realinfimg0neginf^z7

Illegal examples:

  • $0realimg1^z5
  • $0real3p14^z5
  • $3p14img2p0^z5
  • $0real3p14img2p0img1^z5

The semantics of a complex payload are those of a primitive complex literal, not a plain-text shorthand for calling z*_rect.
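
The strict shape above can be checked by splitting on img. This is a minimal Python sketch under stated assumptions: REAL and is_complex_payload are hypothetical names, hexadecimal integers are not treated as real-number payloads, and the real-part pattern is deliberately permissive about which decimal spellings it accepts.

```python
import re

_UINT = r"(?:0|[1-9][0-9]*)"
# Real-number payload: optional 0neg, integer part, optional fraction,
# optional exponent, or one of the special values.
REAL = re.compile(
    rf"(?:0neg)?{_UINT}(?:p[0-9]+)?(?:e[pn][0-9]+)?|inf|0neginf|nan"
)


def is_complex_payload(payload: str) -> bool:
    """True if the payload has the shape 0real<RealPart>img<ImagPart>."""
    if not payload.startswith("0real"):
        return False
    parts = payload[len("0real"):].split("img")
    if len(parts) != 2:  # img must appear exactly once
        return False
    # An empty part never matches REAL, so omitted parts are rejected.
    return all(REAL.fullmatch(part) is not None for part in parts)
```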

5.2.4 deciding between typed database reference and typed numeric literal

In the general case, the two are not ambiguous:

  • Database-reference payloads start with identifier-like shapes
  • Numeric-literal payloads mainly start with digits or keyword-like constant shapes

Therefore, in most cases, the two paths are naturally separated by lexical form.

The one exception that must be preserved explicitly concerns the floating-point keyword constants:

  • inf
  • nan

Although these payloads start with letters, under the f5 / f6 / f7 types they must be classified as numeric literals first, not as database references.

That is:

  • $inf^f5 is a floating-point literal
  • $nan^f5 is a floating-point literal
  • $inf^s3 is still a database reference
  • $answer^i5 is still a database reference
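
The decision order above can be sketched as follows. This is a hedged Python sketch; classify_typed_atom and its result labels are hypothetical, and the numeric payload itself is not validated here, so the last branch only reports a candidate.

```python
import re

IDENT = r"[a-zA-Z_][a-zA-Z0-9_]*"
TYPED_ATOM = re.compile(rf"\$([a-zA-Z0-9_]+)\^({IDENT})")
FLOAT_TYPES = {"f5", "f6", "f7"}
FLOAT_KEYWORDS = {"inf", "nan"}


def classify_typed_atom(atom: str) -> str:
    """Classify a $payload^type atom without validating numeric payloads."""
    match = TYPED_ATOM.fullmatch(atom)
    if match is None:
        return "rejected"
    payload, ty = match.groups()
    # The keyword constants win over the identifier reading, but only
    # under the floating-point types.
    if payload in FLOAT_KEYWORDS and ty in FLOAT_TYPES:
        return "numeric-literal"
    if re.fullmatch(IDENT, payload):
        return "database-reference"
    return "numeric-literal-candidate"
```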

5.2.5 examples

Legal:

  • $0^i5
  • $42^i5
  • $0neg332^i5
  • $0x2A^u5
  • $3p14^f5
  • $0neg3p14^f5
  • $3p14ep23^f6
  • $3p14en20^f7
  • $0neg3p14en20^f5
  • $inf^f5
  • $0neginf^f5
  • $nan^f5
  • $0real0neg42p32img0neg3p22^z5
  • $0real3p14img2p0^z6
  • $0realinfimg0neginf^z7
  • $hello^s3
  • a~b~d$hello^s3

Illegal:

  • 42
  • 0p0
  • $001^i5
  • $0neg0x2A^i5
  • $0realimg1^z5
  • $3p14img2p0^z5
  • $0real3p14img2p0img1^z5
  • i5$42
  • s3$hello
  • a~b~d$3p14^f5

Not allowed:

  • Bare 42
  • Bare 3p14
  • Inferring a default numeric type from context

6. Expansion of Chained Surface Sugar

At the lexical level, only one chained syntax sugar form is supported, and each segment may be only one of the following two categories:

  • An ordinary identifier
  • A package-qualified-name, such as a~b@c

$payload^ty and pkg$reference^ty do not participate in any chained expansion.

6.1 dot chains

a.b.c is expanded during lexical desugaring into nested cm_get calls.

  • a.b.c -> (cm_get (cm_get a b) c)
  • a.b.c.d -> (cm_get (cm_get (cm_get a b) c) d)
  • a~b@c.d~e@f.h~i@j -> (cm_get (cm_get a~b@c d~e@f) h~i@j)
  • This is lexical sugar for member-read semantics; later semantic processing still follows the ordinary rules of cm_get
  • A dot chain must be lexically one continuous raw chunk, so a . b, a. b, and a .b are all illegal
  • A formatter may rewrite a restorable nested cm_get chain back into a.b.c without changing semantics
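
The expansion above is a left fold over the segments. This is a minimal Python sketch that works on strings rather than on a real token stream; SEGMENT and desugar_dot_chain are hypothetical names.

```python
import re

IDENT = r"[a-zA-Z_][a-zA-Z0-9_]*"
# A segment is an ordinary identifier or a package-qualified-name (a~b@c).
SEGMENT = re.compile(rf"{IDENT}(?:(?:~{IDENT})+@{IDENT})?")


def desugar_dot_chain(chunk: str) -> str:
    """Expand a.b.c into nested cm_get calls, left to right."""
    segments = chunk.split(".")
    if len(segments) < 2 or not all(SEGMENT.fullmatch(s) for s in segments):
        raise ValueError(f"illegal dot chain: {chunk!r}")
    expr = segments[0]
    for segment in segments[1:]:
        expr = f"(cm_get {expr} {segment})"
    return expr
```

Since every segment must match in full, chunks containing spaces, empty segments, or typed atoms raise instead of being guessed at.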

6.2 illegal chain shapes

Illegal examples:

  • a-b-c
  • hello..world
  • foo.-bar
  • $hello.world^s3

7. Comment Ban

  • No comment syntax such as //, #, /* */, or ; is defined
  • When explanatory text is needed, it should be represented through typed database entries or other ordinary language data
  • Comments have no privileged lexical path

8. Examples of Illegal Shapes

The following shapes must be rejected at the lexical stage or at a very early syntax stage:

  • a~~b
  • a~@main
  • a~b@c@d
  • 1abc
  • @main
  • a~b.iw
  • $hello.world^s3
  • Bare numeric 42
  • 0real3p14img2p0
  • i5$42
  • a~b~d$3p14^f5

9. Lexical Boundary

  • Bracket kinds must be preserved lexically
  • Package paths, package-qualified-names, and typed references are each closed into a single atom
  • a.b.c is already expanded before entering later stages, and no chained atom remains

Ironwall Syntax Specification

This document describes the core syntax shapes of Ironwall. It answers only "how to write it" and does not repeat the full semantics of types and modules; those are defined separately by the type, semantics, and module specifications.

1. Root Structure

  • Every module-mode .iw source unit must have exactly one root block: {program <unit-id> ...}
  • program may appear only at the root and may not be nested inside other expressions
  • The canonical shape of unit-id is <package-path>@<unit-name>

Example:

{program app~cli@main
  (function main ([args <array s3>]) to i5 in $0^i5)
}

2. Keywords

The formal specification uses the following keywords:

  • var
  • var_set
  • function

Only the keywords above are accepted.

3. Binding Syntax

Binding positions uniformly use:

[name Type]

Rules:

  • name must be an ordinary identifier
  • Type must be written explicitly
  • Spellings such as [x] and [x _] that omit the type are illegal

4. Blocks and Order

4.1 {...} block

{e1 e2 ... eN}
  • Denotes a sequential-evaluation block
  • Returns the value of the last expression
  • An empty block is illegal

5. Variables and Assignment

5.1 var

(var [x T] expr)
  • Creates and initializes a named binding
  • var is used both for local variables and for top-level globals

5.2 var_set

(var_set x expr)
  • Reassigns an existing binding
  • Assignment to object fields does not go through var_set, but through the cm_set builtin

6. Functions

6.1 anonymous function fn

(fn ([p1 T1] [p2 T2] ...) to Ret in body)
  • fn is a first-class value
  • The parameter list and return type must both be written explicitly

6.2 named function function

(function name ([p1 T1] ...) to Ret in body)
  • function must appear at the top level
  • Named functions with the same name may form an overload set by parameter type

6.3 declare

(declare (function name ([p1 T1] ...) to Ret))
  • Declares the signature of an external function without providing an Ironwall body
  • declare may appear only at the top level

7. let

(let (([x T] e1) ([y U] e2) ...) in body)
  • The binding list is written with double parentheses
  • Every binding must carry an explicit type
  • A let body has exactly one main expression

8. Conditionals and Loops

8.1 if

(if cond then a else b)

8.2 while

(while condition in body)

8.3 cond

(cond
  (c1 e1)
  (c2 e2)
  (else eN)
)
  • The else branch must appear last

9. Type Syntax

9.1 function type

<to Ret from T1 T2 ...>

9.2 union type

<union T1 T2 ...>
  • A union type must contain unique immediate member types
  • Duplicate immediate members are rejected during type validation rather than deduplicated
  • Nested union syntax is allowed and denotes a nested union member, not an expanded member list

9.3 generic head

<generic Name T1 T2 ...>
  • This shape is used only in the header of a generic class or generic function declaration

10. Generic Declaration and Instantiation

10.1 generic function declaration

(function <generic id T> ([x T]) to T in x)

10.2 generic class declaration

(class <generic Box T>
  (property [value T])
  (constructor ([v T]) in (cm_set self value v))
)

10.3 explicit instantiation

<id i5>
(<id i5> $42^i5)
<Box i5>
  • <name T...> denotes explicit application of type arguments to a generic name
  • If it is then wrapped in an outer (...), the instantiated result is being called

11. match

(match value
  ([x T1] body1)
  ([y T2] body2)
  ...
)
  • Every branch begins with a typed bind

12. Classes

12.1 class declaration

(class Point
  (property [x i5])
  (property [y i5])
  (method sum () to i5 in (add (cm_get self x) (cm_get self y)))
  (constructor ([x0 i5] [y0 i5]) in
    {
      (cm_set self x x0)
      (cm_set self y y0)
    }
  )
)

12.2 class member clauses

  • (property [name Type])
  • (method name ([p T] ...) to Ret in body)
  • (constructor ([p T] ...) in body)

13. Calls and Object Operations

13.1 ordinary calls

(callee arg1 arg2 ...)

For the following builtins, the frontend also accepts additional variadic surface sugar:

  • add / sub / mul / and / or may be written with >= 2 arguments; the parser lowers them into a right-associative binary tree
(add a b c d)

Equivalent to:

(add a (add b (add c d)))
  • le / lt / ge / gt / eq may be written with >= 2 arguments; semantically, they form a pairwise comparison chain joined by right-associative and
(le a b c d)

Equivalent to:

(and (le a b) (and (le b c) (le c d)))
  • This is frontend sugar, not extra runtime/builtin overloads; therefore the 0 and 1 argument forms are still illegal
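
Both lowerings above are right folds. This is a minimal Python sketch that models calls as nested tuples, which is an assumption of the sketch and not the frontend's real tree type; lower_variadic and fold_right are hypothetical names.

```python
FOLDABLE = {"add", "sub", "mul", "and", "or"}
CHAINABLE = {"le", "lt", "ge", "gt", "eq"}


def fold_right(name, items):
    """Build (name x1 (name x2 (... xN))) from a non-empty list."""
    result = items[-1]
    for item in reversed(items[:-1]):
        result = (name, item, result)
    return result


def lower_variadic(name, args):
    """Lower the variadic sugar; 0 and 1 argument forms stay illegal."""
    if len(args) < 2:
        raise ValueError(f"{name} needs at least 2 arguments")
    if name in FOLDABLE:
        return fold_right(name, list(args))
    if name in CHAINABLE:
        # Compare each neighbouring pair, then join the results with and.
        pairs = [(name, a, b) for a, b in zip(args, args[1:])]
        return fold_right("and", pairs)
    raise ValueError(f"{name} is not part of the variadic sugar family")
```

With exactly two arguments both branches return the plain binary call, so the sugar only changes the tree for three or more arguments.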

13.2 object construction

(class_new Point $1^i5 $2^i5)

13.3 field reads and writes

(cm_get obj field)
(cm_set obj field expr)
  • Only the object primitive set class_new / cm_get / cm_set is accepted

Lexical sugar:

  • a.b.c is equivalent to (cm_get (cm_get a b) c)
  • a.b.c.d is equivalent to (cm_get (cm_get (cm_get a b) c) d)
  • Every segment must be an ordinary identifier or a package-qualified-name
  • a . b, a. b, and a .b are all illegal, because this sugar must be lexically a single raw chunk with no spaces
  • a-b-c is not member-read syntax

14. Typed Literal / Reference

Typed literals and typed database references accept only the following canonical shape:

$payload^type

Rules:

  • $42^i5, $3p14^f5, and $hello^s3 are all legal atoms
  • When a package-qualified database reference is needed, it must be written as pkg$reference^ty
  • pkg$reference^ty denotes only a cross-package database reference, not a numeric literal
  • Therefore a~b~d$banner^s3 is legal, but a~b~d$3p14^f5 is illegal

If a short-name database reference is not unique within the visible package set, it must be rewritten as a package-qualified database reference.

15. Array Syntax

  • The builtin array type is written as <array T>
  • The related builtin call shapes are:
(array_new <array T> len init)
(array_get xs idx)
(array_set xs idx value)
(array_length xs)

16. import

(import a~b~c)
  • import may appear only at the top level
  • The target of import is a package path, not a file path

Ironwall Type System Specification

This document defines Ironwall's type construction, type equality, assignability, and the closure rules for generics and union types.

1. Primitive Types

The primitive types are:

  • Signed integers: i5, i6, i7
  • Unsigned integers: u5, u6, u7
  • Floating point: f5, f6, f7
  • Complex numbers: z5, z6, z7
  • Characters: c3, c4, c5
  • Strings: s3, s4, s5
  • Others: bool, unit

The naming convention is "prefix letter + exponent n". Its design intent is that 2^n represents a width grade; however, type equality still depends only on the type name itself.

2. Class Types

2.1 Ordinary classes

  • Every top-level class forms a nominal type
  • A class type is identified by its class name, not by structural equality

2.2 Generic class instances

  • Explicit instantiations such as <Pair i5 s3> and <Node i5> form concrete types
  • Generic class instances are still nominal types; both the type name and all type arguments must match

2.3 Builtin generic types

The builtin generic type is:

  • <array T>

array is a builtin runtime type, not a user-defined class.

3. Function Types

Function types are written as:

<to Ret from T1 T2 ...>

Rules:

  • Parameter count, parameter order, each parameter type, and the return type all participate in type equality
  • Zero-argument functions are still one case of function type

4. Union Types

Union types are written as:

<union T1 T2 ...>

Closure rules:

  • Union members are canonicalized at the type layer
  • Nested unions are not flattened; a nested union remains an immediate member type
  • Duplicate immediate members are a type error and must be rejected; they are not silently deduplicated
  • Member order does not affect the final notion of type equality
  • Every immediate member type inside a union must be unique within that union

Therefore, the following types must be considered equal:

  • <union i5 f5>
  • <union f5 i5>

The following type is distinct from both of the above because the nested union is preserved:

  • <union i5 <union f5 i5>>

The following type is invalid because i5 appears twice as an immediate member:

  • <union i5 f5 i5>

5. Type Equality

5.1 primitive

  • Two primitive types are equal only if their type names are exactly the same

5.2 class

  • Two class types are equal only if their class names are exactly the same

5.3 generic class / generic function instance

  • They are equal only if the generic name is the same and all type arguments are pairwise equal

5.4 function type

  • The parameter count must be the same
  • The parameter order must be the same
  • The corresponding parameter types must be equal
  • The return type must be equal

5.5 union type

  • After canonicalization, the member sequence must match exactly
  • Canonicalization sorts immediate members for equality, but does not flatten nested unions
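
Union equality as described above can be sketched with a small canonicalization function. This is a hedged Python sketch: types are modeled as strings (primitives) and tuples whose first element is the type head, which is an assumption of the sketch; canonical and type_equal are hypothetical names.

```python
def canonical(ty):
    """Sort immediate union members; reject duplicates; never flatten."""
    if isinstance(ty, str):
        return ty
    head, *members = ty
    members = [canonical(m) for m in members]
    if head != "union":
        return (head, *members)  # order matters for every other type head
    if len(set(members)) != len(members):
        raise TypeError("duplicate immediate union member")
    # repr gives one total order over both strings and nested tuples.
    return ("union", *sorted(members, key=repr))


def type_equal(a, b) -> bool:
    """Two types are equal when their canonical forms match exactly."""
    return canonical(a) == canonical(b)
```

A nested union stays a single immediate member, so <union i5 <union f5 i5>> never compares equal to <union i5 f5>.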

6. Assignability

The isAssignable rule is intentionally conservative:

  • If actual and expected are type-equal, assignment is allowed
  • If expected is a union, and actual is type-equal to one of its member types, assignment is allowed
  • No other implicit assignability relation is defined
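
The three rules above can be sketched directly. This is a minimal Python sketch using the same hypothetical modeling as elsewhere in these sketches: primitives are strings, unions are tuples starting with "union"; canonical, is_union, and is_assignable are illustrative names, not the compiler's API.

```python
def canonical(ty):
    """Sort union members for comparison; nested unions are not flattened."""
    if isinstance(ty, str):
        return ty
    head, *args = ty
    args = [canonical(a) for a in args]
    if head == "union":
        args = sorted(args, key=repr)
    return (head, *args)


def is_union(ty) -> bool:
    return isinstance(ty, tuple) and ty[0] == "union"


def is_assignable(actual, expected) -> bool:
    """Conservative assignability: equality, or membership in a union."""
    if canonical(actual) == canonical(expected):
        return True
    if is_union(expected):
        return any(canonical(actual) == canonical(m) for m in expected[1:])
    return False  # no other implicit relation is defined
```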

This means:

  • An i5 value may be used directly as a member value of <union i5 f5>
  • i5 is not implicitly converted to f5
  • <union i5 f5> is not implicitly narrowed to i5

7. Generics

7.1 Supported range

  • Generic classes are supported
  • Generic functions are supported
  • Generic declare is not supported
  • Type aliases are not supported

7.2 Explicit-first

  • Generic instantiation must explicitly write out all type arguments
  • The language does not provide inference that auto-fills type arguments from value arguments
  • A generic function name cannot be used as a bare value; it must first be explicitly instantiated

8. Explicit Annotation Requirements

The following positions must all carry explicit types:

  • [name Type] bindings
  • Function parameters
  • Function return types
  • Class properties
  • Top-level globals

Additional restrictions:

  • The declared type of a top-level global must be a primitive type, or a union containing at least one primitive member
  • The final value of a top-level global initializer must be a primitive payload assignable to that type

The following are not allowed:

  • Omitting the type of a let / var binding
  • Omitting the return type of a function
  • Omitting the type of a property

9. Numeric Type Rules

  • There is no default integer type and no default floating-point type
  • Numeric literals must be written as typed literals
  • There is no implicit numeric promotion such as i5 -> f5, f5 -> f6, or i5 -> i6
  • The available signatures of arithmetic and comparison builtins are determined by the builtin specification, not filled in through implicit conversion

10. unit

  • unit is both a primitive type and the spelling of its unique value
  • unit is commonly used for side-effect flows, empty results, and empty branches of types such as <union unit T>

11. Type Alias Ban

  • type alias is strictly forbidden

12. Overloading

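  • Named functions that share a name are overloaded by parameter list; each combination of function name and parameter list must be unique
  • Generic classes and generic functions that share a name are overloaded by type-parameter count; each combination of generic name and type-parameter count must be unique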

Ironwall Builtin Boundary Specification

This document describes only the language builtins of Ironwall.

1. Layering Principle

Ironwall divides available capabilities into two layers:

  • Language builtins: recognized directly by the compiler and part of the core semantics
  • std~... packages: the standard library provided through ordinary top-level definitions

This boundary must remain clear:

  • Builtins do not require import
  • Names exported from std~... must be brought into scope through the corresponding (import std~...) before they can be used directly
  • There is no special rule that says "because it comes from the base lib, it automatically becomes a builtin name"

2. Language Builtins

2.1 builtin generic type

The language-level builtin generic type is:

  • array

It is written as:

<array T>

2.2 builtin call names

The core builtin call names are:

  • add
  • sub
  • mul
  • div
  • mod
  • le
  • lt
  • ge
  • gt
  • eq
  • neq
  • not
  • and
  • or
  • xor
  • bwand
  • bwor
  • bwxor
  • ls
  • rs
  • class_new
  • cm_get
  • cm_set
  • array_new
  • array_get
  • array_set
  • array_length
  • s3_new, s3_get, s3_set, s3_length
  • s4_new, s4_get, s4_set, s4_length
  • s5_new, s5_get, s5_set, s5_length
  • z5_new, z5_set, z5_real, z5_img
  • z6_new, z6_set, z6_real, z6_img
  • z7_new, z7_set, z7_real, z7_img

Only the spellings above are accepted. Object primitives accept only class_new / cm_get / cm_set, and variable reassignment accepts only var_set.

2.3 builtin signature closure

  • The numeric arithmetic builtins add / sub / mul / div / mod support same-type operations on u5|u6|u7|i5|i6|i7|f5|f6|f7, with no cross-type promotion
  • The comparison builtins le / lt / ge / gt / eq / neq support same-type comparisons on u5|u6|u7|i5|i6|i7|f5|f6|f7 and return bool
  • The same comparison builtins also support same-type comparisons on c3|c4|c5 and return bool; their semantics are defined by single code-unit / byte ordering
  • not supports (bool) -> bool
  • and / or / xor support only bool
  • The bitwise / shift builtins bwand / bwor / bwxor / ls / rs support u5|u6|u7|i5|i6|i7, and do not support f5|f6|f7
  • s3_new / s4_new / s5_new support two signatures: (sN) -> sN and (i5, cN) -> sN
  • s3_get / s4_get / s5_get have the signature (sN, i5) -> cN
  • s3_set / s4_set / s5_set have the signature (sN, i5, cN) -> unit
  • s3_length / s4_length / s5_length have the signature (sN) -> i5
  • z5_new / z6_new / z7_new have the signature (zN) -> zN
  • z5_set / z6_set / z7_set support two signatures: (zN, zN) -> unit and (zN, fN, fN) -> unit
  • z5_real / z5_img return f5; z6_real / z6_img return f6; z7_real / z7_img return f7
  • The frontend surface sugar additionally accepts >= 2 argument forms for add / sub / mul / and / or; semantically, they are lowered into a right-associative binary tree. For example, (add a b c d) is equivalent to (add a (add b (add c d)))
  • The frontend surface sugar additionally accepts >= 2 argument forms for le / lt / ge / gt / eq; semantically, they are expanded into a pairwise comparison chain and joined with right-associative and
  • The variadic surface sugar above does not change the builtin core type boundary: the 0 and 1 argument forms are still illegal, the not builtin remains a standalone unary (bool) -> bool builtin, and div / mod / neq / xor / not are not automatically included in this sugar family

2.4 object and array primitives

  • Whether class_new is legal is determined by the constructor set of the target class
  • cm_get / cm_set are class-object primitives, not general library APIs
  • array_new / array_get / array_set / array_length are array primitives, not std~... package helpers
  • s3_* / s4_* / s5_* are text primitive families, not std~... package helpers
  • z5_* / z6_* / z7_* are primitive complex copy / update / projection families, not std~... package helpers

4. Visibility and Reserved Names

In the specification:

  • Builtin names are global reserved top-level names
  • self is a reserved name
  • Ordinary names exported by std~... packages are not part of the global reserved set

This means:

  • User packages must not export names such as add, array_new, s3_new, z5_real, or self
  • Names such as print, sin, val_to_f7, and bin_to_f7 exported by std~... are only ordinary exports reserved within their corresponding packages; they are not language builtins

Ironwall Module System Specification

This document defines Ironwall's multi-file module semantics. The core principle is that semantic identity is determined only by unit id, and that import, package export, entry selection, and global initialization are all closed under one unified set of rules.

1. Core Terms

1.1 source unit

  • A .iw file participating in module mode is a source unit
  • The language-level identity of a source unit is determined by its file-name stem

1.2 package path

  • A package path is formed by joining ordinary identifiers with ~
  • Example: a~b~c

1.3 unit id

  • The canonical unit id shape is <package-path>@<unit-name>
  • Example: app~cli@main

1.4 literal db asset

  • A literal db is a package-level asset, not an anonymous JSON mapping
  • One literal-db file corresponds to one database-reference bundle in a package, not to a single reference
  • The canonical file-name shape is <package-path>$<reference-bundle>.json
  • Example: app~assets$banner.json

2. File Names and program Header

2.1 canonical file name

Under multi-file module mode, the canonical file name is:

<package-path>@<unit-name>.iw

For example:

  • a~b@date.iw
  • std~time@timestamp.iw
  • app@main.iw

2.2 canonical header

The root of the source file must be written as:

{program <package-path>@<unit-name>
  ...
}

2.3 consistency constraints

Compilation must be rejected in the following cases:

  • The file-name stem and the unit id in the program header do not match
  • A single file contains more than one root program
  • The canonical unit id is missing
  • Duplicate unit ids appear in the same semantic closure
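
The first three constraints above can be approximated with a text-level check. This is a rough Python sketch, not a parser: ROOT_HEADER and check_unit_header are hypothetical names, the header is found by pattern matching on the source text, and duplicate unit ids across a semantic closure are not covered.

```python
import re
from pathlib import PurePosixPath

# Matches "{program <unit-id>" and captures the unit id.
ROOT_HEADER = re.compile(r"\{program\s+([^\s(){}\[\]<>]+)")


def check_unit_header(path: str, source: str) -> str:
    """Return the unit id if the file stem and the program header agree."""
    headers = ROOT_HEADER.findall(source)
    if len(headers) != 1:
        raise ValueError("expected exactly one root program with a unit id")
    unit_id = headers[0]
    stem = PurePosixPath(path).stem  # the directory part is ignored
    if stem != unit_id:
        raise ValueError(f"file stem {stem!r} does not match unit id {unit_id!r}")
    return unit_id
```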

3. Directory Semantics

  • Directories have no language-level meaning
  • If two source units in different engineering locations have the same unit id and both participate in compilation, that is a same-unit-id conflict
  • Directories are only an engineering organization mechanism, not part of language semantics

Literal-db files obey the same rule: semantic identity depends only on the file stem, not on the containing directory.

4. Top-level Structure Restrictions

Under module mode, the top level may contain only:

  • (import package-path)
  • class
  • function
  • declare
  • Generic class
  • Generic function
  • Top-level var

The following are forbidden at top level in module mode:

  • Bare top-level executable expressions
  • Non-top-level import
  • Non-top-level class / function / generic definitions

5. Packages and Exports

5.1 package identity

  • Package identity depends only on the package-path string itself
  • One package may be composed of multiple source units

5.2 package export set

The following named top-level definitions enter the package export set:

  • class
  • function
  • declare
  • Generic class
  • Generic function
  • Top-level globals
  • Literal-db references

5.3 the special status of main

  • Top-level main is a unit-local entry symbol
  • main does not enter the ordinary package export set
  • Other units may not refer to a unit's main as an ordinary exported symbol through pkg@main

6. main Rules

If a top-level function is named main, it must satisfy all of the following:

  • It must not be declare
  • It must not be generic
  • It must be at the top level
  • It must have exactly one parameter
  • That parameter must be named args
  • The parameter type must be <array s3>
  • The return type must be i5
  • At most one main may be defined in a single unit

A project may contain multiple entry units; if the entry is not unique, the entry unit must be selected explicitly.

7. import

7.1 syntax and target

(import a~b~c)
  • The target of import is a package path, not a file path and not a unit id
  • import may appear only at the top level

7.2 duplicate, missing, and unused imports

The following cases must be errors:

  • Importing the same package more than once in one unit
  • Importing a package that does not exist
  • An import that ultimately contributes no visibility to any short-name or fully-qualified cross-package resolution

Note:

  • import controls cross-package visibility and imports only the exact target package
  • import a~b does not implicitly import a~b~c or any other child package
  • A cross-package fully-qualified name such as pkg@name or pkg$reference^ty still requires pkg to be the exact package imported by the current unit
  • Using a fully-qualified name from an imported package counts as using that import

8. Name Resolution

8.1 short-name resolution order

The resolution order for an unqualified short name is:

  1. Local lexical scope
  2. The current package
  3. Imported packages
  4. Builtin names

Once one layer uniquely matches, resolution stops and later layers are not searched.

8.2 current package wins first

  • When the current package matches, that match wins; it must not be turned into an ambiguity error merely because an imported package exports a symbol with the same name
  • If multiple imported packages all match the same name, an ambiguity error must be reported
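
The layered lookup in 8.1 and 8.2 can be sketched as follows. This is a minimal Python sketch in which each layer is modeled as a set of names and imported packages as a mapping from package path to exported names; that modeling and the name resolve_short_name are assumptions of the sketch.

```python
def resolve_short_name(name, local_scope, current_package, imported_packages, builtins):
    """Resolve an unqualified name layer by layer, stopping at the first match."""
    if name in local_scope:
        return ("local", name)
    if name in current_package:
        # The current package wins even if an import exports the same name.
        return ("current-package", name)
    hits = sorted(pkg for pkg, exports in imported_packages.items() if name in exports)
    if len(hits) > 1:
        raise LookupError(f"ambiguous name {name!r}: exported by {', '.join(hits)}")
    if hits:
        return ("imported", f"{hits[0]}@{name}")
    if name in builtins:
        return ("builtin", name)
    raise LookupError(f"unresolved name {name!r}")
```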

8.3 fully-qualified names

The fully-qualified form for a package-exported symbol is:

<package-path>@<symbol-name>

Its meaning is:

  • Directly reference a top-level name visible from some package
  • Require the target package to be either the current package or an exact package imported by this unit
  • It may not bypass package-export rules to access unit-local special cases
  • Overload resolution continues only inside the same package's same-name function set

The package-qualified form of a database reference does not use @, but instead:

<package-path>$<reference-id>^<ty>

Where:

  • @ is reserved for global / class / function names in package exports
  • $ is reserved for literal-db reference names
  • They are different naming entry points and may not be mixed

If a short-name database reference is not unique within the visible package set, a package-qualified database reference must be used. The package in a package-qualified database reference must also be the current package or an exact package imported by this unit.

9. Package-level Symbol Conflicts

Ironwall adopts a single main namespace with two limited overload exceptions.

The following cases must be errors:

  • Two class definitions with the same name in one package
  • A class and an ordinary function with the same name in one package
  • A class and a global with the same name in one package
  • A class and a generic class with the same name in one package
  • A class and a generic function with the same name in one package
  • A global and a function / declare with the same name in one package
  • A global and a generic class with the same name in one package
  • A global and a generic function with the same name in one package
  • A generic class and an ordinary function / declare with the same name in one package
  • A generic function and an ordinary function / declare with the same name in one package
  • Two generic class definitions in one package with the same name and the same number of type parameters
  • Two generic function definitions in one package with the same name and the same number of type parameters
  • Two ordinary functions or declares with exactly the same signature in one package

The following cases are allowed:

  • Ordinary named functions in one package may form an overload set by signature
  • Generic class declarations in one package may form an overload set by the number of type parameters under the same name
  • Generic function declarations in one package may form an overload set by the number of type parameters under the same name
  • Different packages may export the same short name

Additional rules:

  • class, ordinary function / declare, generic class, generic function, and top-level global all share a single package-level main namespace
  • Inside this main namespace, the names of class, ordinary function / declare, and top-level global must all be pairwise distinct
  • Generic class and generic function may not reuse any of those non-generic names either
  • There are only two allowed same-name cases: ordinary named functions overloaded by function signature, and generic class / generic function overloaded by type-parameter count
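
The conflict rules above can be sketched as a check over a flat list of package-level definitions. This is an illustration under stated assumptions, not the compiler's algorithm: the record shape { kind, name, signature, typeParamCount } and the function name findConflicts are hypothetical, and declare is folded into the "function" kind.

```javascript
// Hypothetical definition record: { kind, name, signature?, typeParamCount? }.
// kind is one of: "class", "global", "function" (also used for declare),
// "genericClass", "genericFunction".
function findConflicts(definitions) {
  const byName = new Map();
  for (const def of definitions) {
    if (!byName.has(def.name)) byName.set(def.name, []);
    byName.get(def.name).push(def);
  }

  const errors = [];
  for (const [name, defs] of byName) {
    const kinds = [...new Set(defs.map((d) => d.kind))];
    // Different kinds may not share a name. The one pairing the error list
    // does not mention is genericClass with genericFunction; this sketch
    // leaves that pairing unflagged.
    const onlyGenerics = kinds.every((k) => k === "genericClass" || k === "genericFunction");
    if (kinds.length > 1 && !onlyGenerics) {
      errors.push(`${name}: shared by ${kinds.sort().join(", ")}`);
      continue;
    }
    for (const kind of kinds) {
      // Overload key: signature for functions, arity for generics, none otherwise.
      const keyOf = (d) =>
        kind === "function" ? d.signature
        : kind === "genericClass" || kind === "genericFunction" ? String(d.typeParamCount)
        : "";
      const seen = new Set();
      for (const d of defs.filter((x) => x.kind === kind)) {
        const key = keyOf(d);
        if (seen.has(key)) errors.push(`${name}: duplicate ${kind} (${key || "same name"})`);
        seen.add(key);
      }
    }
  }
  return errors;
}
```

With this sketch, two functions named f with different signatures produce no error, while a class and a function that share a name produce one.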

9.1 literal-db rules

A literal-db file must satisfy the following:

  • The file-name stem must be <package-path>$<reference-bundle>
  • The JSON top level must be an object
  • All keys and all values must be strings
  • The key of the first key-value pair does not participate in semantic analysis and may be any non-empty string
  • The value of the first key-value pair must be exactly equal to the file stem, so that it aligns with the file name
  • The first key-value pair may be followed by any number of additional pairs; together they form one db bundle
  • Aside from the first key-value pair, every key must have the shape referenceId^ty
  • Aside from the first key-value pair, every value must be a string; even numeric content must be encoded as a string first and then interpreted by the typed-reference rules
  • Within the same package, all referenceId^ty across all db files must be globally unique

Example:

{
  "this_key_is_ignored_and_only_the_value_is_checked": "app~assets$banner",
  "hello^s3": "Hello",
  "answer^i5": "42"
}

The following cases must be errors:

  • The file-name stem and the value of the first key-value pair do not match
  • Duplicate literal-db entry names appear within the same package
  • Source code writes a package-qualified non-reference shape such as a~b~d$3p14^f5
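
The file-level rules can be checked mechanically. The sketch below is one possible validator, not the toolchain's implementation. It assumes that referenceId is an ordinary identifier and that ty is an alphanumeric type name; checkLiteralDb and its seenEntries parameter are hypothetical names.

```javascript
// Assumption: referenceId is an ordinary identifier and ty is an alphanumeric type name.
const ENTRY_KEY = /^[a-zA-Z_][a-zA-Z0-9_]*\^[a-zA-Z][a-zA-Z0-9]*$/;

// Validates one literal-db file. `seenEntries` is a Set shared by all files of
// one package so that duplicates across files are detected.
function checkLiteralDb(fileStem, jsonText, seenEntries) {
  const errors = [];
  const data = JSON.parse(jsonText);
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return ["top level must be a JSON object"];
  }
  // JSON.parse keeps insertion order for keys that are not integer-like,
  // which covers every well-formed referenceId^ty key.
  const entries = Object.entries(data);
  if (entries.length === 0) return ["missing the first key-value pair"];

  const [firstKey, firstValue] = entries[0];
  if (firstKey === "") errors.push("first key must be non-empty");
  if (firstValue !== fileStem) errors.push("first value does not match the file stem");

  for (const [key, value] of entries.slice(1)) {
    if (typeof value !== "string") errors.push(`${key}: value must be a string`);
    if (!ENTRY_KEY.test(key)) {
      errors.push(`${key}: key must have the shape referenceId^ty`);
    } else if (seenEntries.has(key)) {
      errors.push(`${key}: duplicate entry within the package`);
    } else {
      seenEntries.add(key);
    }
  }
  return errors;
}
```

JSON.parse silently keeps only the last of two identical keys inside one object, so a production checker would need a parser that reports duplicate keys within a single file; this sketch only catches duplicates across files.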

10. Reserved Names

  • The language builtin top-level names form a reserved set
  • self is also a reserved name
  • Ordinary names exported from std~... are not part of the global reserved set; they are ordinary imported-package exports
  • User packages must not define top-level exports that conflict with the builtin reserved set

11. Top-level Globals

11.1 basic rules

  • A top-level var is treated as a global definition
  • A global must have both an explicit type and an initializer
  • The declared type of a global must be a primitive type, or a union containing at least one primitive member
  • If the global type is a union, the payload computed by the initializer must also be a primitive payload assignable to that union
  • Declaration-first / initialization-later style is not supported

11.2 readability and writability

  • A package may read and write its own globals
  • Visible globals from other packages may also be read and written
  • When accessing another package through either a short name or a fully-qualified name, exact import is required for visibility; fully-qualified names remove short-name ambiguity but do not bypass import

11.3 initializer bans

A global initializer is subject to the following restrictions:

  • The initializer must statically converge into a primitive payload at compile time
  • The initializer must not read any global
  • The initializer must not call ordinary functions, generic functions, or declare
  • The initializer must not allocate heap shapes such as class / array / closure / union objects
  • The initializer must not contain while, match, or any other node that cannot be guaranteed to stay inside the static-primitive subset
  • If the initializer needs intermediate state, it may use only local let / local var with explicit types, and the values of those locals must also always remain primitive payloads

The static-primitive subset contains at least:

  • Primitive typed literals
  • Literal-db text references
  • true, false, unit
  • if, cond, seq
  • Local let with explicit types
  • Local var with explicit types and var_set on that local
  • Direct pure builtin calls whose results remain primitive payloads

12. Global Initialization Model

  • The semantic result of a top-level global initializer must already be determined at compile time
  • There is no initializer read-dependency between globals; therefore no user-visible global-init dependency graph is defined
  • File discovery order, directory order, and lexicographic order have no semantic force
  • If a global is never read by any program fragment reachable from the entry, the compiler may omit it from the final program; this does not change language-level observable semantics

13. Separate Compilation Artifacts

  • A source unit may be compiled independently into its own unit artifact
  • If the unit contains GC-visible layouts or top-level globals, the artifact should carry that unit's own metadata table and global-var table
  • The runtime identity of a metadata table must be represented by a deterministic UUID; link/integration must not identify it merely by load order
  • When multiple separately compiled units are integrated, the final program must produce a metadata-table collection and a global-var-table collection
  • These collections are the runtime/GC-visible link result; they preserve the identity of "which unit artifact a table belongs to" instead of flattening everything unconditionally into one table with lost provenance

13.1 precompiled-lib archives

  • The toolchain may package a set of modules into a precompiled library archive in .tgz format
  • An archive must at least carry:
      • manifest.json
      • Each separately compiled unit's own machine artifact
      • Each separately compiled unit's own runtime-support artifact
  • The archive does not carry a source bundle; a consumer's static checking of a precompiled library must rely only on the manifest signature tables rather than rereading the library source
  • If a single package is split across multiple units, the archive must preserve that unit boundary too; it may not secretly flatten them into a single library.s and erase per-unit metadata/global-table identity

13.2 manifest contracts

  • The compiledUnits field of a manifest must list, per unit:
      • unitId
      • assemblyPath
      • supportPath
      • metadataTableExportSymbol
      • globalTableExportSymbol
      • runtimeInitExportSymbol
  • runtimeInitExportSymbol is responsible for attaching that unit's local metadata/global table into the collection and then executing that unit's top-level initialization body
  • The manifest must carry these signature tables:
      • global signatures
      • class signatures
      • function signatures
      • generic class signatures
      • generic function signatures
  • All names inside those signature tables must use full package-qualified names rather than bare exported short names
  • The manifest must also carry generic monomorph tables:
      • generic class monomorph table
      • generic function monomorph table
  • The semantic key of a monomorph table is not the source-level literal <generic ...> form, but <generic, normalized endtype tuple>
  • If the type arguments of a monomorph entry still contain user generic class instances, they must first be recursively normalized into endtypes before being written into the table
  • The value of a monomorph table must be the real name of the concrete class/function; this name may be a monomorphized internal symbol, but it must preserve the full package-qualified name of the source generic rather than degrading into only a short export or anonymous hash
  • Consumer compilation and final linking must both resolve to the same concrete class/function name
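
As an illustration of the per-unit contract, the following sketch checks the compiledUnits field of a parsed manifest. The field names are the ones listed above. Everything else is an assumption made for the example: that compiledUnits is a JSON array, that each field is a non-empty string, and that unitId values are unique. The function name checkCompiledUnits is hypothetical.

```javascript
// Field names are the ones listed for compiledUnits in the manifest contract.
const REQUIRED_UNIT_FIELDS = [
  "unitId", "assemblyPath", "supportPath",
  "metadataTableExportSymbol", "globalTableExportSymbol", "runtimeInitExportSymbol",
];

// Returns a list of human-readable problems; an empty list means the check passed.
function checkCompiledUnits(manifest) {
  const errors = [];
  const units = manifest.compiledUnits;
  if (!Array.isArray(units)) return ["compiledUnits must be a list"];
  const seenIds = new Set();
  units.forEach((unit, index) => {
    for (const field of REQUIRED_UNIT_FIELDS) {
      // Assumption: every listed field is carried as a non-empty string.
      if (typeof unit[field] !== "string" || unit[field] === "") {
        errors.push(`compiledUnits[${index}]: missing ${field}`);
      }
    }
    // Assumption: unit ids are unique, since each unit keeps its own table identity.
    if (typeof unit.unitId === "string") {
      if (seenIds.has(unit.unitId)) errors.push(`compiledUnits[${index}]: duplicate unitId`);
      seenIds.add(unit.unitId);
    }
  });
  return errors;
}
```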

13.3 consuming precompiled libraries

  • One or more precompiled library archives may be loaded in an ordinary compile/check/run/emit flow
  • A consumer's static checking of imported classes/functions/globals from a precompiled library must rely only on the manifest signature tables; it must not require rereadable source from inside the archive
  • From the consumer's perspective, generic class/function signatures from the loaded archive must be visible just like imported package exports
  • When the consumer instantiates a generic class/function from a precompiled library:
      • Every type argument must first be recursively reduced into an endtype
      • Then the manifest monomorph table must be looked up using <generic, normalized endtype tuple>
      • If the lookup hits, the resulting concrete name must be used
      • If the lookup misses, compilation must be rejected immediately; the compiler must not silently fall back to remonomorphizing that library generic on the fly
  • After consumer compilation completes, final linking must link in the archive's per-unit artifacts as well

14. Entry

  • If there is no top-level main, no executable entry can be generated
  • If exactly one unit defines main, it may be selected automatically as the entry
  • If multiple units define main, the entry unit must be selected explicitly
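
The three entry rules amount to a small decision procedure. A minimal sketch follows, assuming a hypothetical unit record { unitId, definesMain } and an optional explicit selection. How an explicit selection is actually spelled on the command line is not specified here.

```javascript
// `units` is a hypothetical list of { unitId, definesMain }.
// `explicitEntry` is an optional unitId chosen by the user.
function selectEntry(units, explicitEntry) {
  const candidates = units.filter((u) => u.definesMain).map((u) => u.unitId);
  if (candidates.length === 0) {
    return { ok: false, reason: "no top-level main; no executable entry can be generated" };
  }
  if (explicitEntry !== undefined) {
    // Assumption: an explicit selection must name a unit that defines main.
    return candidates.includes(explicitEntry)
      ? { ok: true, entry: explicitEntry }
      : { ok: false, reason: `${explicitEntry} does not define main` };
  }
  if (candidates.length === 1) return { ok: true, entry: candidates[0] };
  return { ok: false, reason: "multiple units define main; select the entry unit explicitly" };
}
```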

Ironwall Core Semantics Specification

This document describes Ironwall's core semantics, including scope, evaluation rules, mutability, the constraints on classes and arrays, and the error model.

1. Overall Principles

  • Explicit beats implicit
  • Static analyzability beats stacks of syntax sugar
  • Safety and auditability beat complex implicit behavior
  • The language provides no language-level exception system

2. Scope and Name Resolution

Inside a core expression, names are resolved in the following order:

  1. Local lexical scope
  2. Top-level names in the current package
  3. Top-level names in imported packages, including explicitly imported std~... packages
  4. Language builtin names

Finer package rules at the module layer are defined by the module specification.
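
The resolution order can be sketched as a lookup over four layers. This is an illustration only: the scopes object, its layer names, and the representation of each layer as a Map from a short name to a list of candidate targets are all assumptions of the example.

```javascript
// Each layer is a hypothetical Map from a short name to a list of candidate
// targets. Layers are consulted in the resolution order given above.
function resolveName(name, scopes) {
  const order = ["local", "currentPackage", "importedPackages", "builtins"];
  for (const layer of order) {
    const candidates = scopes[layer].get(name) ?? [];
    if (candidates.length === 1) return { status: "resolved", layer, target: candidates[0] };
    // More than one candidate in the first layer that knows the name is reported
    // as a name-resolution ambiguity rather than resolved by guessing.
    if (candidates.length > 1) return { status: "ambiguous", layer, candidates };
  }
  return { status: "unresolved" };
}
```

An earlier layer shadows a later one: a local x wins over a package-level x, and a name defined in the current package wins over the same short name exported by an imported package.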

3. Mutability

3.1 Mutable bindings

The following bindings are semantically mutable through var_set:

  • Local variables introduced by var
  • let bindings
  • Top-level globals visible to the current unit

3.2 Immutable bindings

The following bindings are immutable:

  • Parameters of fn / function
  • Parameters inside class methods and constructors
  • self

Applying var_set to an immutable binding must be an error.

4. Visibility of let

  • let bindings take effect from left to right in written order
  • An ordinary binding value cannot forward-reference a later ordinary binding
  • If a binding value is itself an fn, that fn may participate in a local recursive function set
  • Even in a locally recursive case, ordinary non-function bindings still obey the prefix-visible rule

5. Control Flow

5.1 if

  • The condition must be bool
  • The then and else branch types must be equal
  • Only the selected branch is evaluated

5.2 while

  • condition must be bool
  • The condition is checked before each iteration body runs
  • The type of the whole while expression is always unit

5.3 cond

  • else must exist and must be the last branch
  • Every non-else condition must be bool
  • All branch result types must be equal

5.4 block

  • {e1 ... eN} is evaluated in written order
  • The value of the block is the value of its last expression

6. Unions and match

6.1 union member lifting

  • If T is a member of <union ...>, then a T value may be assigned directly to that union type
  • A union must carry a runtime tag

6.2 match

  • The matched value must be a union type
  • The branch set must exhaustively cover all union member types
  • The bound type in each branch must correspond to some union member
  • The result types of all branch bodies must be equal
  • If a union member is itself a union, the outer match binds that nested union value. A second match is required to inspect the nested union's own runtime tag and payload

If a value does not satisfy the type precondition of match at run time (for example, it carries an invalid union tag), that is an unrecoverable runtime failure.
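
The static side of these rules can be sketched as follows. Types are modeled as plain strings and a nested union member is treated as one opaque member, which mirrors the rule that the outer match binds the nested union value as a whole. The record shape { boundType, resultType } and the name checkMatch are hypothetical.

```javascript
// `unionMembers` lists the member types of the matched union.
// `branches` is a hypothetical list of { boundType, resultType }.
function checkMatch(unionMembers, branches) {
  const errors = [];
  const members = new Set(unionMembers);
  const covered = new Set();
  for (const branch of branches) {
    // Each branch must bind some member of the union.
    if (!members.has(branch.boundType)) {
      errors.push(`branch type ${branch.boundType} is not a member of the union`);
    }
    covered.add(branch.boundType);
  }
  // The branch set must cover every member.
  for (const member of unionMembers) {
    if (!covered.has(member)) errors.push(`member ${member} is not covered`);
  }
  // All branch bodies must produce the same type.
  const resultTypes = new Set(branches.map((b) => b.resultType));
  if (resultTypes.size > 1) errors.push("branch result types are not all equal");
  return errors;
}
```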

7. Classes and Objects

7.1 Basic class constraints

  • Every class must have at least one constructor; multiple constructors are allowed, and they are overloaded by distinct parameter lists
  • Property names must be unique within a class
  • Method names must be unique within a class
  • A property and a method may not share the same name
  • Inheritance is not supported

7.2 constructor constraints

  • A constructor must initialize all properties
  • A constructor must not read a property before that property has been initialized
  • When a property is read indirectly through a method, the initialization-order requirement still applies

7.3 self

  • self is automatically bound only inside methods and constructors
  • self is an immutable binding, but its fields may be initialized or modified through cm_set

8. Arrays

  • <array T> is a fixed-length array
  • array_get / array_set must perform bounds checks
  • array_length returns i5
  • If array_new batch-creates an array whose element type is a class, that class must have a zero-argument constructor, because each element is built through that constructor

9. Top-level globals

  • A top-level var in module mode denotes a global
  • A global must have both an explicit type and an initializer
  • A global type must be a primitive type, or a union containing at least one primitive member
  • A global initializer must be statically reduced to a primitive payload at compile time
  • A global initializer must not read other globals, and must not call user-defined functions, generic functions, or declare
  • A global initializer is restricted to control flow and builtins inside the static-primitive subset
  • As long as a global is visible to the current unit, that global may be read and written; short-name access must still obey import visibility rules

Finer module-level global rules are defined by the module specification.

10. Precompiled Generic Instantiation

  • Generic classes and generic functions coming from a precompiled library remain semantically generic, but the consumer may use only monomorph entries that the library has explicitly packaged
  • The consumer's static checking of these imported symbols as classes / functions / globals may rely only on the signature table in the library manifest; semantically, it must not require the library to still carry source code that can be re-resolved
  • For the type arguments of imported precompiled generics, the compiler must first perform compile-time evalmon-style normalization:
      • A primitive endtype remains itself
      • An already concrete class endtype remains itself
      • A nested generic class instance must first recursively normalize its inner type arguments, and then use the precompiled class monomorph table to find the corresponding concrete class endtype
  • Only after all type arguments converge into endtypes can the outer generic function / generic class monomorph lookup succeed
  • Therefore, for a shape such as <make_box <Box <Box i5>>>, the inner <Box i5> and <Box <Box i5>> must be reduced layer by layer into concrete class endtypes before looking up the outer make_box monomorph entry
  • The concrete class/function name returned by monomorph lookup must be the single name shared by later static checking and linking; if it is an internally generated name, that name must still preserve the full package-qualified name of the source generic
  • If lookup fails at any layer, it is a static error and compilation must be rejected
  • A consumer of a precompiled library must not rematerialize the library's user generics on the fly when a table entry is missing; the semantic contract is "usable only when the table lookup hits"
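
The layer-by-layer reduction can be sketched as a recursive lookup. This is a sketch under stated assumptions, not the compiler's data model: a type is either an endtype name (a string) or a record { generic, args }, the tables are Maps, and the key encoding generic|endtype,endtype is invented for the example. The package name lib~box and the concrete names in the usage note are hypothetical.

```javascript
// A type is either an endtype name (string) or a generic class instance
// { generic: "<package-qualified name>", args: [type, ...] }.
// The key encoding below is an assumption made for this sketch.
function monomorphKey(generic, endtypes) {
  return `${generic}|${endtypes.join(",")}`;
}

// Reduces a type to an endtype, innermost type arguments first.
function normalizeType(type, classTable) {
  if (typeof type === "string") return type; // primitive or concrete class endtype
  const args = type.args.map((arg) => normalizeType(arg, classTable));
  const concrete = classTable.get(monomorphKey(type.generic, args));
  if (concrete === undefined) {
    // A miss at any layer is a static error; nothing is rematerialized.
    throw new Error(`no packaged monomorph for ${monomorphKey(type.generic, args)}`);
  }
  return concrete;
}

// Looks up a generic function instantiation such as <make_box <Box <Box i5>>>.
function lookupFunctionMonomorph(generic, typeArgs, classTable, functionTable) {
  const endtypes = typeArgs.map((arg) => normalizeType(arg, classTable));
  const concrete = functionTable.get(monomorphKey(generic, endtypes));
  if (concrete === undefined) {
    throw new Error(`no packaged monomorph for ${monomorphKey(generic, endtypes)}`);
  }
  return concrete;
}
```

For <make_box <Box <Box i5>>>, the sketch first resolves <Box i5> to a concrete class name, then resolves <Box ...> over that name, and only then looks up make_box with the resulting endtype. If the class table lacks either Box entry, the lookup throws before the function table is consulted.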

11. Error Model

11.1 Static errors

The following are static diagnostics:

  • Lexical errors
  • Syntax errors
  • Type errors
  • Name-resolution ambiguities
  • Illegal top-level structure
  • Illegal global initializers, such as an initializer that reads a global
  • Assignment to immutable bindings

11.2 Runtime failures

The following are unrecoverable runtime failures:

  • Array out-of-bounds access
  • Invalid union tag
  • Violated builtin preconditions
  • Other unrecoverable failures that violate execution preconditions

11.3 Exception ban

  • The language does not provide throw, try, or catch
  • Recoverable failure should be modeled with unions or other explicit data models

Ironwall C FFI Specification

This document defines Ironwall's current C FFI rules, including Ironwall calling C, C calling Ironwall, the types allowed across the boundary, naming rules, and concrete examples.

1. Position

Ironwall does not encourage FFI.

FFI is a temporary compromise, not Ironwall's ideal boundary. The reason is direct:

  • C does not provide the memory-safety and type-safety guarantees that Ironwall wants to provide
  • C code can read and write out of bounds, keep dangling pointers, corrupt the runtime heap, corrupt GC metadata, and free memory incorrectly
  • Once execution enters C, Ironwall's safety model can only treat C as a trusted but unsafe external world

Therefore:

  • FFI should be used only at necessary system boundaries, for existing C libraries, platform syscall wrappers, and transitional runtime glue
  • FFI should not be treated as a routine abstraction mechanism
  • FFI should not be used to bypass the Ironwall type system, GC, safety boundary, or module rules
  • A bug on the C side is a bug that can break the whole process; it is not an ordinary error that Ironwall can fully isolate

The existence of FFI is an engineering reality, not a language direction. Ironwall's long-term direction should be to reduce FFI surface area, not to expand it.

2. Core Model

C FFI has two directions:

  • Ironwall calls C: external C functions are declared in Ironwall with declare
  • C calls Ironwall: Ironwall functions with names following the export rule are exported as C-callable wrappers

The two directions use different ABIs:

  • declare ... clang ... uses the low-level runtime ABI, where C functions directly receive and return iw_value_t
  • iwlang export uses a host-friendly ABI, where C functions use int32_t, const char *, char *, and host array structs

This split is intentional:

  • When Ironwall calls C, C is treated as an internal runtime extension and must understand iw_value_t
  • When C calls Ironwall, the external caller should not depend directly on heap-object layout, so the wrapper performs value conversion and copying

3. Naming Rules

FFI symbols must use a full name carrying a namespace UUID and confirmation tag. Old-style bare C symbols do not conform to the spec.

3.1 names for Ironwall calling C

Format:

_<uuid>_clang_<function_name>_<tag1>

Where:

  • <uuid> is a namespace string containing only ASCII letters and digits
  • clang is fixed and indicates "this symbol is provided by C"
  • <function_name> must match the C-identifier shape: [A-Za-z_][A-Za-z0-9_]*
  • <tag1> is an 8-digit hexadecimal confirmation tag

Example:

_81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7

If the tag does not match the uuid / function name, compilation must reject it.

3.2 export names for C calling Ironwall

Format:

_<uuid>_iwlang_<function_name>_<tag1>

Where:

  • iwlang is fixed and indicates "this symbol is exported by Ironwall"
  • The other fields follow the same rules as the clang naming form

Example:

_4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_i5_roundtrip_bca9013a

3.3 purpose of the tag

The confirmation tag is not a security key and not a permission mechanism. Its purpose is:

  • To prevent hand-written symbols from silently linking when the namespace or function name was typed incorrectly
  • To provide a low-cost consistency check for cross-language boundary names
  • To keep old-style bare symbols from silently mixing into the formal FFI spec

Implementations should generate tags with one consistent hash rule. Users should not calculate tags manually; they should use tooling or existing helpers.

3.4 confirmation-tag algorithm

The confirmation tag is generated with Ironwall's current hashText algorithm.

hashText(input) is defined as 64-bit FNV-1a:

  • Initial value: 14695981039346656037
  • Prime: 1099511628211
  • For each character in input, take its UTF-16 code unit (the integer value returned by charCodeAt)
  • hash = hash xor code_unit
  • hash = (hash * prime) mod 2^64
  • The final output is a 16-digit lowercase hexadecimal string, left-padded with 0 if needed

The hash input for a declared C function using clang is:

<uuid>clang<function_name>

The hash input for an exported Ironwall function using iwlang is:

<uuid>iwlang<function_name>

tag1 is the last 8 hexadecimal digits of hashText(hash_input).

For example:

uuid = 81af42c9d7354eb08bfe95163c04ad20
language = clang
function_name = iw_build_json_add_seven
hash_input = 81af42c9d7354eb08bfe95163c04ad20clangiw_build_json_add_seven
hashText(hash_input) = 6d7038b4c267f2a7
tag1 = c267f2a7

Therefore the full symbol is:

_81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7

Runnable Node.js validation code:

function hashText(input) {
  let hash = 14695981039346656037n;
  const prime = 1099511628211n;
  for (let i = 0; i < input.length; i += 1) {
    hash ^= BigInt(input.charCodeAt(i));
    hash = (hash * prime) & 0xffffffffffffffffn;
  }
  return hash.toString(16).padStart(16, "0");
}

function confirmationTag(uuid, language, functionName) {
  return hashText(`${uuid}${language}${functionName}`).slice(-8);
}

function declaredCFunctionName(uuid, functionName) {
  const language = "clang";
  return `_${uuid}_${language}_${functionName}_${confirmationTag(uuid, language, functionName)}`;
}

function exportedIwFunctionName(uuid, functionName) {
  const language = "iwlang";
  return `_${uuid}_${language}_${functionName}_${confirmationTag(uuid, language, functionName)}`;
}

const declaredUuid = "81af42c9d7354eb08bfe95163c04ad20";
const exportedUuid = "4a8b9c0d1e2f34567890abcdef123456";

console.log(hashText(`${declaredUuid}clangiw_build_json_add_seven`));
console.log(declaredCFunctionName(declaredUuid, "iw_build_json_add_seven"));
console.log(exportedIwFunctionName(exportedUuid, "iw_export_i5_roundtrip"));
console.log(exportedIwFunctionName(exportedUuid, "iw_export_s3_roundtrip"));
console.log(exportedIwFunctionName(exportedUuid, "iw_export_array_i5_roundtrip"));

Expected output:

6d7038b4c267f2a7
_81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7
_4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_i5_roundtrip_bca9013a
_4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_s3_roundtrip_d247d3be
_4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_array_i5_roundtrip_f3f8886c
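
The reverse direction, checking that a hand-written symbol carries the tag its uuid and function name require, can be sketched with the same hashText. The regular expression and the name checkFfiSymbol are assumptions of this example. The sketch relies on the uuid containing no underscore and on the tag being exactly the last 8 lowercase hexadecimal digits.

```javascript
// 64-bit FNV-1a over UTF-16 code units, as defined above.
function hashText(input) {
  let hash = 14695981039346656037n;
  const prime = 1099511628211n;
  for (let i = 0; i < input.length; i += 1) {
    hash ^= BigInt(input.charCodeAt(i));
    hash = (hash * prime) & 0xffffffffffffffffn;
  }
  return hash.toString(16).padStart(16, "0");
}

// _<uuid>_<clang|iwlang>_<function_name>_<tag1>
const FFI_SYMBOL = /^_([A-Za-z0-9]+)_(clang|iwlang)_([A-Za-z_][A-Za-z0-9_]*)_([0-9a-f]{8})$/;

// Splits a full FFI name into its fields and recomputes the confirmation tag.
function checkFfiSymbol(symbol) {
  const m = FFI_SYMBOL.exec(symbol);
  if (!m) return { ok: false, reason: "not a full FFI name" };
  const [, uuid, language, functionName, tag] = m;
  const expected = hashText(`${uuid}${language}${functionName}`).slice(-8);
  return expected === tag
    ? { ok: true, uuid, language, functionName }
    : { ok: false, reason: `tag mismatch: expected ${expected}, found ${tag}` };
}
```

A bare name such as iw_build_json_add_seven fails the shape check, and a full name whose uuid or function name was mistyped fails the tag comparison.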

4. Ironwall Calling C

4.1 Ironwall declaration syntax

Ironwall declares C functions through declare:

(declare
  (function _81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7
    ([value i5])
    to i5))

It is then used like an ordinary function:

{program app@main
  (declare
    (function _81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7
      ([value i5])
      to i5))

  (function main ([args <array s3>]) to i5 in
    (_81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7 $35^i5))
}

4.2 C-side function signature

The current C ABI for declare is the iw_value_t ABI. A C function must directly receive and return iw_value_t:

#include <stdint.h>

typedef intptr_t iw_value_t;

static inline int64_t iw_as_i64(iw_value_t value) {
    return ((int64_t)value) >> 1;
}

static inline iw_value_t iw_from_i64(int64_t value) {
    return (iw_value_t)(intptr_t)((((uint64_t)value) << 1) | 1ULL);
}

iw_value_t _81af42c9d7354eb08bfe95163c04ad20_clang_iw_build_json_add_seven_c267f2a7(iw_value_t value) {
    int32_t raw = (int32_t)iw_as_i64(value);
    uint32_t wrapped = ((uint32_t)raw) + 7u;
    return iw_from_i64((int64_t)(int32_t)wrapped);
}

Note: iw_as_i64 / iw_from_i64 on iw_value_t describe the carrying format of the tagged immediate, not the semantic width of every integer type. The semantic width of i5 is 32-bit. If a declared C function wants to use an i5 as a native number, it must first explicitly convert it to int32_t on the C side before doing arithmetic.
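
The distinction between the carrying format and the semantic width can be modeled outside C. The following sketch mirrors the C helpers above with BigInt arithmetic. It is only a model for reasoning about the encoding, not part of the runtime, and the names iwFromI64, iwAsI64, and addSevenI5 are hypothetical.

```javascript
// Models iw_from_i64: shift left by one and set the low tag bit, in a 64-bit word.
function iwFromI64(value) {
  return BigInt.asIntN(64, (value << 1n) | 1n);
}

// Models iw_as_i64: BigInt >> is an arithmetic (sign-preserving) shift.
function iwAsI64(word) {
  return word >> 1n;
}

// i5 arithmetic: narrow to 32 bits, add, and wrap to 32 bits again,
// matching the int32_t / uint32_t steps of the C function above.
function addSevenI5(word) {
  const raw = BigInt.asIntN(32, iwAsI64(word));
  return iwFromI64(BigInt.asIntN(32, raw + 7n));
}
```

In this model the i5 value 35 is carried as the word 71, and adding seven to the largest i5 value wraps around within 32 bits instead of growing into the wider carrier.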

4.3 unit return values

In the C ABI, Ironwall unit is still represented as iw_value_t. The C side should return iw_from_i64(0):

(declare
  (function _5e8f0a4c71d24b6fa39ce2158bd7f043_clang_iw_sys_fd_close_a14b05cf
    ([fd i5])
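    to unit))

#include <unistd.h>  /* close() */

/* iw_value_t, iw_as_i64, and iw_from_i64 are the definitions shown in 4.2. */
iw_value_t _5e8f0a4c71d24b6fa39ce2158bd7f043_clang_iw_sys_fd_close_a14b05cf(iw_value_t raw_fd) {
    int fd = (int)iw_as_i64(raw_fd);
    close(fd);              /* the result of close() is ignored in this example */
    return iw_from_i64(0);  /* unit */
}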

4.4 building heap values on the C side

When a declared C function needs to return s3, <array i5>, or <array s3>, the backend provides helpers when needed:

static inline iw_value_t make_iw_s3(const char *data);
static inline iw_value_t make_iw_array_i5(int64_t length);
static inline iw_value_t make_iw_array_s3(int64_t length);

static inline int32_t _iw_array_i5_get(iw_value_t raw_value, int64_t index);
static inline void _iw_array_i5_set(iw_value_t raw_value, int64_t index, int32_t element_value);
static inline int64_t _iw_array_i5_length(iw_value_t raw_value);

static inline iw_value_t _iw_array_s3_get(iw_value_t raw_value, int64_t index);
static inline void _iw_array_s3_set(iw_value_t raw_value, int64_t index, iw_value_t element_value);
static inline int64_t _iw_array_s3_length(iw_value_t raw_value);

Example:

{program app@main
  (declare
    (function _9a4c2e1f6b7d8c0a1234567890abcdef_clang_iw_ffi_make_array_i5_dfb65f00
      ()
      to <array i5>))

  (function main ([args <array s3>]) to i5 in
    (array_get (_9a4c2e1f6b7d8c0a1234567890abcdef_clang_iw_ffi_make_array_i5_dfb65f00) $0^i5))
}

iw_value_t _9a4c2e1f6b7d8c0a1234567890abcdef_clang_iw_ffi_make_array_i5_dfb65f00(void) {
    iw_value_t value = make_iw_array_i5(3);
    _iw_array_i5_set(value, 0, 7);
    _iw_array_i5_set(value, 1, 11);
    _iw_array_i5_set(value, 2, 13);
    return value;
}

4.5 portable declared-C types

The declare ... clang ... direction may currently use any value type that is allowed in ordinary Ironwall function types, but the formal, portable, and recommended set is:

  • unit
  • bool
  • i5
  • i6
  • i7
  • u5
  • u6
  • u7
  • f5
  • f6
  • f7
  • c3
  • c4
  • c5
  • s3
  • s4
  • s5
  • z5
  • z6
  • z7
  • <array i5>
  • <array s3>

Where:

  • All values cross the declared-C ABI as iw_value_t
  • Integer, unsigned, bool, and unit are immediate iw_value_t
  • Float, text, complex, and array values are heap/reference iw_value_t
  • <array i5> and <array s3> have stable helpers
  • Other heap types may exist internally as iw_value_t, but should not be used as public C FFI API types

The following are not recommended across the declared-C boundary:

  • Class instances
  • Closures
  • Unions
  • Nested arrays
  • Generic class instances other than <array i5> / <array s3>

These types would tie C to Ironwall's internal layout, GC metadata, and runtime tags, which is poor for both safety and compatibility.

5. C Calling Ironwall

5.1 export mode

If an Ironwall function is to be exported to C, its function name must use the iwlang naming rule:

{program app@main
  (function _4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_i5_roundtrip_bca9013a
    ([value i5])
    to i5
    in
    (add value $1^i5))
}

The backend will generate a C wrapper for that function.

5.2 C-visible signatures

When C calls Ironwall, it does not use the declared-C iw_value_t ABI directly. Instead it uses a host-friendly ABI:

Ironwall type    C parameter type       C return type
i5               int32_t                int32_t
s3               const char *           char *
<array i5>       iw_host_array_i5_t     iw_host_array_i5_t
<array s3>       iw_host_array_s3_t     iw_host_array_s3_t

At present, only the types in the table above are supported for C calling Ironwall. Other types must not be used as parameters or return values of exported Ironwall functions.

Host array ABI:

typedef struct iw_host_array_i5_t {
    int64_t length;
    int32_t *items;
} iw_host_array_i5_t;

typedef struct iw_host_array_s3_t {
    int64_t length;
    char **items;
} iw_host_array_s3_t;

5.3 memory ownership

Exported wrappers perform copying at the C/Ironwall boundary.

Rules:

  • When C passes const char *, the wrapper copies it into Ironwall s3
  • When C passes an array struct, the wrapper copies the array contents
  • When Ironwall returns s3, the wrapper allocates a C char *
  • When Ironwall returns <array i5> or <array s3>, the wrapper allocates the items field inside the C array struct
  • The C caller must free heap memory returned by the wrapper

The generated header/runtime provides free helpers:

static inline void iw_host_free_s3(char *value);
static inline void iw_host_free_array_i5(iw_host_array_i5_t value);
static inline void iw_host_free_array_s3(iw_host_array_s3_t value);

5.4 example of C calling Ironwall

Ironwall:

{program app@main
  (function _4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_s3_roundtrip_d247d3be
    ([value s3])
    to s3
    in
    value)

  (function _4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_array_i5_roundtrip_f3f8886c
    ([values <array i5>])
    to <array i5>
    in
    values)
}

C:

#include "ironwall-generated-ffi.h"
#include <stdio.h>

int main(void) {
    __iw_c_init_runtime();

    char *text = _4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_s3_roundtrip_d247d3be("hello");
    puts(text);
    iw_host_free_s3(text);

    int32_t storage[3] = { 1, 2, 3 };
    iw_host_array_i5_t input = { 3, storage };
    iw_host_array_i5_t output =
        _4a8b9c0d1e2f34567890abcdef123456_iwlang_iw_export_array_i5_roundtrip_f3f8886c(input);

    for (int64_t index = 0; index < output.length; index += 1) {
        printf("%d\n", (int)output.items[index]);
    }
    iw_host_free_array_i5(output);
    return 0;
}

6. GC and Safety Requirements

C FFI must obey the following rules:

  • C must not retain Ironwall heap pointers as long-lived state unless the spec explicitly allows it
  • C must not manually free Ironwall heap objects
  • C must not forge iw_value_t heap references
  • C must not modify the Ironwall heap header, runtime type tag, GC tag, or metadata table
  • If C needs to construct Ironwall heap values, it must use runtime/helper functions
  • If C needs to return unit, it must return iw_from_i64(0)
  • If the C side takes ownership of char * / host arrays returned by an exported wrapper, it must release them using the corresponding iw_host_free_* helper

With respect to GC:

  • FFI does not change Ironwall's explicit-GC position
  • The C side should not manipulate GC directly unless it understands the thread-attach/root rules
  • Exported Ironwall wrappers handle current-thread attach and necessary roots; a C caller should not bypass the wrapper and directly call an internal lowered function

7. Discouraged Patterns

The following patterns do not fit Ironwall's safety position:

  • Writing large amounts of business logic in C while treating Ironwall only as glue
  • Using C to directly manipulate internal layouts of Ironwall class / closure / union values
  • Passing raw pointers, integerized addresses, or unmarked ownership across FFI
  • Treating C global state as shared mutable state outside the Ironwall type system
  • Depending on undocumented runtime struct layouts
  • Using FFI to bypass semantic rules around unit, array bounds, GC roots, or module identity

If a feature can be written in Ironwall, it should be written in Ironwall first. FFI should serve only as a narrow bridge across an unsafe external world.

Ironwall Base-Lib Specification

This document defines how Ironwall's builtin standard library is loaded, how its packages are structured, and where its public API boundary lies.

1. Overall Principles

  • Base-lib source units are not special syntax units, and they are not injected fragments in the static-check or code-generation stages
  • The base lib must fully obey the package-system specification: canonical file names, canonical program headers, ordinary import, ordinary export

2. Loading Model

  • Standard-library source units and user source units go through the same unit-id validation, package export, and static-check pipeline
  • No package rule, reserved-name rule, or overload rule may be skipped merely because a unit comes from the base lib

3. Package Split

Builtin standard packages:

  • std~box
  • std~option
  • std~array
  • std~list
  • std~set
  • std~dict
  • std~pair
  • std~eq
  • std~ord
  • std~hash
  • std~io
  • std~linux~sys
  • std~windows~sys
  • std~math
  • std~string

There is no requirement that a single aggregate package std exist. If a user wants to use names from the standard library, they must explicitly import the corresponding std~... package just as they would import any other package.

4. Support-Type Packages

4.1 std~box

std~box provides the smallest generic single-value wrapper:

  • <Box T>
  • <box_unwrap T>

Box<T> is represented as an ordinary generic class with one value property and exposes the inner value through an unwrap method.

4.2 std~option

std~option provides the smallest generic maybe-value wrapper:

  • <Option T>
  • <option_some T>
  • <option_none T>
  • <option_is_some T>
  • <option_is_none T>
  • <option_unwrap T>

Option<T> is represented as an ordinary generic class containing one <union unit T> payload.

  • is_some / is_none perform an explicit union test on the payload
  • unwrap returns the inner value in the Some branch and performs an explicit runtime abort in the None branch
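
The payload test and the abort-on-None behavior can be modelled in a few lines. The following is a Python sketch of the semantics only, not Ironwall source. The UNIT sentinel and the RuntimeAbort exception are assumptions of this model, standing in for the unit arm of the union and for the runtime abort.

```python
class RuntimeAbort(Exception):
    """Stand-in for Ironwall's explicit runtime abort (assumption of this model)."""


UNIT = object()  # stand-in for the unit arm of <union unit T>


class Option:
    """Models Option<T>: one payload slot holding either UNIT or a T value."""

    def __init__(self, payload):
        self.payload = payload


def option_some(value):
    return Option(value)


def option_none():
    return Option(UNIT)


def option_is_some(option):
    # Explicit union test: is the payload in the T arm?
    return option.payload is not UNIT


def option_is_none(option):
    return option.payload is UNIT


def option_unwrap(option):
    # Some branch returns the inner value; None branch aborts.
    if option.payload is UNIT:
        raise RuntimeAbort("option_unwrap called on None")
    return option.payload
```

In this model, option_unwrap(option_some(3)) returns 3, while option_unwrap(option_none()) raises RuntimeAbort.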

4.3 std~array

std~array provides the first nominal wrapper layer around builtin <array T>:

  • <Array T>
  • <ArrayBuilder T>
  • <array_new_fill T>
  • <array_builder_new T>
  • <array_wrap T>
  • <Array_len T>
  • <ArrayBuilder_len T>
  • <Array_contains T>
  • <Array_filter_into T>
  • <Array_concat T>
  • <Array_concat_into T>
  • <Array_sorted T>
  • <Array_reversed T>
  • <Array_max T>
  • <Array_min T>

Array<T> publicly provides the following methods:

  • get
  • set
  • fill
  • copy
  • count
  • index
  • reverse
  • sort

ArrayBuilder<T> publicly provides the following methods:

  • append
  • build

Where:

  • count / index / Array_contains depend on an explicit Eq<T> support object
  • sort / Array_sorted / Array_max / Array_min depend on an explicit Ord<T> support object
  • Array_filter_into appends matching values into a caller-managed ArrayBuilder<T> in a single pass
  • Array_concat_into appends all items from an Array<T> into a caller-managed ArrayBuilder<T>
  • Array_reversed returns a new Array<T> snapshot, not an iterator/view
  • index performs an explicit runtime abort if the element is not found
  • copy / Array_concat must stably allocate the result array in generic cases
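
The dependence on an explicit Eq<T> object and the caller-managed builder can be illustrated with a small Python model. This is a sketch under assumptions: the free-function names, their argument order, and the RuntimeAbort exception are hypothetical and are not taken from the std~array source.

```python
class RuntimeAbort(Exception):
    """Stand-in for Ironwall's explicit runtime abort (assumption of this model)."""


class Eq:
    """Models Eq<T>: wraps one comparator and exposes it through equals."""

    def __init__(self, comparator):
        self.comparator = comparator

    def equals(self, left, right):
        return self.comparator(left, right)


class ArrayBuilder:
    """Models ArrayBuilder<T>: accumulates values, then builds a snapshot."""

    def __init__(self):
        self._items = []

    def append(self, value):
        self._items.append(value)

    def build(self):
        return list(self._items)  # a fresh copy, independent of the builder


def array_count(items, value, eq):
    # Equality is never implicit; every comparison goes through eq.
    return sum(1 for item in items if eq.equals(item, value))


def array_index(items, value, eq):
    for position, item in enumerate(items):
        if eq.equals(item, value):
            return position
    raise RuntimeAbort("index: element not found")


def array_filter_into(items, predicate, builder):
    # Single pass over the source; matches go into the caller's builder.
    for item in items:
        if predicate(item):
            builder.append(item)
```

Because the builder belongs to the caller, several filter passes can feed one builder before a single build call produces the final array.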

4.4 std~list

std~list provides the first nominal wrapper layer around a dynamic-length ordered container:

  • <List T>
  • <list_new T>
  • <list_len T>
  • <list_contains T>
  • <list_concat T>
  • <list_repeat T>
  • <list_pop T>
  • <list_sorted T>
  • <list_reversed T>
  • <list_max T>
  • <list_min T>

List<T> publicly provides the following methods:

  • get
  • set
  • append
  • insert
  • remove
  • pop
  • clear
  • copy
  • count
  • index
  • reverse

Where:

  • List<T> is represented as an ordinary generic class containing three properties: items, seed, and length; items uses a recursive <union unit <ListNode T>> chain rather than a dynamic vector buffer
  • get / set provide random-access behavior
  • append / insert / remove / pop / reverse rebuild the recursive node chain; clear directly resets items back to unit
  • count / index / remove / list_contains depend on an explicit Eq<T> support object
  • insert accepts only indices in 0..len; get / set / pop accept only indices in 0..len-1; out-of-range access performs an explicit runtime abort
  • The List.pop method form takes an explicit index; the top-level list_pop(list) helper provides the "pop last element" form
  • list_sorted / list_max / list_min depend on an explicit Ord<T> support object; list_reversed returns a new List<T> snapshot
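
The node-chain representation and the index rules can be modelled as follows. This Python sketch is an illustration only: None stands in for unit, the seed property is omitted because its role is not described here, and RuntimeAbort is a hypothetical stand-in for the runtime abort.

```python
class RuntimeAbort(Exception):
    """Stand-in for Ironwall's explicit runtime abort (assumption of this model)."""


class ListNode:
    def __init__(self, value, rest):
        self.value = value
        self.rest = rest  # another ListNode, or None standing in for unit


def _insert(node, index, value):
    # Rebuilds the chain prefix up to the insertion point.
    if index == 0:
        return ListNode(value, node)
    return ListNode(node.value, _insert(node.rest, index - 1, value))


def _remove_at(node, index):
    # Rebuilds the chain prefix up to the removed node.
    if index == 0:
        return node.rest
    return ListNode(node.value, _remove_at(node.rest, index - 1))


class List:
    def __init__(self):
        self.items = None
        self.length = 0

    def _check(self, index, upper):
        if index < 0 or index > upper:
            raise RuntimeAbort("index out of range")

    def get(self, index):
        self._check(index, self.length - 1)  # valid range 0..len-1
        node = self.items
        for _ in range(index):
            node = node.rest
        return node.value

    def insert(self, index, value):
        self._check(index, self.length)  # valid range 0..len
        self.items = _insert(self.items, index, value)
        self.length += 1

    def append(self, value):
        self.insert(self.length, value)

    def pop(self, index):
        value = self.get(index)  # get enforces 0..len-1
        self.items = _remove_at(self.items, index)
        self.length -= 1
        return value


def list_pop(target):
    # Top-level helper: the "pop last element" form.
    return target.pop(target.length - 1)
```

In this model, list_pop on an empty list calls pop(-1), which fails the range check and aborts, consistent with the out-of-range rule above.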

4.5 std~set

std~set provides a mutable set wrapper represented as a recursive node chain:

  • <Set T>
  • <set_new T>
  • <set_len T>
  • <set_contains T>
  • <set_union T>
  • <set_intersection T>
  • <set_difference T>
  • <set_symmetric_difference T>

Set<T> publicly provides the following methods:

  • add
  • remove
  • discard
  • pop
  • clear
  • copy
  • union
  • intersection
  • difference
  • symmetric_difference
  • update
  • intersection_update
  • difference_update
  • symmetric_difference_update
  • isdisjoint
  • issubset
  • issuperset

Where:

  • Set<T> is represented as an ordinary generic class containing hash_rule, eq_rule, and a recursive items chain; it is not a dynamic array and does not use open addressing
  • Every node stores value, its value_hash, and next, so membership first compares the stored hash and calls Eq<T> only when the hashes match
  • add / remove / discard / pop / clear and the four *_update methods mutate in place; copy and union / intersection / difference / symmetric_difference return new Set<T> values while preserving the recursive-node representation
  • remove performs an explicit runtime abort when the element is not found; discard does nothing when it is not found; pop also performs a runtime abort on an empty set
  • isdisjoint / issubset / issuperset and set operations depend on per-element membership checks against another Set<T>
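
The hash-first membership test and the remove / discard difference can be sketched in Python. This is a model under assumptions, not the std~set source: hash_rule and eq_rule are plain callables here, None stands in for unit, and RuntimeAbort is a hypothetical stand-in for the runtime abort.

```python
class RuntimeAbort(Exception):
    """Stand-in for Ironwall's explicit runtime abort (assumption of this model)."""


class SetNode:
    def __init__(self, value, value_hash, next_node):
        self.value = value
        self.value_hash = value_hash
        self.next_node = next_node  # another SetNode, or None for unit


class Set:
    def __init__(self, hash_rule, eq_rule):
        self.hash_rule = hash_rule  # callable: T -> int
        self.eq_rule = eq_rule      # callable: (T, T) -> bool
        self.items = None

    def _matches(self, node, value, wanted_hash):
        # Cheap hash comparison first; eq_rule runs only on a hash match.
        return node.value_hash == wanted_hash and self.eq_rule(node.value, value)

    def contains(self, value):
        wanted_hash = self.hash_rule(value)
        node = self.items
        while node is not None:
            if self._matches(node, value, wanted_hash):
                return True
            node = node.next_node
        return False

    def add(self, value):
        if not self.contains(value):
            self.items = SetNode(value, self.hash_rule(value), self.items)

    def _drop(self, value):
        wanted_hash = self.hash_rule(value)
        previous, node = None, self.items
        while node is not None:
            if self._matches(node, value, wanted_hash):
                if previous is None:
                    self.items = node.next_node
                else:
                    previous.next_node = node.next_node
                return True
            previous, node = node, node.next_node
        return False

    def discard(self, value):
        self._drop(value)  # silently does nothing when absent

    def remove(self, value):
        if not self._drop(value):
            raise RuntimeAbort("remove: element not found")
```

With a deliberately weak hash such as v % 4, the values 1 and 5 share a hash, so a lookup for 5 calls eq_rule against 1; values with different hashes are skipped without an equality call.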

4.6 std~pair

std~pair provides the smallest generic two-tuple nominal wrapper:

  • <Pair K V>
  • <pair_first K V>
  • <pair_second K V>

Pair<K, V> is represented as an ordinary generic class with two properties, first and second.

4.7 std~eq

std~eq provides an explicit equality support object:

  • <Eq T>
  • <eq_apply T>

Eq<T> wraps one comparator of type <to bool from T T> and exposes calls through the equals method.

4.8 std~ord

std~ord provides an explicit ordering support object:

  • <Ord T>
  • <ord_compare T>

Ord<T> wraps one comparator of type <to i5 from T T>. Its compare method follows the negative / zero / positive return convention.

4.9 std~hash

std~hash provides an explicit hashing support object:

  • <Hash T>
  • <hash_apply T>

Hash<T> wraps one hasher of type <to i5 from T> and exposes calls through the hash method.
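
The three support objects in 4.7 to 4.9 share one shape: a wrapper around a single function, exposed through one method. A Python model of that shape follows. It is a sketch only; sorted_with and contains_with are hypothetical helpers added to show how a container operation would consume the objects.

```python
import functools


class Eq:
    """Models Eq<T>: comparator of shape (T, T) -> bool."""

    def __init__(self, comparator):
        self.comparator = comparator

    def equals(self, left, right):
        return self.comparator(left, right)


class Ord:
    """Models Ord<T>: comparator returning negative / zero / positive."""

    def __init__(self, comparator):
        self.comparator = comparator

    def compare(self, left, right):
        return self.comparator(left, right)


class Hash:
    """Models Hash<T>: hasher of shape T -> int."""

    def __init__(self, hasher):
        self.hasher = hasher

    def hash(self, value):
        return self.hasher(value)


def sorted_with(values, ord_rule):
    # Ordering comes only from the explicit Ord object.
    return sorted(values, key=functools.cmp_to_key(ord_rule.compare))


def contains_with(values, wanted, eq_rule):
    # Equality comes only from the explicit Eq object.
    return any(eq_rule.equals(value, wanted) for value in values)
```

Passing the rule explicitly means the same element type can be ordered or compared in different ways at different call sites, for example by length in one place and case-insensitively in another.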

5. std~io

std~io provides text output and flush APIs.

5.1 output overloads

The following names are ordinary top-level overloads, not builtins:

  • print : s3|s4|s5 -> unit
  • println : s3|s4|s5 -> unit
  • print_stderr : s3|s4|s5 -> unit
  • println_stderr : s3|s4|s5 -> unit

5.2 flush

  • flush : () -> unit
  • flusherr : () -> unit

5.3 Platform System Packages

System-boundary standard packages are split explicitly by target platform: std~linux~sys and std~windows~sys. Both are thin host-wrapper packages that abort directly on host-call failure instead of returning errno/result objects.

5.3.1 Linux (std~linux~sys)

The public std~linux~sys surface is grouped into three slices:

  • policy-aligned process / env / argv / time wrappers: sys_platform_name, sys_process_argc, sys_process_argv_s3, sys_env_get_s3, sys_env_set_s3, sys_env_unset_s3, sys_process_getpid, sys_process_spawn_s3, sys_process_spawn_stdio_s3, sys_process_wait, sys_process_kill, sys_process_id, sys_process_close, sys_process_exit, sys_process_exit_group, sys_time_unix_ms, sys_time_monotonic_ms, sys_sleep_ms
  • file / path / dir / stdio wrappers: sys_file_*, sys_fd_*, sys_path_*, sys_dir_open_s3, sys_dir_read_s3, sys_dir_close, sys_stdin_handle, sys_stdout_handle, sys_stderr_handle, sys_pipe_create, plus SysFileStat and sys_stat_* accessors
  • Linux-specific fd / network / readiness / signal primitives: sys_fd_readv_s3, sys_fd_writev_s3, sys_fd_sendfile, sys_fd_dup*, sys_fd_fcntl_*, sys_net_*, sys_epoll_*, sys_eventfd_*, sys_timerfd_*, sys_signalfd_*, sys_poll, sys_ppoll, sys_signal_*, sys_thread_gettid, sys_thread_yield

Where:

  • SysProcess is a pid-centric nominal wrapper. Callers should still pair it with sys_process_close() on Linux so cross-platform code keeps one symmetric lifecycle, even though the current Linux implementation is a no-op.
  • sys_file_* is the higher-level policy alias layer; sys_fd_* remains available for code that explicitly wants fd / offset / flag oriented operations.
  • sys_fd_pipe2() and sys_pipe_create() both return a length-2 <array i5> with index 0 as the read end and index 1 as the write end.
  • sys_fd_openat_* uses an explicit dir fd + relative child path model rather than an implicit AT_FDCWD helper.
  • sys_fd_fstat() / sys_file_stat() / sys_path_stat_s3() all produce nominal SysFileStat values. On Linux this structure preserves device/inode/mode/link_count/uid/gid/rdevice/size/block_size/block_count/atime_sec/mtime_sec/ctime_sec; common file-type checks should prefer sys_stat_is_regular / sys_stat_is_dir.
  • std~linux~sys does not export sys_process_fork, sys_process_execve_s3, sys_process_wait4, or sys_thread_tgkill as public wrappers. The Linux runtime may still use lower-level host primitives internally to implement spawn / wait behavior.
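
The pipe-end ordering described above (index 0 is the read end, index 1 is the write end) matches the host convention. The following Python sketch demonstrates that convention with the host's own os.pipe call. It does not use the Ironwall wrappers, and pipe_round_trip is a hypothetical helper name.

```python
import os


def pipe_round_trip(payload):
    """Write a small payload into a fresh pipe and read it back."""
    ends = os.pipe()                 # pair of descriptors: (read end, write end)
    read_end, write_end = ends[0], ends[1]
    os.write(write_end, payload)     # data goes in at index 1
    os.close(write_end)              # closing the write end lets the reader reach EOF
    received = os.read(read_end, len(payload))  # data comes out at index 0
    os.close(read_end)
    return received
```

Both ends are closed explicitly, mirroring the symmetric open / close lifecycle that the wrappers in this section expect from callers.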

5.3.2 Windows (std~windows~sys)

std~windows~sys follows the same cross-platform policy slice and adds the Windows-side handle, event, wait, and TCP socket wrappers:

  • platform / env / argv / process / time wrappers: sys_platform_name, sys_process_argc, sys_process_argv_s3, sys_env_get_s3, sys_env_set_s3, sys_env_unset_s3, sys_process_getpid, sys_process_spawn_s3, sys_process_spawn_stdio_s3, sys_process_wait, sys_process_kill, sys_process_id, sys_process_close, sys_process_exit, sys_process_abort, sys_time_unix_ms, sys_time_monotonic_ms, sys_sleep_ms
  • file / path / dir / stdio wrappers: sys_file_*, sys_fd_*, sys_path_*, sys_dir_open_s3, sys_dir_read_s3, sys_dir_close, sys_stdin_handle, sys_stdout_handle, sys_stderr_handle, sys_pipe_create, SysFileStat, sys_stat_size, sys_stat_is_regular, sys_stat_is_dir
  • Windows network / event / wait wrappers: sys_net_startup, sys_net_cleanup, sys_net_*, sys_event_create_manual, sys_event_create_auto, sys_event_set, sys_event_reset, sys_event_close, sys_wait_one, sys_wait_many, sys_wait_timeout_code
  • thread identity wrapper: sys_thread_gettid

Where:

  • Windows shares the same high-level platform/env/path/process/time policy model, but sys_process_close() really closes native handles on Windows, so portable code should not omit it.
  • Windows TCP wrappers follow the Winsock lifecycle, so callers use sys_net_startup() / sys_net_cleanup(); socket close goes through sys_net_close(), while ordinary handle/event close goes through sys_event_close() or the internal handle-close wrapper.
  • std~windows~sys does not expose Linux-only raw primitives such as fork / execve / wait4 / tgkill, and it does not expose Linux-specific epoll / eventfd / timerfd / signalfd / poll surfaces.
  • Portable code should target the shared policy slice first, and depend on std~linux~sys explicitly only when Linux readiness / signal primitives are actually required.

6. std~math

std~math provides floating-point, complex-number, and scalar-conversion APIs.

6.1 constants

Use explicit typed variants rather than overloading on return type:

  • pi_f5, pi_f6, pi_f7
  • tau_f5, tau_f6, tau_f7

6.2 floating-point API

The following names form ordinary overloads on f5 / f6 / f7:

  • abs
  • round
  • floor
  • ceil
  • trunc
  • sin
  • cos
  • sqrt
  • hypot
  • atan2

Where:

  • abs / sin / cos / sqrt / hypot / atan2 return floating-point values of the same type
  • round / floor / ceil / trunc return i5

6.3 complex-number API

The following names form ordinary overloads on z5 / z6 / z7:

  • znew
  • zrect
  • zreal
  • zimg
  • zadd
  • zsub
  • zmul
  • zabs
  • zarg
  • zconj
  • zproj
  • zexp
  • zlog
  • zsqrt
  • zpow

6.4 scalar-conversion API

std~math divides scalar conversion into two naming families:

  • val_to_i5, val_to_i6, val_to_i7
  • val_to_u5, val_to_u6, val_to_u7
  • val_to_f5, val_to_f6, val_to_f7
  • bin_to_i5, bin_to_i6, bin_to_i7
  • bin_to_u5, bin_to_u6, bin_to_u7
  • bin_to_f5, bin_to_f6, bin_to_f7

Every target family (that is, each val_to_* / bin_to_* name listed above) must provide ordinary overloads for the following source types:

  • i5, i6, i7
  • u5, u6, u7
  • f5, f6, f7

In addition:

  • val_to_i5, val_to_u5, bin_to_i5, and bin_to_u5 additionally support c3, c4, c5

The semantic split is:

  • val_to_* follows numeric semantics and tries to preserve the numeric value as much as possible
  • Float-to-integer conversion first discards the fractional part; if the result exceeds the target integer width, the high bits of the truncated integer are then discarded
  • Integer-to-float conversion uses an ordinary numeric cast and may lose precision
  • bin_to_* uses binary-copy semantics: if the target is narrower than the source, only the low bits of the source representation are retained; if the target is wider, the remaining high bits are zero-filled
  • For c3/c4/c5 -> i5/u5, single code-unit / byte semantics still apply; for these source/target pairs, val_to_* and bin_to_* produce the same result

Therefore val_to_i5, val_to_u5, bin_to_i5, and bin_to_u5 each have 12 overloads, while every other target family has 9 overloads. Overload resolution may depend only on name and parameter type, never on return type.
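
The difference between the two families can be seen in a small Python model for integer targets. It is a sketch, not the std~math implementation. Assumptions: the bit width is passed explicitly, and the worked values below treat i5 / u5 / f5 as 32 bits wide and i6 as 64 bits wide, which this section does not state. The helper names are hypothetical, and NaN / infinite inputs are not modelled.

```python
import struct


def wrap_to_int(value, bits, signed):
    """Keep the low `bits` bits of value; reinterpret as signed if requested."""
    masked = value & ((1 << bits) - 1)
    if signed and masked >= 1 << (bits - 1):
        masked -= 1 << bits
    return masked


def val_float_to_int(value, bits, signed):
    # val_to_*: discard the fractional part first, then drop excess high bits.
    return wrap_to_int(int(value), bits, signed)


def f32_pattern(value):
    # Raw bit pattern of a 32-bit float, as an unsigned integer.
    return int.from_bytes(struct.pack("<f", value), "little")


def int_pattern(value, bits):
    # Raw two's-complement bit pattern of an integer of the given width.
    return value & ((1 << bits) - 1)


def bin_to_int(pattern, bits, signed):
    # bin_to_*: copy bits. A narrower target keeps the low bits; a wider
    # target sees zeros above the source pattern (no sign extension).
    return wrap_to_int(pattern, bits, signed)
```

Under these assumptions, val_float_to_int(-3.9, 32, True) gives -3, while bin_to_int(f32_pattern(1.0), 32, True) gives 1065353216, the integer reading of the bit pattern 0x3F800000. Widening a 32-bit -1 with bin_to_int(int_pattern(-1, 32), 64, True) gives 4294967295 because the high bits are zero-filled.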

7. std~string

std~string provides the first nominal wrapper layer around the text primitive families:

  • StringS3
  • StringS4
  • StringS5
  • StringBuilderS3
  • StringBuilderS4
  • StringBuilderS5
  • string_len
  • string_builder_new
  • string_builder_len
  • string_contains
  • string_concat
  • string_repeat
  • string_reversed

It publicly provides the following query methods:

  • find
  • count
  • startswith
  • endswith

It publicly provides the following builder methods:

  • append
  • build

Where:

  • StringS3 / StringS4 / StringS5 wrap s3 / s4 / s5 respectively
  • StringBuilderS3 / StringBuilderS4 / StringBuilderS5 accumulate string chunks and materialize the final string on build
  • string_builder_new derives the empty-string seed from the provided wrapper value so an empty builder can still build a correctly typed empty string
  • string_builder_len reports the total character length accumulated in the builder
  • find / count / startswith / endswith / string_contains are implemented through per-character c3/c4/c5 comparison and do not rely on whole-sN equality builtins
  • string_concat / string_repeat / string_reversed use explicit reconstruction through sN_new / sN_set / sN_get; string_reversed returns a reversed snapshot string, not an iterator
  • Semantics are defined by a single code-unit / byte text model; there is no Unicode normalization or grapheme-cluster handling
  • find returns -1 when the substring is not found; count follows Python-style len + 1 semantics for an empty needle
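
The query semantics can be modelled with per-character comparison, as follows. This Python sketch is not the std~string source. The function names are hypothetical, and counting non-overlapping matches for a non-empty needle is an assumption borrowed from Python, since only the empty-needle case is specified here.

```python
def matches_at(text, needle, start):
    """Compare needle against text at start, one code unit at a time."""
    if start < 0 or start + len(needle) > len(text):
        return False
    for offset in range(len(needle)):
        if text[start + offset] != needle[offset]:
            return False
    return True


def string_find(text, needle):
    for start in range(len(text) - len(needle) + 1):
        if matches_at(text, needle, start):
            return start
    return -1  # not found


def string_count(text, needle):
    if len(needle) == 0:
        return len(text) + 1  # empty needle: one match per gap
    total = 0
    start = 0
    while start + len(needle) <= len(text):
        if matches_at(text, needle, start):
            total += 1
            start += len(needle)  # assumption: matches do not overlap
        else:
            start += 1
    return total


def string_startswith(text, prefix):
    return matches_at(text, prefix, 0)


def string_endswith(text, suffix):
    return matches_at(text, suffix, len(text) - len(suffix))
```

For example, string_find("banana", "nan") returns 2, string_find("banana", "x") returns -1, and string_count("banana", "") returns 7.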

8. Builtin Boundary

  • std~... packages are ordinary packages, not part of the builtin name set
  • They may wrap runtime helpers exposed through declare, and may wrap language primitives, but the top-level names they expose after wrapping are still ordinary package exports
  • These names are usable only when visible through the current package or an imported package

9. Compatibility Requirements

  • Future standard-library evolution should happen primarily by adding new std~... packages or new ordinary exports to existing std~... packages
  • Synthetic std injection, base-lib AST injection, or special static-check / codegen branches for the base lib should not be introduced