This article provides a comprehensive technical exploration of the ZINC Abstract Machine (ZAM), the execution engine behind Caml Light and the foundation of modern OCaml. Drawing from Xavier Leroy's retrospective and the original Caml Light source code, we examine the complete compilation pipeline from lambda calculus to bytecode, the innovative push-enter evaluation model, and the sophisticated environment representation that enables efficient execution of strict functional languages.
Introduction: The Quest for Efficient Functional Language Implementation
The implementation of strict functional languages presents unique challenges: higher-order functions, currying, partial application, and the need for tail-call optimization. The Caml Light system, developed by Xavier Leroy and Damien Doligez at INRIA in 1990, represented a significant advancement through its ZINC Abstract Machine—a call-by-value variant of Krivine's machine that addressed the inefficiencies of earlier approaches like the CAM (Categorical Abstract Machine).
The fundamental insight, as noted by Pierre Crégut, was recognizing that the ZAM was essentially "Krivine's machine with some extra call-by-value baggage" . This article unpacks that baggage to reveal the engineering decisions that made Caml Light both elegant and efficient.
The Problem: Currying and Multiple-Argument Functions
Traditional abstract machines like the CAM, SECD, and FAM handle curried functions through nested closures:
C[[f a₁...aₙ]] = C[[f]]; C[[a₁]]; APPLY; ...; C[[aₙ]]; APPLY
C[[λⁿb]] = CLOSURE(...(CLOSURE(C[[b]]; RETURN)...); RETURN)
This approach suffers from two major inefficiencies :
n-1 intermediate closures are constructed before the function body executes
n-1 immediate returns occur as each curried layer consumes one argument
For a function of n arguments, we pay O(n²) in closure allocation and indirect jumps.
Stack-based languages (Forth, PostScript) suggest a better approach:
Caller pushes n arguments on the stack
Caller jumps to function code
Function accesses arguments from the stack
Function pops arguments, pushes result, returns
The challenge for functional languages: functions are first-class values, so the callee might receive fewer arguments (partial application) or more arguments (over-application) than expected.
The ZAM Solution: Push-Enter with Arity Checking
The ZAM introduces distinct compilation modes :
C mode (computes to value):
C[[n]] = ACCESS(n)
C[[λa]] = CLOSURE(T[[λa]])
C[[b a₁...aₙ]] = PUSHMARK; C[[aₙ]]; ...; C[[a₁]]; C[[b]]; APPLY
T mode (tail position):
T[[n]] = ACCESS(n); RETURN
T[[λa]] = GRAB; T[[a]]
T[[b a₁...aₙ]] = C[[aₙ]]; ...; C[[a₁]]; C[[b]]; TAILAPPLY
Notice the right-to-left evaluation order for arguments, crucial for correct stack discipline.
The GRAB instruction handles arity mismatches:
| Situation | Action |
|---|---|
Argument available (v :: s) | Pop v into environment, continue |
Marker reached (ε :: (c',e') :: r) | Partial application: create closure and return |
| Multiple arguments | Pop all expected, tail-call result with remainder |
The PUSHMARK instruction places a sentinel ε on the stack, enabling GRAB to detect when arguments are exhausted.
Complete ZAM Instruction Set and Semantics
Based on the Caml Light source code , we present the complete instruction set with formal transition rules.
State = (Code, Env, AStk, RStk)
Code: Instruction sequence
Env: Persistent environment (heap-allocated linked list)
AStk: Argument stack (value stack, including marker ε)
RStk: Return stack (saved Code × Env pairs)
| Instruction | Transition | Description |
|---|---|---|
QUOTE(c) | (c', e, s, r) → (c', e, c::s, r) | Load constant |
GETGLOBAL(q) | (c', e, s, r) → (c', e, global[q]::s, r) | Read global |
SETGLOBAL(q) | (c', e, v::s, r) → (c', e, unit::s, r) | Write global |
| Instruction | Transition | Description |
|---|---|---|
ACCESS(n) | (c', e, s, r) → (c', e, e[n]::s, r) | Environment lookup |
ENVACC(n) | (c', e, s, r) → (c', e, envacc(n,e)::s, r) | Direct environment access |
| Instruction | Transition | Description |
|---|---|---|
PUSH | (c', e, v::s, r) → (c', e, v::v::s, r) | Duplicate accumulator |
PUSHMARK | (c', e, s, r) → (c', e, ε::s, r) | Application boundary mark |
LET | (c', e, v::s, r) → (c', v::e, s, r) | Extend environment |
ENDLET(n) | (c', e, s, r) → (c', drop(n,e), s, r) | Shrink environment |
| Instruction | Transition | Description |
|---|---|---|
APPLY | (c', e, [lbl,e']::s, r) → (lbl, e', s, (c',e)::r) | Save return, jump |
TAILAPPLY | (c', e, [lbl,e']::s, r) → (lbl, e', s, r) | Jump without saving |
GRAB (has arg) | (c', e, v::s, r) → (c', v::e, s, r) | Pop argument to environment |
GRAB (partial) | (c', e, ε::s, (c'',e'')::r) → (c'', e'', [(RESTART::c', e)]::s, r) | Create closure, return to caller |
RETURN (normal) | (c', e, v::ε::(c'',e'')::r, s) → (c'', e'', v::s, r) | Restore caller |
RETURN (closure) | (c', e, [lbl,e']::s, r) → (lbl, e', s, r) | Continue with closure |
CLOSURE(lbl) | (c', e, s, r) → (c', e, [lbl,e]::s, r) | Create closure |
RESTART | (c', [v₁...vₖ,e], s, r) → (c', e, v₁::...::vₖ::s, r) | Restore partial args |
| Instruction | Transition | Description |
|---|---|---|
MAKEBLOCK(tag,n) | (c', e, v₁::...::vₙ::s, r) → (c', e, Block(tag,[v₁...vₙ])::s, r) | Allocate block |
GETFIELD(n) | (c', e, Block(_,fs)::s, r) → (c', e, fs[n]::s, r) | Field access |
SETFIELD(n) | (c', e, v::Block(_,fs)::s, r) → (c', e, unit::s, r) | Field update |
| Instruction | Transition | Description |
|---|---|---|
BRANCH(lbl) | c', e, s, r) → (find_label(lbl, c'), e, s, r) | Skip ahead to label |
BRANCHIF(lbl) | (c', e, 0::s, r) → (c', e, s, r) | Fall through if zero |
BRANCHIF(lbl) | (c', e, v::s, r) → (find_label(lbl, c'), e, s, r) | Jump if non-zero |
BRANCHIFNOT(lbl) | Inverse of BRANCHIF | |
SWITCH(table) | (c', e, v::s, r) → (find_label(table[select(v)], c'), e, s, r) | Jump table |
PUSHTARP(lbl) | (c', e, s, r) → (c', e, s, (lbl,e,sp)::r) | Push exception handler |
POPTRAP | (c', e, s, (lbl,e',sp')::r) → (c', e, s, r) | Pop handler |
Where find_label(lbl, code) scans forward in the instruction list to find Klabel lbl, with semantic:
find_label(lbl, []) = error "label not found"
find_label(lbl, Klabel lbl :: rest) = rest -- found, return code after label
find_label(lbl, _ :: rest) = find_label(lbl, rest) -- skip, continue searching
select(v) = if is_int(v) then int_val(v)
else table_size/2 + tag_val(v)
This allows a single table to handle both constant and constructor tags.
The Mixed Environment Representation
Simple abstract machines represent environments as heap-allocated linked lists, causing:
Inefficiency 1: Many environments are never saved in closures, yet heap-allocated
Inefficiency 2: Linked list access is O(n)
The ZAM uses a mixed representation :
Environment = Volatile Part S (stack) + Persistent Part E (heap)
add(v₁...vₙ, (S,E)) = (v₁...vₙ.S, E)
accessₙ(S,E) = S[n] if n < |S|
= accessₙ₋|ₛ|(E) if n ≥ |S|
close(c, (v₁...vₙ.ε, E)) = ([c, E'], (ε, E'))
where E' = add(v₁...vₙ, E)
In the common case of an n-argument function, fully applied, that doesn't build closures:
Arguments and let-bound variables reside in the volatile stack part
No heap allocation occurs during function execution
Measurements showed a 4× reduction in heap allocations and GC time .
From Lambda to ZAM: The Compilation Pipeline
The intermediate language uses de Bruijn indices for variable binding :
type lambda =
Lvar of int (* de Bruijn index *)
| Lconst of struct_constant
| Lapply of lambda * lambda list (* function + arguments *)
| Lfunction of lambda (* abstraction *)
| Llet of lambda list * lambda (* multi-binding let *)
| Lletrec of (lambda * int) list * lambda (* recursive bindings *)
| Lprim of primitive * lambda list (* primitive operations *)
| Lcond of lambda * (atomic_constant * lambda) list (* constant switch *)
| Lswitch of int * lambda * (constr_tag * lambda) list (* constructor switch *)
| Lstaticfail of int (* escape continuation *)
| Lstatichandle of lambda * lambda (* static exception handler *)
| Lhandle of lambda * lambda (* dynamic exception handler *)
| Lifthenelse of lambda * lambda * lambda
| Lsequence of lambda * lambda
| Lwhile of lambda * lambda
| Lfor of lambda * lambda * bool * lambda
| Lshared of lambda * int ref (* shared code *)
| Levent of event * lambda (* debugging event *)
The core compilation function compExpr : Label -> Lambda -> Code -> CompileM Code transforms Lambda IR to ZAM instructions, taking:
staticFail: Label for static failure (pattern match failure)
Lambda: Expression to compile
Code: Continuation (code that follows)
isReturn :: Code -> Bool
isReturn (KReturn : _) = True
isReturn (KLabel _ : c) = isReturn c
isReturn (KEvent _ : c) = isReturn c
isReturn _ = False
This simple function drives the tail-call optimization: when isReturn code = True, the compiler generates KTermApply instead of KApply, avoiding stack growth.
| Lambda Form | ZAM Code | Notes |
|---|---|---|
LVar n | KAccess n | Direct environment access |
LConst c | KQuote c | With constant-folding optimization |
LApply f args (tail) | args... KPush f KTermApply | No stack growth |
LApply f args (non-tail) | KPushMark args... KPush f KApply | Save return point |
LFunction body (tail) | KGrab body | Direct entry |
LFunction body (non-tail) | KClosure L | Delay compilation to L |
LLet args body | args... KLet body KEndLet(n) | Environment extension |
LIfThenElse c t e | c KBranchIfNot Lt t KBranch Lf KLabel Lt e KLabel Lf | Branch fusion |
LWhile c b | Lloop KCheckSignals c KBranchIfNot Lexit b KBranch Lloop KLabel Lexit | With signal check |
Recursive let rec uses backpatching with dummy values:
-- For: let rec f = λx... and g = λy... in body
KPrim(PDummy sz_f); KLet; -- Create dummy for f
KPrim(PDummy sz_g); KLet; -- Create dummy for g
compile f (KPush; KAccess 1; KPrim PUpdate; -- Patch f
compile g (KPush; KAccess 0; KPrim PUpdate; -- Patch g
body))
KEndLet 2
The PDummy primitive creates uninitialized heap blocks, PUpdate fills them after the recursive definitions are computed.
Advanced Features
The ZAM supports both static (Lstatichandle) and dynamic (Lhandle) exceptions:
Static (pattern match failure):
Compile-time known jump targets
Implemented with KBranch to handler label
No runtime overhead for normal path
Dynamic (try/with):
KPushTrap lbl saves handler address and stack state
KPopTrap removes handler on normal exit
PRaise unwinds to nearest handler, restoring saved state
Pattern matching compiles to decision trees for efficiency:
-- Integer/character switches use interval trees
compileNBranch :: [(AtomicConstant, Lambda)] -> DecisionTree
-- Result: balanced binary tree of KBranchInterval tests
-- O(log n) comparisons instead of O(n) sequential tests
For dense switches, a KSwitch jump table provides O(1) dispatch.
Comparisons fuse with branches to avoid explicit boolean values:
-- e1 < e2 && e3
-- Becomes: e1 e2 KTest(PTlt, Lfalse) e3 ...
-- Rather than: e1 e2 KPrim(PIntComp PTlt) KPrim PNot KBranchIfNot Lfalse ...
-- Similarly for ||, not, etc.
The ZAM2 Evolution: Lessons for Objective Caml
The ZAM2 , used in Objective Caml's bytecode interpreter, introduced further refinements:
Merged argument and return stacks, with return addresses pushed before arguments:
PUSHRETADDR(k); args... TAILAPPLY
Eliminated dynamic tests at variable access by compile-time determination of stack vs. heap variable placement:
STACKACCESS(n) -- for n < threshold
ENVACCESS(n) -- for n ≥ threshold
Closures contain only actually free variables, not entire environments, preventing memory leaks.
Push-Enter vs. Eval-Apply: A Retrospective
The ZAM embodies the push-enter model, contrasting with eval-apply :
| Aspect | Push-Enter (ZAM) | Eval-Apply (CAM/GHC) |
|---|---|---|
| Arity handling | Callee checks argument count | Caller ensures correct count |
| Partial application | Callee creates closure | Caller generates papp combinator |
| Over-application | Callee tail-calls result | Caller generates apply combinator |
| Known calls | Still indirect | Can be direct (major win) |
| Register usage | Constrained by abstract machine | Flexible calling conventions |
Surprisingly, eval-apply wins for optimizing native-code compilers :
80% of calls are to known functions
Direct calls are 5-20× faster than indirect
Register-based calling conventions beat stack discipline
The ZAM's push-enter design was optimal for bytecode interpretation but not for native code generation.
Conclusion
The ZINC Abstract Machine represents a pivotal point in functional language implementation:
Theoretical elegance: A call-by-value Krivine machine with minimal extensions
Engineering pragmatism: Mixed environments, arity checking, and tail-call optimization
Educational value: Exposed implementation concepts implicit in λ-calculus
Historical bridge: From abstract machines to modern optimizing compilers
As Xavier Leroy noted , "the journey could have been shorter by not going through abstract machines at all. But maybe it would never have taken place without the 'detour' through abstract machines."
The Caml Light implementation, with its ~200KB runtime and 2MB self-hosting memory footprint, proved that efficient functional language implementation was possible on modest hardware—and laid the groundwork for the industrial-strength OCaml we use today.
: Xavier Leroy, "From Krivine's machine to the Caml implementations," INRIA Rocquencourt.
: lambda.ml — Caml Light Lambda IR definition (Module Lambda).
: instruct.ml — Caml Light ZAM instruction definition (Module Instruct).
: back.ml — Caml Light Lambda-to-ZAM compiler (Module Back).
: Xavier Leroy, "The ZINC experiment: An economical implementation of the ML language," Technical Report 117, INRIA Rocquencourt, 1990.