From Krivine's Machine to Caml Light: A Deep Dive into the ZINC Abstract Machine

Abstract

This article provides a comprehensive technical exploration of the ZINC Abstract Machine (ZAM), the execution engine behind Caml Light and the foundation of modern OCaml. Drawing from Xavier Leroy's retrospective and the original Caml Light source code, we examine the complete compilation pipeline from lambda calculus to bytecode, the innovative push-enter evaluation model, and the sophisticated environment representation that enables efficient execution of strict functional languages.

  1. Introduction: The Quest for Efficient Functional Language Implementation

The implementation of strict functional languages presents unique challenges: higher-order functions, currying, partial application, and the need for tail-call optimization. The Caml Light system, developed by Xavier Leroy and Damien Doligez at INRIA in 1990, represented a significant advancement through its ZINC Abstract Machine—a call-by-value variant of Krivine's machine that addressed the inefficiencies of earlier approaches like the CAM (Categorical Abstract Machine).

The fundamental insight, as noted by Pierre Crégut, was recognizing that the ZAM was essentially "Krivine's machine with some extra call-by-value baggage" . This article unpacks that baggage to reveal the engineering decisions that made Caml Light both elegant and efficient.

  1. The Problem: Currying and Multiple-Argument Functions

2.1 The CAM Approach and Its Limitations

Traditional abstract machines like the CAM, SECD, and FAM handle curried functions through nested closures:

C[[f a₁...aₙ]] = C[[f]]; C[[a₁]]; APPLY; ...; C[[aₙ]]; APPLY
C[[λⁿb]] = CLOSURE(...(CLOSURE(C[[b]]; RETURN)...); RETURN)

This approach suffers from two major inefficiencies :

  1. n-1 intermediate closures are constructed before the function body executes

  2. n-1 immediate returns occur as each curried layer consumes one argument

For a function of n arguments, we pay O(n²) in closure allocation and indirect jumps.

2.2 The Stack-Based Alternative

Stack-based languages (Forth, PostScript) suggest a better approach:

The challenge for functional languages: functions are first-class values, so the callee might receive fewer arguments (partial application) or more arguments (over-application) than expected.

  1. The ZAM Solution: Push-Enter with Arity Checking

3.1 Core Compilation Schemes

The ZAM introduces distinct compilation modes :

C mode (computes to value):

C[[n]] = ACCESS(n)
C[[λa]] = CLOSURE(T[[λa]])
C[[b a₁...aₙ]] = PUSHMARK; C[[aₙ]]; ...; C[[a₁]]; C[[b]]; APPLY

T mode (tail position):

T[[n]] = ACCESS(n); RETURN
T[[λa]] = GRAB; T[[a]]
T[[b a₁...aₙ]] = C[[aₙ]]; ...; C[[a₁]]; C[[b]]; TAILAPPLY

Notice the right-to-left evaluation order for arguments, crucial for correct stack discipline.

3.2 The Key Innovation: GRAB and PUSHMARK

The GRAB instruction handles arity mismatches:

SituationAction
Argument available (v :: s)Pop v into environment, continue
Marker reached (ε :: (c',e') :: r)Partial application: create closure and return
Multiple argumentsPop all expected, tail-call result with remainder

The PUSHMARK instruction places a sentinel ε on the stack, enabling GRAB to detect when arguments are exhausted.

  1. Complete ZAM Instruction Set and Semantics

Based on the Caml Light source code , we present the complete instruction set with formal transition rules.

4.1 State Representation

State = (Code, Env, AStk, RStk)

Code: Instruction sequence
Env:  Persistent environment (heap-allocated linked list)
AStk: Argument stack (value stack, including marker ε)
RStk: Return stack (saved Code × Env pairs)

4.2 Instruction Transitions

Constants and Globals

InstructionTransitionDescription
QUOTE(c)(c', e, s, r) → (c', e, c::s, r)Load constant
GETGLOBAL(q)(c', e, s, r) → (c', e, global[q]::s, r)Read global
SETGLOBAL(q)(c', e, v::s, r) → (c', e, unit::s, r)Write global

Variable Access

InstructionTransitionDescription
ACCESS(n)(c', e, s, r) → (c', e, e[n]::s, r)Environment lookup
ENVACC(n)(c', e, s, r) → (c', e, envacc(n,e)::s, r)Direct environment access

Stack Operations

InstructionTransitionDescription
PUSH(c', e, v::s, r) → (c', e, v::v::s, r)Duplicate accumulator
PUSHMARK(c', e, s, r) → (c', e, ε::s, r)Application boundary mark
LET(c', e, v::s, r) → (c', v::e, s, r)Extend environment
ENDLET(n)(c', e, s, r) → (c', drop(n,e), s, r)Shrink environment

Function Application

InstructionTransitionDescription
APPLY(c', e, [lbl,e']::s, r) → (lbl, e', s, (c',e)::r)Save return, jump
TAILAPPLY(c', e, [lbl,e']::s, r) → (lbl, e', s, r)Jump without saving
GRAB (has arg)(c', e, v::s, r) → (c', v::e, s, r)Pop argument to environment
GRAB (partial)(c', e, ε::s, (c'',e'')::r) → (c'', e'', [(RESTART::c', e)]::s, r)Create closure, return to caller
RETURN (normal)(c', e, v::ε::(c'',e'')::r, s) → (c'', e'', v::s, r)Restore caller
RETURN (closure)(c', e, [lbl,e']::s, r) → (lbl, e', s, r)Continue with closure
CLOSURE(lbl)(c', e, s, r) → (c', e, [lbl,e]::s, r)Create closure
RESTART(c', [v₁...vₖ,e], s, r) → (c', e, v₁::...::vₖ::s, r)Restore partial args

Data Construction

InstructionTransitionDescription
MAKEBLOCK(tag,n)(c', e, v₁::...::vₙ::s, r) → (c', e, Block(tag,[v₁...vₙ])::s, r)Allocate block
GETFIELD(n)(c', e, Block(_,fs)::s, r) → (c', e, fs[n]::s, r)Field access
SETFIELD(n)(c', e, v::Block(_,fs)::s, r) → (c', e, unit::s, r)Field update

Control Flow

InstructionTransitionDescription
BRANCH(lbl)c', e, s, r) → (find_label(lbl, c'), e, s, r)Skip ahead to label
BRANCHIF(lbl)(c', e, 0::s, r) → (c', e, s, r)Fall through if zero
BRANCHIF(lbl)(c', e, v::s, r) → (find_label(lbl, c'), e, s, r)Jump if non-zero
BRANCHIFNOT(lbl)Inverse of BRANCHIF
SWITCH(table)(c', e, v::s, r) → (find_label(table[select(v)], c'), e, s, r)Jump table
PUSHTARP(lbl)(c', e, s, r) → (c', e, s, (lbl,e,sp)::r)Push exception handler
POPTRAP(c', e, s, (lbl,e',sp')::r) → (c', e, s, r)Pop handler

Where find_label(lbl, code) scans forward in the instruction list to find Klabel lbl, with semantic:

find_label(lbl, []) = error "label not found"
find_label(lbl, Klabel lbl :: rest) = rest   -- found, return code after label
find_label(lbl, _ :: rest) = find_label(lbl, rest)  -- skip, continue searching

4.3 The select function for SWITCH

select(v) = if is_int(v) then int_val(v)
            else table_size/2 + tag_val(v)

This allows a single table to handle both constant and constructor tags.

  1. The Mixed Environment Representation

5.1 Motivation

Simple abstract machines represent environments as heap-allocated linked lists, causing:

  1. Inefficiency 1: Many environments are never saved in closures, yet heap-allocated

  2. Inefficiency 2: Linked list access is O(n)

5.2 The ZAM Solution

The ZAM uses a mixed representation :

Environment = Volatile Part S (stack) + Persistent Part E (heap)

add(v₁...vₙ, (S,E)) = (v₁...vₙ.S, E)
accessₙ(S,E) = S[n] if n < |S|
             = accessₙ₋|ₛ|(E) if n ≥ |S|
close(c, (v₁...vₙ.ε, E)) = ([c, E'], (ε, E'))
  where E' = add(v₁...vₙ, E)

5.3 Impact

In the common case of an n-argument function, fully applied, that doesn't build closures:

Measurements showed a 4× reduction in heap allocations and GC time .

  1. From Lambda to ZAM: The Compilation Pipeline

6.1 Lambda IR

The intermediate language uses de Bruijn indices for variable binding :

type lambda =
    Lvar of int                          (* de Bruijn index *)
  | Lconst of struct_constant
  | Lapply of lambda * lambda list       (* function + arguments *)
  | Lfunction of lambda                  (* abstraction *)
  | Llet of lambda list * lambda         (* multi-binding let *)
  | Lletrec of (lambda * int) list * lambda  (* recursive bindings *)
  | Lprim of primitive * lambda list     (* primitive operations *)
  | Lcond of lambda * (atomic_constant * lambda) list  (* constant switch *)
  | Lswitch of int * lambda * (constr_tag * lambda) list  (* constructor switch *)
  | Lstaticfail of int                   (* escape continuation *)
  | Lstatichandle of lambda * lambda     (* static exception handler *)
  | Lhandle of lambda * lambda           (* dynamic exception handler *)
  | Lifthenelse of lambda * lambda * lambda
  | Lsequence of lambda * lambda
  | Lwhile of lambda * lambda
  | Lfor of lambda * lambda * bool * lambda
  | Lshared of lambda * int ref          (* shared code *)
  | Levent of event * lambda             (* debugging event *)

6.2 The compExpr Function

The core compilation function compExpr : Label -> Lambda -> Code -> CompileM Code transforms Lambda IR to ZAM instructions, taking:

Key Pattern: Tail Call Detection

isReturn :: Code -> Bool
isReturn (KReturn : _) = True
isReturn (KLabel _ : c) = isReturn c
isReturn (KEvent _ : c) = isReturn c
isReturn _ = False

This simple function drives the tail-call optimization: when isReturn code = True, the compiler generates KTermApply instead of KApply, avoiding stack growth.

Compilation Patterns

Lambda FormZAM CodeNotes
LVar nKAccess nDirect environment access
LConst cKQuote cWith constant-folding optimization
LApply f args (tail)args... KPush f KTermApplyNo stack growth
LApply f args (non-tail)KPushMark args... KPush f KApplySave return point
LFunction body (tail)KGrab bodyDirect entry
LFunction body (non-tail)KClosure LDelay compilation to L
LLet args bodyargs... KLet body KEndLet(n)Environment extension
LIfThenElse c t ec KBranchIfNot Lt t KBranch Lf KLabel Lt e KLabel LfBranch fusion
LWhile c bLloop KCheckSignals c KBranchIfNot Lexit b KBranch Lloop KLabel LexitWith signal check

6.3 Recursive Binding Compilation

Recursive let rec uses backpatching with dummy values:

-- For: let rec f = λx... and g = λy... in body
KPrim(PDummy sz_f); KLet;           -- Create dummy for f
KPrim(PDummy sz_g); KLet;           -- Create dummy for g
  compile f (KPush; KAccess 1; KPrim PUpdate;  -- Patch f
  compile g (KPush; KAccess 0; KPrim PUpdate;  -- Patch g
    body))
KEndLet 2

The PDummy primitive creates uninitialized heap blocks, PUpdate fills them after the recursive definitions are computed.

  1. Advanced Features

7.1 Exception Handling

The ZAM supports both static (Lstatichandle) and dynamic (Lhandle) exceptions:

Static (pattern match failure):

Dynamic (try/with):

7.2 Pattern Matching Compilation

Pattern matching compiles to decision trees for efficiency:

-- Integer/character switches use interval trees
compileNBranch :: [(AtomicConstant, Lambda)] -> DecisionTree

-- Result: balanced binary tree of KBranchInterval tests
-- O(log n) comparisons instead of O(n) sequential tests

For dense switches, a KSwitch jump table provides O(1) dispatch.

7.3 Boolean Test Fusion

Comparisons fuse with branches to avoid explicit boolean values:

-- e1 < e2 && e3
-- Becomes: e1 e2 KTest(PTlt, Lfalse) e3 ...
-- Rather than: e1 e2 KPrim(PIntComp PTlt) KPrim PNot KBranchIfNot Lfalse ...

-- Similarly for ||, not, etc.

  1. The ZAM2 Evolution: Lessons for Objective Caml

The ZAM2 , used in Objective Caml's bytecode interpreter, introduced further refinements:

8.1 Unified Stack

Merged argument and return stacks, with return addresses pushed before arguments:

PUSHRETADDR(k); args... TAILAPPLY

8.2 Static Environment Shape

Eliminated dynamic tests at variable access by compile-time determination of stack vs. heap variable placement:

STACKACCESS(n)  -- for n < threshold
ENVACCESS(n)    -- for n ≥ threshold

8.3 Minimal Closures

Closures contain only actually free variables, not entire environments, preventing memory leaks.

  1. Push-Enter vs. Eval-Apply: A Retrospective

The ZAM embodies the push-enter model, contrasting with eval-apply :

AspectPush-Enter (ZAM)Eval-Apply (CAM/GHC)
Arity handlingCallee checks argument countCaller ensures correct count
Partial applicationCallee creates closureCaller generates papp combinator
Over-applicationCallee tail-calls resultCaller generates apply combinator
Known callsStill indirectCan be direct (major win)
Register usageConstrained by abstract machineFlexible calling conventions

Surprisingly, eval-apply wins for optimizing native-code compilers :

The ZAM's push-enter design was optimal for bytecode interpretation but not for native code generation.

  1. Conclusion

The ZINC Abstract Machine represents a pivotal point in functional language implementation:

  1. Theoretical elegance: A call-by-value Krivine machine with minimal extensions

  2. Engineering pragmatism: Mixed environments, arity checking, and tail-call optimization

  3. Educational value: Exposed implementation concepts implicit in λ-calculus

  4. Historical bridge: From abstract machines to modern optimizing compilers

As Xavier Leroy noted , "the journey could have been shorter by not going through abstract machines at all. But maybe it would never have taken place without the 'detour' through abstract machines."

The Caml Light implementation, with its ~200KB runtime and 2MB self-hosting memory footprint, proved that efficient functional language implementation was possible on modest hardware—and laid the groundwork for the industrial-strength OCaml we use today.


References

: Xavier Leroy, "From Krivine's machine to the Caml implementations," INRIA Rocquencourt.

: lambda.ml — Caml Light Lambda IR definition (Module Lambda).

: instruct.ml — Caml Light ZAM instruction definition (Module Instruct).

: back.ml — Caml Light Lambda-to-ZAM compiler (Module Back).

: Xavier Leroy, "The ZINC experiment: An economical implementation of the ML language," Technical Report 117, INRIA Rocquencourt, 1990.

CC BY-SA 4.0 Septimia Zenobia. Last modified: April 21, 2026. Website built with Franklin.jl and the Julia programming language.