Terms, also known as expressions, are the fundamental units of meaning in Lean's core language.
They are produced from user-written syntax by the elaborator.
Lean's type system relates terms to their types, which are also themselves terms.
Types can be thought of as denoting sets, while terms denote individual elements of these sets.
A term is well-typed if it has a type under the rules of Lean's type theory.
Only well-typed terms have a meaning.
Terms are a dependently typed λ-calculus: they include function abstraction, application, variables, and let-bindings.
In addition to bound variables, variables in the term language may refer to constructors, type constructors, recursors, defined constants, or opaque constants.
Constructors, type constructors, recursors, and opaque constants are not subject to substitution, while defined constants may be replaced with their definitions.
A derivation demonstrates the well-typedness of a term by explicitly indicating the precise inference rules that are used.
Implicitly, well-typed terms can stand in for the derivations that demonstrate their well-typedness.
Lean's type theory is explicit enough that derivations can be reconstructed from well-typed terms, which greatly reduces the overhead that would be incurred from storing a complete derivation, while still being expressive enough to represent modern research mathematics.
This means that proof terms are sufficient evidence of the truth of a theorem and are amenable to independent verification.
In addition to having types, terms are also related by definitional equality.
This is the mechanically-checkable relation that equates terms modulo their computational behavior.
Definitional equality includes the following forms of reduction:
β (beta)
Applying a function abstraction to an argument by substitution for the bound variable
δ (delta)
Replacing occurrences of defined constants by the definition's value
ι (iota)
Reduction of recursors whose targets are constructors (primitive recursion)
ζ (zeta)
Replacement of let-bound variables by their defined values
Terms in which all possible reductions have been carried out are in normal form.
Definitional equality includes η-equivalence of functions and single-constructor inductive types.
That is, funx=>fx is definitionally equal to f, and S.mkx.f1x.f2 is definitionally equal to x, if S is a structure with fields f1 and f2.
It also features proof irrelevance, so any two proofs of the same proposition are definitionally equal.
It is reflexive, symmetric, and a congruence.
Definitional equality is used by conversion: if two terms are definitionally equal, and a given term has one of them as its type, then it also has the other as its type.
Because definitional equality includes reduction, types can result from computations over data.
Computing types
When passed a natural number, the function LengthList computes a type that corresponds to a list with precisely that many entries in it:
If the length does not match the number of entries, then the computed type will not match the term:
example:LengthListString5:=("Wrong","number",application type mismatch
("number", ())
argument
()
has type
Unit : Type
but is expected to have type
LengthList String 3 : Type())
application type mismatch
("number", ())
argument
()
has type
Unit : Type
but is expected to have type
LengthList String 3 : Type
Function types are a built-in feature of Lean.
Functions map the values of one type (the domain) into those of another type (the range), and function types specify the domain and range of functions.
There are two kinds of function type:
Dependent
Dependent function types explicitly name the parameter, and the function's range may refer explicitly to this name.
Because types can be computed from values, a dependent function may return values from any number of different types, depending on its argument.Dependent functions are sometimes referred to as dependent products, because they correspond to an indexed product of sets.
Non-Dependent
Non-dependent function types do not include a name for the parameter, and the range does not vary based on the specific argument provided.
Dependent Function Types
The function two returns values in different types, depending on which argument it is called with:
The body of the function cannot be written with if...then...else... because it does not refine types the same way that Lean.Parser.Term.match : termPattern matching. `match e, ... with | p, ... => f | ...` matches each given
term `e` against each pattern `p` of a match alternative. When all patterns
of an alternative match, the `match` term evaluates to the value of the
corresponding right-hand side `f` with the pattern variables bound to the
respective matched values.
If used as `match h : e, ... with | p, ... => f | ...`, `h : e = p` is available
within `f`.
When not constructing a proof, `match` does not automatically substitute variables
matched on in dependent variables' types. Use `match (generalizing := true) ...` to
enforce this.
Syntax quotations can also be used in a pattern match.
This matches a `Syntax` value against quotations, pattern variables, or `_`.
Quoted identifiers only match identical identifiers - custom matching such as by the preresolved
names only should be done explicitly.
`Syntax.atom`s are ignored during matching by default except when part of a built-in literal.
For users introducing new atoms, we recommend wrapping them in dedicated syntax kinds if they
should participate in matching.
For example, in
```lean
syntax "c" ("foo" <|> "bar") ...
```
`foo` and `bar` are indistinguishable during matching, but in
```lean
syntax foo := "foo"
syntax "c" (foo <|> "bar") ...
```
they are not.
match does.
In Lean's core language, all function types are dependent: non-dependent function types are dependent function types in which the parameter name does not occur in the range.
Additionally, two dependent function types that have different parameter names may be definitionally equal if renaming the parameter makes them equal.
However, the Lean elaborator does not introduce a local binding for non-dependent functions' parameters.
Definitional Equality of Dependent and Non-Dependent Functions
This is because the elaborator for array access requires a proof that the index is in bounds.
The non-dependent version of the statement does not introduce this assumption:
defAllNonZero(xs:ArrayNat):Prop:=(i:Nat)→(i<xs.size)→failed to prove index is valid, possible solutions:
- Use `have`-expressions to prove the index is valid
- Use `a[i]!` notation instead, runtime check is performed, and 'Panic' error message is produced if index is not valid
- Use `a[i]?` notation instead, result is an `Option` type
- Use `a[i]'h` notation instead, where `h` is a proof that index is valid
xs : Array Nat
i : Nat
⊢ i < xs.sizexs[i]≠0
failed to prove index is valid, possible solutions:
- Use `have`-expressions to prove the index is valid
- Use `a[i]!` notation instead, runtime check is performed, and 'Panic' error message is produced if index is not valid
- Use `a[i]?` notation instead, result is an `Option` type
- Use `a[i]'h` notation instead, where `h` is a proof that index is valid
xs : Array Nat
i : Nat
⊢ i < xs.size
While the core type theory does not feature implicit parameters, function types do include an indication of whether the parameter is implicit.
This information is used by the Lean elaborator, but it does not affect type checking or definitional equality in the core theory and can be ignored when thinking only about the core type theory.
Definitional Equality of Implicit and Explicit Function Types
The types {α:Type}→(x:α)→α and (α:Type)→(x:α)→α are definitionally equal, even though the first parameter is implicit in one and explicit in the other.
In Lean's type theory, functions are created using function abstractions that bind a variable.
In various communities, function abstractions are also known as lambdas, due to Alonzo Church's notation for them, or anonymous functions because they don't need to be defined with a name in the global environment.
When the function is applied, the result is found by β-reduction: substituting the argument for the bound variable.
In compiled code, this happens strictly: the argument must already be a value.
When type checking, there are no such restrictions; the equational theory of definitional equality allows β-reduction with any term.
In Lean's term language, function abstractions may take multiple parameters or use pattern matching.
These features are translated to simpler operations in the core language, where all functions abstractions take exactly one parameter.
Not all functions originate from abstractions: type constructors, constructors, and recursors may have function types, but they cannot be defined using function abstractions alone.
In Lean's core type theory, every function maps each element of the domain to a single element of the range.
In other words, every function expects exactly one parameter.
Multiple-parameter functions are implemented by defining higher-order functions that, when supplied with the first parameter, return a new function that expects the remaining parameters.
This encoding is called currying, popularized by and named after Haskell B. Curry.
Lean's syntax for defining functions, specifying their types, and applying them creates the illusion of multiple-parameter functions, but the result of elaboration contains only single-parameter functions.
Definitional equality of functions in Lean is intensional.
This means that definitional equality is defined syntactically, modulo renaming of bound variables and reduction.
To a first approximation, this means that two functions are definitionally equal if they implement the same algorithm, rather than the usual mathematical notion of equality that states that two functions are equal if they map equal elements of the domain to equal elements of the range.
Definitional equality is used by the type checker, so it's important that it be predictable.
The syntactic character of intensional equality means that the algorithm to check it can be feasibly specified.
Checking extensional equality involves proving essentially arbitrary theorems about equality of functions, and there is no clear specification for an algorithm to check it.
This makes extensional equality a poor choice for a type checker.
Function extensionality is instead made available as a reasoning principle that can be invoked when proving the proposition that two functions are equal.
In addition to reduction and renaming of bound variables, definitional equality does support one limited form of extensionality, called η-equivalence, in which functions are equal to abstractions whose bodies apply them to the argument.
Given f with type (x:α)→βx, f is definitionally equal to funx=>fx.
When reasoning about functions, the theorem funextUnlike some intensional type theories, funext is a theorem in Lean. It can be proved using quotient types. or the corresponding tactics funext or ext can be used to prove that two functions are equal if they map equal inputs to equal outputs.
funext.{u, v} {α : Sort u} {β : α → Sort v}
{f g : (x : α) → βx}
(h : ∀ (x : α), fx = gx) : f = g
Function extensionality is the statement that if two functions take equal values
every point, then the functions themselves are equal: (∀ x, f x = g x) → f = g.
It is called "extensionality" because it talks about how to prove two objects are equal
based on the properties of the object (compare with set extensionality,
which is (∀ x, x ∈ s ↔ x ∈ t) → s = t).
This is often an axiom in dependent type theory systems, because it cannot be proved
from the core logic alone. However in lean's type theory this follows from the existence
of quotient types (note the Quot.sound in the proof, as well as the show line
which makes use of the definitional equality Quot.lift f h (Quot.mk x) = f x).
Functions can be defined recursively using Lean.Parser.Command.declaration : commanddef.
From the perspective of Lean's logic, all functions are total, meaning that they map each element of the domain to an element of the range in finite time.Some programming language communities use the term total in a different sense, where functions are considered total if they do not crash due to unhandled cases but non-termination is ignored.
The values of total functions are defined for all type-correct arguments, and they cannot fail to terminate or crash due to a missing case in a pattern match.
While the logical model of Lean considers all functions to be total, Lean is also a practical programming language that provides certain "escape hatches".
Functions that have not been proven to terminate can still be used in Lean's logic as long as their range is proven nonempty.
These functions are treated as uninterpreted functions by Lean's logic, and their computational behavior is ignored.
In compiled code, these functions are treated just like any others.
Other functions may be marked unsafe; these functions are not available to Lean's logic at all.
The section on partial and unsafe function definitions contains more detail on programming with recursive functions.
Similarly, operations that should fail at runtime in compiled code, such as out-of-bounds access to an array, can only be used when the resulting type is known to be inhabited.
These operations result in an arbitrarily chosen inhabitant of the type in Lean's logic (specifically, the one specified in the type's Inhabited instance).
Panic
The function thirdChar extracts the third element of an array, or panics if the array has two or fewer elements:
Propositions are meaningful statements that admit proof.
Nonsensical statements are not propositions, but false statements are.
All propositions are classified by Prop.
Propositions have the following properties:
Definitional proof irrelevance
Any two proofs of the same proposition are completely interchangeable.
Run-time irrelevance
Propositions are erased from compiled code.
Impredicativity
Propositions may quantify over types from any universe whatsoever.
Restricted Elimination
With the exception of subsingletons, propositions cannot be eliminated into non-proposition types.
Extensionality
Any two logically equivalent propositions can be proven to be equal with the propext axiom.
The axiom of propositional extensionality. It asserts that if propositions
a and b are logically equivalent (i.e. we can prove a from b and vice versa),
then a and b are equal, meaning that we can replace a with b in all
contexts.
For simple expressions like a ∧ c ∨ d → e we can prove that because all the logical
connectives respect logical equivalence, we can replace a with b in this expression
without using propext. However, for higher order expressions like P a where
P : Prop → Prop is unknown, or indeed for a = b itself, we cannot replace a with b
without an axiom which says exactly this.
This is a relatively uncontroversial axiom, which is intuitionistically valid.
It does however block computation when using #reduce to reduce proofs directly
(which is not recommended), meaning that canonicity,
the property that all closed terms of type Nat normalize to numerals,
fails to hold when this (or any) axiom is used:
set_option pp.proofs true
def foo : Nat := by
have : (True → True) ↔ True := ⟨λ _ => trivial, λ _ _ => trivial⟩
have := propext this ▸ (2 : Nat)
exact this
#reduce foo
-- propext { mp := fun x x => True.intro, mpr := fun x => True.intro } ▸ 2
#eval foo -- 2
#eval can evaluate it to a numeral because the compiler erases casts and
does not evaluate proofs, so propext, whose return type is a proposition,
can never block it.
3.2.3. Universes
Types are classified by universes.
Each universe has a level, which is a natural number.
The Sort operator constructs a universe from a given level.
If the level of a universe is smaller than that of another, the universe itself is said to be smaller.
With the exception of propositions (described later in this chapter), types in a given universe may only quantify over types in smaller universes.
Sort0 is the type of propositions, while each Sort (u + 1) is a type that describes data.
Every universe is an element of every strictly larger universe, so Sort5 includes Sort4.
This means that the following examples are accepted:
example:Sort5:=Sort4example:Sort2:=Sort1
On the other hand, Sort3 is not an element of Sort5:
example:Sort5:=type mismatch
Type 2
has type
Type 3 : Type 4
but is expected to have type
Type 4 : Type 5Sort3
type mismatch
Type 2
has type
Type 3 : Type 4
but is expected to have type
Type 4 : Type 5
Similarly, because Unit is in Sort1, it is not in Sort2:
example:Sort1:=Unitexample:Sort2:=type mismatch
Unit
has type
Type : Type 1
but is expected to have type
Type 1 : Type 2Unit
type mismatch
Unit
has type
Type : Type 1
but is expected to have type
Type 1 : Type 2
Because propositions and data are used differently and are governed by different rules, the abbreviations Type and Prop are provided to make the distinction more convenient.
Type u is an abbreviation for Sort (u + 1), so Type0 is Sort1 and Type3 is Sort4.
Type0 can also be abbreviated Type, so Unit : Type and Type : Type 1.
Prop is an abbreviation for Sort0.
3.2.3.1. Predicativity
Each universe contains dependent function types, which additionally represent universal quantification and implication.
A function type's universe is determined by the universes of its argument and return types.
The specific rules depend on whether the return type of the function is a proposition.
Predicates, which are functions that return propositions (that is, where the result of the function is some type in Prop) may have argument types in any universe whatsoever, but the function type itself remains in Prop.
In other words, propositions feature impredicative quantification, because propositions can themselves be statements about all propositions (and all other types).
Impredicativity
Proof irrelevance can be written as a proposition that quantifies over all propositions:
example:Prop:=∀(P:Prop)(p1p2:P),p1=p2
A proposition may also quantify over all types, at any given level:
For universes at level1 and higher (that is, the Type u hierarchy), quantification is predicative.
For these universes, the universe of a function type is the least upper bound of the argument and return types' universes.
Universe levels of function types
Both of these types are in Type2:
example(α:Type1)(β:Type2):Type2:=α→βexample(α:Type2)(β:Type1):Type2:=α→βPredicativity of Type
This example is not accepted, because α's level is greater than 1. In other words, the annotated universe is smaller than the function type's universe:
example(α:Type2)(β:Type1):Type1:=type mismatch
α → β
has type
Type 2 : Type 3
but is expected to have type
Type 1 : Type 2α→β
type mismatch
α → β
has type
Type 2 : Type 3
but is expected to have type
Type 1 : Type 2
Lean's universes are not cumulative; a type in Type u is not automatically also in Type (u + 1).
Each type inhabits precisely one universe.
No cumulativity
This example is not accepted because the annotated universe is larger than the function type's universe:
example(α:Type2)(β:Type1):Type3:=type mismatch
α → β
has type
Type 2 : Type 3
but is expected to have type
Type 3 : Type 4α→β
type mismatch
α → β
has type
Type 2 : Type 3
but is expected to have type
Type 3 : Type 4
3.2.3.2. Polymorphism
Lean supports universe polymorphism, which means that constants defined in the Lean environment can take universe parameters.
These parameters can then be instantiated with universe levels when the constant is used.
Universe parameters are written in curly braces following a dot after a constant name.
Universe-polymorphic identity function
When fully explicit, the identity function takes a universe parameter u. Its signature is:
id.{u}{α:Sortu}(x:α):α
Universe variables may additionally occur in universe level expressions, which provide specific universe levels in definitions.
When the polymorphic definition is instantiated with concrete levels, these universe level expressions are also evaluated to yield concrete levels.
Universe level expressions
In this example, Codec is in a universe that is one greater than the universe of the type it contains:
Lean automatically infers most level parameters.
In the following example, it is not necessary to annotate the type as Codec.{0}, because Char's type is Type0, so u must be 0:
Universe-polymorphic definitions in fact create a schematic definition that can be instantiated at a variety of levels, and different instantiations of universes create incompatible values.
Universe polymorphism and definitional equality
This can be seen in the following example, in which T is a gratuitously-universe-polymorphic function that always returns true.
Because it is marked Lean.Parser.Command.declaration : commandopaque, Lean can't check equality by unfolding the definitions.
Both instantiations of T have the parameters and the same type, but their differing universe instantiations make them incompatible.
opaqueT.{u}(_:Nat):Bool:=(fun(α:Sortu)=>true)PUnit.{u}set_optionpp.universestruedeftest.{u,v}:T.{u}0=T.{v}0:=type mismatch
rfl.{?u.27}
has type
Eq.{?u.27} ?m.29 ?m.29 : Prop
but is expected to have type
Eq.{1} (T.{u} 0) (T.{v} 0) : Proprfl
type mismatch
rfl.{?u.27}
has type
Eq.{?u.27} ?m.29 ?m.29 : Prop
but is expected to have type
Eq.{1} (T.{u} 0) (T.{v} 0) : Prop
Auto-bound implicit arguments are as universe-polymorphic as possible.
Defining the identity function as follows:
defid'(x:α):=x
results in the signature:
id'.{u}{α:Sortu}(x:α):αUniverse monomorphism in auto-bound implicit parameters
On the other hand, because Nat is in universe Type0, this function automatically ends up with a concrete universe level for α, because m is applied to both Nat and α, so both must have the same type and thus be in the same universe:
Levels that occur in a definition are not restricted to just variables and addition of constants.
More complex relationships between universes can be defined using level expressions.
Level ::= 0 | 1 | 2 | ... -- Concrete levels
| u, v -- Variables
| Level + n -- Addition of constants
| max Level Level -- Least upper bound
| imax Level Level -- Impredicative LUB
Given an assignment of level variables to concrete numbers, evaluating these expressions follows the usual rules of arithmetic.
The imax operation is defined as follows:
imax is used to implement impredicative quantification for Prop.
In particular, if A : Sort u and B : Sort v, then (x : A) → B : Sort (imax u v).
If B : Prop, then the function type is itself a Prop; otherwise, the function type's level is the maximum of u and v.
3.2.3.2.2. Universe Variable Bindings
Universe-polymorphic definitions bind universe variables.
These bindings may be either explicit or implicit.
Explicit universe variable binding and instantiation occurs as a suffix to the definition's name.
Universe parameters are defined or provided by suffixing the name of a constant with a period (.) followed by a comma-separated sequence of universe variables between curly braces.
Universe-polymorphic map
The following declaration of map declares two universe parameters (u and v) and instantiates the polymorphic List with each in turn:
Just as Lean automatically instantiates implicit parameters, it also automatically instantiates universe parameters.
When the describe this option and add xrefautoImplicit option is set to true (the default), it is not necessary to explicitly bind universe variables.
When it is set to false, then they must be added explicitly or declared using the universe command. xref
Auto-implicits and universe polymorphism
When autoImplicit is true (which is the default setting), this definition is accepted even though it does not bind its universe parameters:
In addition to using autoImplicit, particular identifiers can be declared as universe variables in a particular section scope using the universe command.
syntax
command::= ...
|Declares one or more universe variables.
`universe u v`
`Prop`, `Type`, `Type u` and `Sort u` are types that classify other types, also known as
*universes*. In `Type u` and `Sort u`, the variable `u` stands for the universe's *level*, and a
universe at level `u` can only classify universes that are at levels lower than `u`. For more
details on type universes, please refer to [the relevant chapter of Theorem Proving in Lean][tpil
universes].
Just as type arguments allow polymorphic definitions to be used at many different types, universe
parameters, represented by universe variables, allow a definition to be used at any required level.
While Lean mostly handles universe levels automatically, declaring them explicitly can provide more
control when writing signatures. The `universe` keyword allows the declared universe variables to be
used in a collection of definitions, and Lean will ensure that these definitions use them
consistently.
[tpil universes]: https://lean-lang.org/theorem_proving_in_lean4/dependent_type_theory.html#types-as-objects
(Type universes on Theorem Proving in Lean)
```lean
/- Explicit type-universe parameter. -/
def id₁.{u} (α : Type u) (a : α) := a
/- Implicit type-universe parameter, equivalent to `id₁`.
Requires option `autoImplicit true`, which is the default. -/
def id₂ (α : Type u) (a : α) := a
/- Explicit standalone universe variable declaration, equivalent to `id₁` and `id₂`. -/
universe u
def id₃ (α : Type u) (a : α) := a
```
On a more technical note, using a universe variable only in the right-hand side of a definition
causes an error if the universe has not been declared previously.
```lean
def L₁.{u} := List (Type u)
-- def L₂ := List (Type u) -- error: `unknown universe level 'u'`
universe u
def L₃ := List (Type u)
```
## Examples
```lean
universe u v w
structure Pair (α : Type u) (β : Type v) : Type (max u v) where
a : α
b : β
#check Pair.{v, w}
-- Pair : Type v → Type w → Type (max v w)
```
universeidentident*
Declares one or more universe variables for the extent of the current scope.
Just as the variable command causes a particular identifier to be treated as a parameter with a particular type, the universe command causes the subsequent identifiers to be implicitly quantified as as universe parameters in declarations that mention them, even if the option autoImplicit is false.
The universe command when autoImplicit is falseset_optionautoImplicitfalseuniverseudefid₃(α:Typeu)(a:α):=a
Because the automatic implicit parameter feature only insert parameters that are used in the declaration's header, universe variables that occur only on the right-hand side of a definition are not inserted as arguments unless they have been declared with universe even when autoImplicit is true.
Automatic universe parameters and the universe command
This definition with an explicit universe parameter is accepted:
The resulting definition of L is universe-polymorphic, with u inserted as a universe parameter.
Declarations in the scope of a universe command are not made polymorphic if the universe variables do not occur in them or in other automatically-inserted arguments.
Universe lifting operation from a lower Type universe to a higher one.
To express this using level variables, the input is Type s and the output is
Type (max s r), so if s ≤ r then the latter is (definitionally) Type r.
The universe variable r is written first so that ULift.{r} α can be used
when s can be inferred from the type of α.
Inductive types are the primary means of introducing new types to Lean.
While universes and functions are built-in primitives that could not be added by users, every other type in Lean is either an inductive type or defined in terms of universes, functions, and inductive types.
Inductive types are specified by their type constructors and their constructors; their other properties are derived from these.
Each inductive type has a single type constructor, which may take both universe parameters and ordinary parameters.
Inductive types may have any number of constructors; these constructors introduce new values whose types are headed by the inductive type's type constructor.
Based on the type constructor and the constructors for an inductive type, Lean derives a recursor.
Logically, recursors represent induction principles or elimination rules; computationally, they represent primitive recursive computations.
The termination of recursive functions is justified by translating them into uses of the recursors, so Lean's kernel only needs to perform type checking of recursor applications, rather than including a separate termination analysis.
Lean additionally produces a number of helper constructions based on the recursor,The term recursor is always used, even for non-recursive types. which are used elsewhere in the system.
Structures are a special case of inductive types that have exactly one constructor.
When a structure is declared, Lean generates helpers that enable additional language features to be used with the new structure.
This section describes the specific details of the syntax used to specify both inductive types and structures, the new constants and definitions in the environment that result from inductive type declarations, and the run-time representation of inductive types' values in compiled code.
command::= ...
|`declModifiers` is the collection of modifiers on a declaration:
* a doc comment `/-- ... -/`
* a list of attributes `@[attr1, attr2]`
* a visibility specifier, `private` or `protected`
* `noncomputable`
* `unsafe`
* `partial` or `nonrec`
All modifiers are optional, and have to come in the listed order.
`nestedDeclModifiers` is the same as `declModifiers`, but attributes are printed
on the same line as the declaration. It is used for declarations nested inside other syntax,
such as inductive constructors, structure projections, and `let rec` / `where` definitions. declModifiersIn Lean, every concrete type other than the universes
and every type constructor other than dependent arrows
is an instance of a general family of type constructions known as inductive types.
It is remarkable that it is possible to construct a substantial edifice of mathematics
based on nothing more than the type universes, dependent arrow types, and inductive types;
everything else follows from those.
Intuitively, an inductive type is built up from a specified list of constructors.
For example, `List α` is the list of elements of type `α`, and is defined as follows:
```
inductive List (α : Type u) where
| nil
| cons (head : α) (tail : List α)
```
A list of elements of type `α` is either the empty list, `nil`,
or an element `head : α` followed by a list `tail : List α`.
See [Inductive types](https://lean-lang.org/theorem_proving_in_lean4/inductive_types.html)
for more information.
inductive`declId` matches `foo` or `foo.{u,v}`: an identifier possibly followed by a list of universe names declId`optDeclSig` matches the signature of a declaration with optional type: a list of binders and then possibly `: type` optDeclSigwhere(|`declModifiers` is the collection of modifiers on a declaration:
* a doc comment `/-- ... -/`
* a list of attributes `@[attr1, attr2]`
* a visibility specifier, `private` or `protected`
* `noncomputable`
* `unsafe`
* `partial` or `nonrec`
All modifiers are optional, and have to come in the listed order.
`nestedDeclModifiers` is the same as `declModifiers`, but attributes are printed
on the same line as the declaration. It is used for declarations nested inside other syntax,
such as inductive constructors, structure projections, and `let rec` / `where` definitions. declModifiersident`optDeclSig` matches the signature of a declaration with optional type: a list of binders and then possibly `: type` optDeclSig)*(derivingident,*)?
After declaring an inductive type, its type constructor, constructors, and recursor are present in the environment.
New inductive types extend Lean's core logic—they are not encoded or represented by some other already-present data.
Inductive type declarations must satisfy a number of well-formedness requirements to ensure that the logic remains consistent.
The first line of the declaration, from Lean.Parser.Command.inductive : commandinductive to Lean.Parser.Command.inductive : commandwhere, specifies the new type constructor's name and type.
If a type signature for the type constructor is provided, then its result type must be a universe, but the parameters do not need to be types.
If no signature is provided, then Lean will attempt to infer a universe that's just big enough to contain the resulting type.
In some situations, this process may fail to find a minimal universe or fail to find one at all, necessitating an annotation.
The constructor specifications follow Lean.Parser.Command.inductive : commandwhere.
Constructors are not mandatory, as constructorless inductive types such as False and Empty are perfectly sensible.
Each constructor specification begins with a vertical bar ('|', Unicode 'VERTICAL BAR' (U+007c)), declaration modifiers, and a name.
The name is a raw identifier.
A declaration signature follows the name.
The signature may specify any parameters, modulo the well-formedness requirements for inductive type declarations, but the return type in the signature must be a saturated application of the type constructor of the inductive type being specified.
If no signature is provided, then the constructor's type is inferred by inserting sufficient implicit parameters to construct a well-formed return type.
The new inductive type's name is defined in the current namespace.
Each constructor's name is in the inductive type's namespace.
Type constructors may take two kinds of arguments: parameters and indices.
Parameters must be used consistently in the entire definition; all occurrences of the type constructor in each constructor in the declaration must take precisely the same argument.
Indices may vary among the occurrences of the type constructor.
All parameters must precede all indices in the type constructor's signature.
Parameters that occur prior to the colon (':') in the type constructor's signature are considered parameters to the entire inductive type declaration.
They are always parameters that must be uniform throughout the type's definition.
Generally speaking, parameters that occur after the colon are indices that may vary throughout the definition of the type.
However, if the option inductive.autoPromoteIndices is true, then syntactic indices that could have been parameters are made into parameters.
An index could have been a parameter if all of its type dependencies are themselves parameters and it is used uniformly as an uninstantiated variable in all occurrences of the inductive type's type constructor in all constructors.
Promote indices to parameters in inductive types whenever possible.
Indices can be seen as defining a family of types.
Each choice of indices selects a type from the family, which has its own set of available constructors.
Type constructors with indices are said to specify indexed families of types.
It is an example of an inductive type in which the signatures have been omitted for both the type constructor and the constructor.
Lean assigns One to Type:
The constructor is named One.one, because constructor names are the type constructor's namespace.
Because One expects no arguments, the signature inferred for One.one is:
This example is not well typed because there are three entries in the list:
example:EvenOddListStringtrue:=type mismatch
EvenOddList.cons "a" (EvenOddList.cons "b" (EvenOddList.cons "c" EvenOddList.nil))
has type
EvenOddList String !!!true : Type
but is expected to have type
EvenOddList String true : Type.cons"a"(.cons"b"(.cons"c".nil))
type mismatch
EvenOddList.cons "a" (EvenOddList.cons "b" (EvenOddList.cons "c" EvenOddList.nil))
has type
EvenOddList String !!!true : Type
but is expected to have type
EvenOddList String true : Type
In this declaration, α is a parameter, because it is used consistently in all occurrences of EvenOddList.
b is an index, because different Bool values are used for it at different occurrences.
Parameters before and after the colon
In this example, both parameters are specified before the colon in Either's signature.
If an inductive type has just one constructor, then this constructor is eligible for anonymous constructor syntax.
Instead of writing the constructor's name applied to its arguments, the explicit arguments can be enclosed in angle brackets ('⟨' and '⟩', Unicode MATHEMATICAL LEFT ANGLE BRACKET (U+0x27e8) and MATHEMATICAL RIGHT ANGLE BRACKET (U+0x27e9)) and separated with commas.
This works in both pattern and expression contexts.
Providing arguments by name or converting all implicit parameters to explicit parameters with @ requires using the ordinary constructor syntax.
Anonymous constructors
The type AtLeastOneα is similar to List α, except there's always at least one element present:
The optional Lean.Parser.Command.inductive : commandderiving clause of an inductive type declaration can be used to derive instances of type classes.
Please refer to the section on instance deriving for more information.
command::= ...
|`declModifiers` is the collection of modifiers on a declaration:
* a doc comment `/-- ... -/`
* a list of attributes `@[attr1, attr2]`
* a visibility specifier, `private` or `protected`
* `noncomputable`
* `unsafe`
* `partial` or `nonrec`
All modifiers are optional, and have to come in the listed order.
`nestedDeclModifiers` is the same as `declModifiers`, but attributes are printed
on the same line as the declaration. It is used for declarations nested inside other syntax,
such as inductive constructors, structure projections, and `let rec` / `where` definitions. declModifiersstructure`declId` matches `foo` or `foo.{u,v}`: an identifier possibly followed by a list of universe names declIdbracketedBinder*(extendsterm,*)?(:term)?where(`declModifiers` is the collection of modifiers on a declaration:
* a doc comment `/-- ... -/`
* a list of attributes `@[attr1, attr2]`
* a visibility specifier, `private` or `protected`
* `noncomputable`
* `unsafe`
* `partial` or `nonrec`
All modifiers are optional, and have to come in the listed order.
`nestedDeclModifiers` is the same as `declModifiers`, but attributes are printed
on the same line as the declaration. It is used for declarations nested inside other syntax,
such as inductive constructors, structure projections, and `let rec` / `where` definitions. declModifiersident::)?structFields(derivingident,*)?
Declares a new structure type.
Structures are inductive types that have only a single constructor and no indices.
In exchange for these restrictions, Lean generates code for structures that offers a number of conveniences: projection functions are generated for each field, an additional constructor syntax based on field names rather than positional arguments is available, a similar syntax may be used to replace the values of certain named fields, and structures may extend other structures.
Just like other inductive types, structures may be recursive; they are subject to the same restrictions regarding strict positivity.
Structures do not add any expressive power to Lean; all of their features are implemented in terms of code generation.
Just like ordinary inductive type declarations, the header of the structure declaration contains a signature that may specify both parameters and a resulting universe.
Structures may not define indexed families.
Each field of a structure declaration corresponds to a parameter of the constructor.
Auto-implicit arguments are inserted in each field separately, even if their names coincide, and the fields become constructor parameters that quantify over types.
Auto-implicit parameters in structure fields
The structure MyStructure contains a field whose type is an auto-implicit parameter:
structureMyStructurewherefield1:αfield2:α
The type constructor MyStructure takes two universe parameters:
The resulting type is in Type rather than Sort because the constructor fields quantify over types in Sort. In particular, both fields in its constructor MyStructure.mk take an implicit type parameter:
For each field, a projection function is generated that extracts the field's value from the underlying type's constructor.
This function is in the structure's name's namespace.
Structure field projections are handled specially by the elaborator (as described in the section on structure inheritance), which performs extra steps beyond looking up a namespace.
When field types depend on prior fields, the types of the dependent projection functions are written in terms of earlier projections, rather than explicit pattern matching.
Dependent projection types
The structure ArraySized contains a field whose type depends on both a structure parameter and an earlier field:
The signature of the projection function size_eq_length takes the structure type's parameter as an implicit parameter and refers to the earlier field using the corresponding projection:
Structure fields may have default values, specified with :=.
These values are used if no explicit value is provided.
Default values
An adjacency list representation of a graph can be represented as an array of lists of Nat.
The size of the array indicates the number of vertices, and the outgoing edges from each vertex are stored in the array at the vertex's index.
Because the default value #[] is provided for the field adjacency, the empty graph Graph.empty can be constructed without providing any field values.
Structure constructors may be explicitly named by providing the constructor name and :: prior to the fields.
If no name is explicitly provided, then the constructor is named mk in the structure type's namespace.
Declaration modifiers may additionally be provided along with an explicit constructor name.
Non-default constructor name
The structure Palindrome contains a string and a proof that the string is the same when reversed:
The structure NatStringBimap maintains a finite bijection between natural numbers and strings.
It consists of a pair of maps, such that the keys each occur as values exactly once in the other map.
Because the constructor is private, code outside the defining module can't construct new instances and must use the provided API, which maintains the invariants of the type.
Additionally, providing the default constructor name explicitly is an opportunity to attach a documentation comment to the constructor.
Because structures are represented by single-constructor inductive types, their constructors can be invoked or matched against using anonymous constructor syntax.
Additionally, structures may be constructed or matched against using structure instance notation, which includes the names of the fields together with values for them.
syntaxStructure Instances
term::= ...
|Structure instance. `{ x := e, ... }` assigns `e` to field `x`, which may be
inherited. If `e` is itself a variable called `x`, it can be elided:
`fun y => { x := 1, y }`.
A *structure update* of an existing value can be given via `with`:
`{ point with x := 1 }`.
The structure type can be specified if not inferable:
`{ x := 1, y := 2 : Point }`.
{structInstField,*(:term)?}
Constructs a value of a constructor type given values for named fields.
Field specifiers may take two forms:
structInstField::= ...
|structInstLVal:=term
structInstField::= ...
|ident
A structInstLVal is a field name (an identifier), a field index (a natural number), or a term in square brackets, followed by a sequence of zero or more subfields.
Subfields are either a field name or index preceded by a dot, or a term in square brackets.
This syntax is elaborated to applications of structure constructors.
The values provided for fields are by name, and they may be provided in any order.
The values provided for subfields are used to initialize fields of constructors of structures that are themselves found in fields.
Terms in square brackets are not allowed when constructing a structure; they are used in structure updates.
Field specifiers that do not contain := are field abbreviations.
In this context, the identifier f is an abbreviation for f := f; that is, the value of f in the current scope is used to initialize the field f.
Every field that does not have a default value must be provided.
If a tactic is specified as the default argument, then it is run at elaboration time to construct the argument's value.
In a pattern context, field names are mapped to patterns that match the corresponding projection, and field abbreviations bind a pattern variable that is the field's name.
Default arguments are still present in patterns; if a pattern does not specify a value for a field with a default value, then the pattern only matches the default.
The optional type annotation allows the structure type to be specified in contexts where it is not otherwise determined.
Patterns and default values
The structure AugmentedIntList contains a list together with some extra information, which is empty if omitted:
When testing whether the list is empty, the function isEmpty is also testing whether the augmentation field is empty, because the omitted field's default value is also used in pattern contexts:
term::= ...
|Structure instance. `{ x := e, ... }` assigns `e` to field `x`, which may be
inherited. If `e` is itself a variable called `x`, it can be elided:
`fun y => { x := 1, y }`.
A *structure update* of an existing value can be given via `with`:
`{ point with x := 1 }`.
The structure type can be specified if not inferable:
`{ x := 1, y := 2 : Point }`.
{termwithstructInstField,*(:term)?}
Updates a value of a constructor type.
The term that precedes the Lean.Parser.Term.structInst : termStructure instance. `{ x := e, ... }` assigns `e` to field `x`, which may be
inherited. If `e` is itself a variable called `x`, it can be elided:
`fun y => { x := 1, y }`.
A *structure update* of an existing value can be given via `with`:
`{ point with x := 1 }`.
The structure type can be specified if not inferable:
`{ x := 1, y := 2 : Point }`.
with clause is expected to have a structure type; it is the value that is being updated.
A new instance of the structure is created in which every field not specified is copied from the value that is being updated, and the specified fields are replaced with their new values.
When updating a structure, array values may also be replaced by including the index to be updated in square brackets.
This updating does not require that the index expression be in bounds for the array, and out-of-bounds updates are discarded.
Updating arrays
Updating structures may use array indices as well as projection names.
Updates at indices that are out of bounds are ignored:
Values of structure types may also be declared using Lean.Parser.Command.declValEqns : commandwhere, followed by definitions for each field.
This may only be used as part of a definition, not in an expression context.
where for structures
The product type in Lean is a structure named Prod.
Products can be defined using their projections:
Structures may be declared as extending other structures using the optional Lean.Parser.Command.structure : commandextends clause.
The resulting structure type has all of the fields of all of the parent structure types.
If the parent structure types have overlapping field names, then all overlapping field names must have the same type.
If the overlapping fields have different default values, then the default value from the last parent structure that includes the field is used.
New default values in the child structure take precedence over default values from the parent structures.
When the new structure extends existing structures, the new structure's constructor takes the existing structure's information as additional arguments.
Typically, this is in the form of a constructor parameter for each parent structure type.
If the parents' fields overlap, however, then the subset of non-overlapping fields from one or more of the parents is included instead to prevent duplicating field information.
There is no subtyping relation between a parent structure type and its children.
Even if structure B extends structure A, a function expecting an A will not accept a B.
However, conversion functions are generated that convert a structure into each of its parents.
These conversion functions are in the child structure's namespace, and their name is the parent structure's name preceded by to.
Structure type inheritance with overlapping fields
The resulting structure's projections can be used as if its fields are simply the union of the parents' fields.
The Lean elaborator automatically generates an appropriate projection when fields are used.
Likewise, the field-based initialization and structure update notations hide the details of the encoding of inheritance.
The encoding is, however, visible when using the constructor's name, when using anonymous constructor syntax, or when referring to fields by their index rather than their name.
Field Indices and Structure InheritancestructurePair(α:Typeu)wherefst:αsnd:αderivingReprstructureTriple(α:Typeu)extendsPairαwherethd:αderivingReprdefcoords:TripleNat:={fst:=17,snd:=2,thd:=95}
Evaluating the first field index of coords yields the underlying Pair, rather than the contents of the field fst:
it is a type error to apply printEven directly to two:
printEven sorry : IO Unit#checkprintEvenapplication type mismatch
printEven two
argument
two
has type
EvenPrime : Type
but is expected to have type
EvenNumber : Typetwo
application type mismatch
printEven two
argument
two
has type
EvenPrime : Type
but is expected to have type
EvenNumber : Type
Every inductive type is equipped with a recursor.
The recursor is completely determined by the signatures of the type constructor and the constructors.
Recursors have function types, but they are primitive and are not definable using fun.
Because parameters are consistent, they can be abstracted over the entire recursor
The motive
The motive determines the type of an application of the recursor. The motive is a function whose arguments are the type's indices and an instance of the type with these indices instantiated. The specific universe for the type that the motive determines depends on the inductive type's universe and the specific constructors—see the section on subsingleton elimination for details.
A case for each constructor
For each constructor, the recursor expects a function that satisfies the motive for an arbitrary application of the constructor. Each case abstracts over all of the constructor's parameters. If the constructor's parameter's type is the inductive type itself, then the case additionally takes a parameter whose type is the motive applied to that parameter's value—this will receive the result of recursively processing the recursive parameter.
The target
Finally, the recursor takes an instance of the type as an argument, along with any index values.
The result type of the recursor is the motive applied to these indices and the target.
The motive should be satisfiable for any application of List.cons, given that it is satisfiable for the tail. The extra parameter motive tail is because tail's type is a recursive occurrence of List.
Once again, the return type is the motive applied to the target.
List.rec.{u,v}{α:Typev}{motive:Listα→Sortu}(nil:motive[])(cons:(head:α)→(tail:Listα)→motivetail→motive(head::tail))(t:Listα):motivetRecursor with parameters and indices
When using a predicate (that is, a function that returns a Prop) for the motive, recursors express induction.
The cases for non-recursive constructors are the base cases, and the additional arguments supplied to constructors with recursive arguments are the induction hypotheses.
Proofs in Lean are computationally irrelevant.
In other words, having been provided with some proof of a proposition, it should be impossible for a program to check which proof it has received.
This is reflected in the types of recursors for inductively defined propositions or predicates.
For these types, if there's more than one potential proof of the theorem then the motive may only return another Prop.
If the type is structured such that there's only at most one proof anyway, then the motive may return a type in any universe.
A proposition that has at most one inhabitant is called a subsingleton.
Rather than obligating users to prove that there's only one possible proof, a conservative syntactic approximation is used to check whether a proposition is a subsingleton.
Propositions that fulfill both of the following requirements are considered to be subsingletons:
There is at most one constructor.
Each of the constructor's parameter types is either a Prop, a parameter, or an index.
True is a subsingleton
True is a subsingleton because it has one constructor, and this constructor has no parameters.
Its recursor has the following signature:
True.rec.{u}{motive:True→Sortu}(intro:motiveTrue.intro)(t:True):motivetFalse is a subsingleton
False is a subsingleton because it has no constructors.
Its recursor has the following signature:
Note that the motive is an explicit parameter.
This is because it is not mentioned in any further parameters' types, so it could not be solved by unification.
And is a subsingleton
And is a subsingleton because it has one constructor, and both of the constructor's parameters' types are propositions.
Its recursor has the following signature:
And.rec.{u}{ab:Prop}{motive:a∧b→Sortu}(intro:(left:a)→(right:b)→motive(And.introleftright))(t:a∧b):motivetOr is not a subsingleton
Or is not a subsingleton because it has more than one constructor.
Its recursor has the following signature:
The motive's type indicates that Or.rec can only be used to produce proofs.
A proof of a disjunction can be used to prove something else, but there's no way for a program to inspect which of the two disjuncts was true and used for the proof.
Eq is a subsingleton
Eq is a subsingleton because it has just one constructor, Eq.refl.
This constructor instantiates Eq's index with a parameter value, so all arguments are parameters:
In addition to adding new constants to the logic, inductive type declarations also add new reduction rules.
These rules govern the interaction between recursors and constructors; specifically recursors that have constructors as their targets.
This form of reduction is called ι-reduction (iota reduction).
When the recursor's target is a constructor with no recursive parameters, the recursor application reduces to an application of the constructor's case to the constructor's arguments.
If there are recursive parameters, then these arguments to the case are found by applying the recursor to the recursive occurrence.
Inductive type declarations are subject to a number of well-formedness requirements.
These requirements ensure that Lean remains consistent as a logic when it is extended with the inductive type's new rules.
They are conservative: there exist potential inductive types that do not undermine consistency, but that these requirements nonetheless reject.
3.2.4.3.2.1. Universe Levels
Type constructors of inductive types must either inhabit a universe or a function type whose return type is a universe.
Each constructor must inhabit a function type that returns a saturated application of the inductive type.
If the inductive type's universe is Prop, then there are no further restrictions on universes, because Prop is impredicative.
If the universe is not Prop, then the following must hold for each parameter to the constructor:
If the constructor's parameter is a parameter (in the sense of parameters vs indices) of the inductive type, then this parameter's type may be no larger than the type constructor's universe.
All other constructor parameters must be smaller than the type constructor's universe.
Universes, constructors, and parameters
Either is in the greater of its arguments' universes, because both are parameters to the inductive type:
All occurrences of the type being defined in the types of the parameters of the constructors must be in strictly positive positions.
A position is strictly positive if it is not in a function's argument type (no matter how many function types are nested around it) and it is not an argument of any expression other than type constructors of inductive types.
This restriction rules out unsound inductive type definitions, at the cost of also ruling out some unproblematic ones.
Non-strictly-positive inductive types
The type Bad would make Lean inconsistent if it were not rejected:
(kernel) arg #1 of 'Bad.bad' has a non positive occurrence of the datatypes being declaredinductiveBadwhere|bad:(Bad→Bad)→Bad
(kernel) arg #1 of 'Bad.bad' has a non positive occurrence of the datatypes being declared
This is because it would be possible to write a circular argument that proves False under the assumption Bad.
Bad.bad is rejected because the constructor's parameter has type Bad→Bad, which is a function type in which Bad occurs as an argument type.
This declaration of a fixed point operator is rejected, because Fix occurs as an argument to f:
(kernel) arg #2 of 'Fix.fix' contains a non valid occurrence of the datatypes being declaredinductiveFix(f:Typeu→Typeu)where|fix:f(Fixf)→Fixf
(kernel) arg #2 of 'Fix.fix' contains a non valid occurrence of the datatypes being declared
Fix.fix is rejected because f is not a type constructor of an inductive type, but Fix itself occurs as an argument to it.
In this case, Fix is also sufficient to construct a type equivalent to Bad:
Lean rejects universe-polymorphic types that could not, in practice, be used polymorphically.
This could arise if certain instantiations of the universe parameters would cause the type itself to be a Prop.
If this type is not a subsingleton, then is recursor can only target propositions (that is, the motive must return a Prop).
These types only really make sense as Props themselves, so the universe polymorphism is probably a mistake.
Because they are largely useless, Lean's inductive type elaborator has not been designed to support these types.
When such universe-polymorphic inductive types are indeed subsingletons, it can make sense to define them.
Lean's standard library defines PUnit and PEmpty.
To define a subsingleton that can inhabit Prop or a Type, set the option bootstrap.inductiveCheckResultingUniverse to false.
by default the inductive/structure commands report an error if the resulting universe is not zero, but may be zero for some universe parameters. Reason: unless this type is a subsingleton, it is hardly what the user wants since it can only eliminate into Prop. In the Init package, we define subsingletons, and we use this option to disable the check. This option may be deleted in the future after we improve the validator
Defining a version of Bool that can be in any universe is not allowed:
inductivePBool:invalid universe polymorphic resulting type, the resulting universe is not 'Prop', but it may be 'Prop' for some parameter values:
Sort u
Possible solution: use levels of the form 'max 1 _' or '_ + 1' to ensure the universe is of the form 'Type _'.Sortuwhere|true|false
invalid universe polymorphic resulting type, the resulting universe is not 'Prop', but it may be 'Prop' for some parameter values:
Sort u
Possible solution: use levels of the form 'max 1 _' or '_ + 1' to ensure the universe is of the form 'Type _'.
3.2.4.3.3. Constructions for Termination Checking🔗
In addition to the type constructor, constructors, and recursors that Lean's core type theory prescribes for inductive types, Lean constructs a number of useful helpers.
First, the equation compiler (which translates recursive functions with pattern matching in to applications of recursors) makes use of these additional constructs:
recOn is a version of the recursor in which the target is prior to the cases for each constructor.
casesOn is a version of the recursor in which the target is prior to the cases for each constructor, and recursive arguments do not yield induction hypotheses. It expresses case analysis rather than primitive recursion.
below computes a type that, for some motive, expresses that all inhabitants of the inductive type that are subtrees of the target satisfy the motive. It transforms a motive for induction or primitive recursion into a motive for strong recursion or strong induction.
brecOn is a version of the recursor in which below is used to provide access to all subtrees, rather than just immediate recursive parameters. It represents strong induction.
noConfusion is a general statement from which injectivity and disjointness of constructors can be derived.
noConfusionType is the motive used for noConfusion that determines what the consequences of two constructors being equal would be. For separate constructors, this is False; if both constructors are the same, then the consequence is the equality of their respective parameters.
For well-founded recursion, it is frequently useful to have a generic notion of size available.
This is captured in the SizeOf class.
SizeOf is a typeclass automatically derived for every inductive type,
which equips the type with a "size" function to Nat.
The default instance defines each constructor to be 1 plus the sum of the
sizes of all the constructor fields.
This is used for proofs by well-founded induction, since every field of the
constructor has a smaller size than the constructor itself,
and in many cases this will suffice to do the proof that a recursive function
is only called on smaller values.
If the default proof strategy fails, it is recommended to supply a custom
size measure using the termination_by argument on the function definition.
An inductive type's run-time representation depends both on how many constructors it has, how many arguments each constructor takes, and whether these arguments are relevant.
An enum inductive type of at least 2 and at most 2^{32} constructors, each of which has no parameters, is represented by the first type of uint8_t, uint16_t, uint32_t that is sufficient to assign a unique value to each constructor. For example, the type Bool is represented by uint8_t, with values 0 for false and 1 for true. Find out whether this should say "no relevant parameters"
Decidableα is represented the same way as BoolAren't Decidable and Bool just special cases of the rules for trivial constructors and irrelevance?
Nat is represented by lean_object *. Its run-time value is either a pointer to an opaque arbitrary-precision integer object or, if the lowest bit of the “pointer” is 1 (checked by lean_is_scalar), an encoded unboxed natural number (lean_box/lean_unbox). Move these to FFI section or Nat chapter
Types and proofs have no run-time representation.
That is, if an inductive type is a Prop, then its values are erased prior to compilation.
Similarly, all theorem statements and types are erased.
Types with run-time representations are called relevant, while types without run-time representations are called irrelevant.
Types are irrelevant
Even though List.cons has the following signature, which indicates three parameters:
its run-time representation has only two, because the proof is erased.
In most cases, irrelevant values simply disappear from compiled code.
However, in cases where some representation is required (such as when they are arguments to polymorphic constructors), they are represented by a trivial value.
If an inductive type has exactly one constructor, and that constructor has exactly one run-time relevant parameter, then the inductive type is represented identically to its parameter.
Zero-Overhead Subtypes
The structure Subtype bundles an element of some type with a proof that it satisfies a predicate.
Its constructor takes four arguments, but three of them are irrelevant:
If an inductive type doesn't fall into one of the categories above, then its representation is determined by its constructors.
Constructors without relevant parameters are represented by their index into the list of constructors, as unboxed unsigned machine integers (scalars).
Constructors with relevant parameters are represented as an object with a header, the constructor's index, an array of pointers to other objects, and then arrays of scalar fields sorted by their types.
The header tracks the object's reference count and other necessary bookkeeping.
Recursive functions are compiled as they are in most programming languages, rather than by using the inductive type's recursor.
Elaborating recursive functions to recursors serves to provide reliable termination evidence, not executable code.
From the perspective of C, these other inductive types are represented by lean_object *.
Each constructor is stored as a lean_ctor_object, and lean_is_ctor will return true.
A lean_ctor_object stores the constructor index in its header, and the fields are stored in the m_objs portion of the object.
Lean assumes that sizeof(size_t) == sizeof(void*)—while this is not guaranteed by C, the Lean run-time system contains an assertion that fails if this is not the case.
The memory order of the fields is derived from the types and order of the fields in the declaration. They are ordered as follows:
Within each group the fields are ordered in declaration order. Warning: Trivial wrapper types still count toward a field being treated as non-scalar for this purpose.
To access fields of the first kind, use lean_ctor_get(val, i) to get the ith non-scalar field.
To access USize fields, use lean_ctor_get_usize(val, n+i) to get the ith USize field and n is the total number of fields of the first kind.
To access other scalar fields, use lean_ctor_get_uintN(val, off) or lean_ctor_get_usize(val, off) as appropriate. Here off is the byte offset of the field in the structure, starting at n*sizeof(void*) where n is the number of fields of the first two kinds.
For example, a structure such as
structureSwhereptr_1:ArrayNatusize_1:USizesc64_1:UInt64ptr_2:{x:UInt64//x>0} -- wrappers don't count as scalars
sc64_2:Float -- `Float` is 64 bit
sc8_1:Boolsc16_1:UInt16sc8_2:UInt8sc64_3:UInt64usize_2:USizeptr_3:Char -- trivial wrapper around `UInt32`
sc32_1:UInt32sc16_2:UInt16
would get re-sorted into the following memory order:
Inductive types may be mutually recursive.
Mutually recursive definitions of inductive types are specified by defining the types in a mutual ... end block.
Mutually Defined Inductive Types
The type EvenOddList in a prior example used a Boolean index to select whether the list in question should have an even or odd number of elements.
This distinction can also be expressed by the choice of one of two mutually inductive types EvenList and OddList:
The inductive types declared in a mutual block are considered as a group; they must collectively satisfy generalized versions of the well-formedness criteria for non-mutually-recursive inductive types.
This is true even if they could be defined without the mutual block, because they are not in fact mutually recursive.
Each type constructor's signature must be able to be elaborated without reference to the other inductive types in the mutual group.
In other words, the inductive types in the mutual group may not take each other as arguments.
The constructors of each inductive type may mention the other type constructors in the group in their parameter types, with restrictions that are a generalization of those for recursive occurrences in non-mutual inductive types.
Mutual inductive type constructors may not mention each other
All inductive types in the mutual group must have the same parameters.
Their indices may differ.
Differing numbers of parameters
Even though Both and OneOf are not mutually recursive, they are declared in the same mutual block and must therefore have identical parameters:
mutualinductiveBoth(α:Typeu)(β:Typev)where|mk:α→β→Bothαβinvalid inductive type, number of parameters mismatch in mutually inductive datatypesinductiveOptional(α:Typeu)where|none|some:α→Optionalαend
invalid inductive type, number of parameters mismatch in mutually inductive datatypes
Differing parameter types
Even though Many and OneOf are not mutually recursive, they are declared in the same mutual block and must therefore have identical parameters.
They both have exactly one parameter, but Many's parameter is not necessarily in the same universe as Optional's:
mutualinductiveMany(α:Type):Typeuwhere|nil:Manyα|cons:α→Manyα→Manyαinvalid mutually inductive types, parameter 'α' has type
Type u : Type (u + 1)
but is expected to have type
Type : Type 1inductiveOptional(α:Typeu)where|none|some:α→Optionalαend
invalid mutually inductive types, parameter 'α' has type
Type u : Type (u + 1)
but is expected to have type
Type : Type 1
The universe levels of each inductive type in a mutual group must obey the same requirements as non-mutually-recursive inductive types.
Additionally, all the inductive types in a mutual group must be in the same universe, which implies that their constructors are similarly limited with respect to their parameters' universes.
Universe mismatch
These mutually-inductive types are a somewhat complicated way to represent run-length encoding of a list:
Each inductive type that is defined in the mutual group may occur only strictly positively in the types of the parameters of the constructors of all the types in the group.
In other words, in the type of each parameter to each constructor in all the types of the group, none of the type constructors in the group occur to the left of any arrows, and none of them occur in argument positions unless they are an argument to an inductive type's type constructor.
Mutual strict positivity
In the following mutual group, Tm occurs in a negative position in the argument to Binding.scope:
mutual(kernel) arg #1 of 'Binding.scope' has a non positive occurrence of the datatypes being declaredinductiveTmwhere|app:Tm→Tm→Tm|lam:Binding→TminductiveBindingwhere|scope:(Tm→Tm)→Bindingend
Because Tm is part of the same mutual group, it must occur only strictly positively in the arguments to the constructors of Binding.
It occurs, however, negatively:
(kernel) arg #1 of 'Binding.scope' has a non positive occurrence of the datatypes being declared
Nested positions
The definitions of LocatedStx and Stx satisfy the positivity condition because the recursive occurrences are not to the left of any arrows and, when they are arguments, they are arguments to inductive type constructors.
Mutual inductive types are provided with primitive recursors, just like non-mutually-defined inductive types.
These recursors take into account that they must process the other types in the group, and thus will have a motive for each inductive type.
Because all inductive types in the mutual group are required to have identical parameters, the recursors still take the parameters first, abstracting them over the motives and the rest of the recursor.
Additionally, because the recursor must process the group's other types, it will require cases for each constructor of each of the types in the group.
The actual dependency structure between the types is not taken into account; even if an additional motive or constructor case is not really required due to there being fewer mutual dependencies than there could be, the generated recursor still requires them.
Mutual inductive types are represented identically to non-mutual inductive types in compiled code and in the runtime.
The restrictions on mutual inductive types exist to ensure Lean's consistency as a logic, and do not impact compiled code.