Compact reference
Overview
Compact is a strongly typed, statically typed, bounded smart contract language, designed to be used in combination with TypeScript for writing smart contracts for the three-part structure of Midnight, where contracts have the following components:
- a replicated component on a public ledger
- a zero-knowledge circuit component, confidentially proving the correctness of the former
- a local, off-chain component that can perform arbitrary code
Each Compact program (also referred to as a contract) can contain several kinds of program elements:
- module and import forms for management of namespaces and separate files,
- declarations of program-defined types,
- declarations of the data that the contract stores in the public ledger,
- declarations of witnesses, which are callback functions supplied by the TypeScript runner, and
- definitions of circuits, which are functions serving as the operational core of a smart contract, and
- the definition of at most one constructor, which is a function used to intialize the public ledger.
Compact is similar to TypeScript: it has a syntax similar to that of JavaScript, and it layers a type system over the JavaScript syntax. Compact deviates intentionally from TypeScript, however, in several important ways:
- Unlike TypeScript, Compact is strongly typed. Programs cannot bypass the static type system via missing type declarations or unsafe casts. Furthermore, the JavaScript code produced by the Compact compiler includes run-time checks enforcing the static types of values that come from outside Compact as well as preventing external application of a Compact circuit with more or fewer than the declared number of arguments.
- Compact provides namespace management via static rather than dynamic modules, and these modules can be parameterized via compile-time generic parameters, which include size as well as type parameters.
- Because every Compact program must compile into a set of finite proving circuits, all Compact types have sizes that are fixed at compile time, loops are bounded either by constant bounds or by the size of an object of constant size, and recursion is disallowed.
- Compact numeric values are limited to unsigned integers either with a program-declared range or with a range determined by the field size of the target proving system.
- Compact distinguishes certain values as potentially containing private data that
should typically be protected, and it requires explicit declaration of the
disclosure of any potential private data via
disclose()wrappers. The basics of this are discussed in the description ofdisclose, and a more thorough description and discussion appears in the separate document Explicit disclosure in Compact.
Like TypeScript, Compact compiles into JavaScript, but it also produces a TypeScript definition file so that it effectively also compiles into TypeScript. It produces separate TypeScript definition files and JavaScript implementation files rather than simply producing TypeScript for three reasons:
- to allow compiled Compact programs to be used without requiring an additional TypeScript compilation step,
- to permit the generated code to check function argument counts without disabling compile-time argument-type checks when called from TypeScript, and
- so that it can generate a sourcemap file that properly maps elements of the generated JavaScript code (e.g., variable bindings and references) to the corresponding elements of the source Compact code.
For each circuit that touches the public ledger and hence requires a proof for on-chain execution, the Compact compiler also produces proving circuits in a zero-knowledge intermediate language (zkir), and it uses a zkir compiler to produce proving keys for each such circuit.
Finally, the compact compiler also produces a JSON-formatted contract info file that contains information about the program and its compiled representation, including version numbers and the types and characteristics of the contract's exported circuits.
This document explains each syntactic category individually. It starts by introducing the building blocks that are used in various contexts: identifiers, constants, types, generic parameters, and patterns. Then it describes the structure of Compact programs, each kind of program element, and the statements and expressions that can appear within circuit and constructor bodies. Finally, it discusses the TypeScript target.
Writing a contract provides a small example of what a Compact program looks like. It also introduces the basic building blocks of a Compact contract. The full grammar of Compact is provided separately.
Notation
The syntax of Compact programs is given by EBNF grammar snippets that use the following notational conventions:
- Keywords and punctuation are in
monospacedfont. - Terminal and nonterminal names are in emphasized font.
- Alternation is indicated by a vertical bar (
|). - Optional items are indicated by the superscript opt.
- Repetition is specified by ellipses.
The notation X ⋯ X, where X is a grammar symbol, represents zero
or more occurrences of X.
The notation X
,⋯,X, where X is a grammar symbol and,is a literal comma, represents zero or more occurrences of X separated by commas. In either case, when the ellipsis is marked with the superscript 1, i.e., ⋯¹, the notation represents a sequence containing at least one X. When such a sequence is followed by,opt, an optional trailing comma is allowed, but only if there is at least one X. For example, id ⋯ id represents zero or more ids, and expr,⋯¹,expr,opt represents one or more comma-separated exprs possibly followed by an extra comma. The rules involving commas apply equally to semicolons, i.e., apply when,is replaced by;.
Every program is formed of characters that are organized into atomic
sequences of characters known as tokens.
Each keyword and punctionation symbol appearing in the grammar snippets
represents itself exactly, i.e., represents the token consisting of
the same sequence of characters.
For example, when the keyword circuit appears in a grammar snippet,
it matches only the token circuit, and when the punctuation symbol
: appears, it matches only the token :.
Each terminal name appearing in the grammar snippets represents a set of possible tokens. For example, the terminal name id represents the set of all identifiers: when id appears in a grammar snippet, it matches any identifier. The sets of tokens represented by the terminal names appearing in the grammar snippets are described in Terminal names below.
Each nonterminal name appearing in the grammar snippets represents a
sequence of tokens that comprise some structured piece of a program.
For example, the terminal name expr matches any sequence of tokens
that can be interpreted as an expression, such as 3 + x or
a ? b : c.
The set of structures represented by each nonterminal name is given
in the various sections of this reference manual along with typing
and evaluation rules where appropriate.
For example, the structure of a circuit-definition is described
in Circuits definitions.
Terminal names
The following terminal names appear in the grammar snippets.
-
id, module-name, function-name, struct-name, enum-name, contract-name, tvar-name, and type-name all represent identifier tokens.
-
nat represents natural-number literals.
-
str and file represent string literals.
-
version represents version literals (in pragmas).
While identifiers and string literals are each represented by more than one name, each representes the entire set of possible identifier or string-literal tokens. The grammar snippets use different terminal names only to suggest their use, e.g, module-name for module names versus tvar-name for type variable names.
Static and dynamic errors
The compiler detects various kinds of static errors, e.g., malformed syntax, references to undefined identifiers, and type mismatches. When it detects one or more static errors, it prints descriptive error messages for the errors and terminates without generating any output.
The code the compiler generates and the run-time libraries it uses detect various kinds of dynamic errors, e.g., attempts from code outside of Compact to call Compact circuits with wrong numbers or types of arguments. These errors are reported when the generated code is run.
Identifiers, bindings, and scope
Identifiers are used in Compact, as in most other programming languages, to
name things.
Syntactally, an identifier is a token (atomic sequence of characters), beginning
with with an alphabetic character, a dollar sign ($), or an underscore (_)
followed by one or more alphabetic characters, digits (0 - 9), dollar signs,
or underscores.
Some identifiers are reserved words.
Of these, some are used as keywords in the syntax of the Compact language, e.g.,
module, import, circuit, and for.
Others, specifically keywords reserved by JavaScript and TypeScript, are considered
reserved for future use in Compact, e.g., self and class.
Still others, specifically every identifier that begins with __compact,
are reserved for use by the compiler.
A comprehensive list of keywords and reserved words is given
in keywords and reserved words.
The remaining identifiers can be used to name specific instances of various kinds of entities, including modules, types, generic parameters, ledger fields, function (circuit or witness) names, function parameters, and local variables. An identifier associated with, i.e., bound to, one of these entities can be referenced anywhere within the scope of the binding. Compact is lexically scoped, so the scope of each binding is limited to a specific region of the program text. The binding might be shadowed (hidden from view) in some region of program text within its scope that contains a binding for the same identifier.
It is a static error for an identifier to be bound more than once in a same scope, except that function overloading allows more than one function with the same name to be visible in the same scope with different signatures, i.e., different numbers or kinds of generic parameters and/or different numbers or types of run-time parameters.
The scope of each binding depends upon where it appears, as described below. (The caveat "except where shadowed" is not explicitly stated but applies in each case.)
- Identifiers bound at the outermost level of a contract (refered to as the top level) are visible throughout the contract, but not within any modules that are imported from separate files.
- Identifiers bound at the outermost level of a module are visible throughout the module. They are not visible outside of the module unless exported: any exported binding is also visible if and where it is imported from the module.
- The generic parameters of a module, structure declaration, or function declaration are visible throughout the declaration.
- The run-time parameters of a circuit or constructor are visible within its body.
- Identifiers defined by a
constbinding inside a block are visible throughout the block. - Identifiers defined by a
constbinding inside afor-loop header are visible throughout theforloop.
Every reference to an identifier must appear within the scope of a binding for the identifier, in which case we say that the identifier is bound to the entity associated with the identifier by that binding. Otherwise, the reference is a static error.
For example:
circuit c(): Field {
const answer = 42;
{
const answer = 12;
assert(answer != 42, "shadowing did not work!");
}
return answer; // returns 42 (the outer 'answer')
}
The identifier c is bound to the circuit named c, and this binding is visible
throughout the contract, though no references to c appear in the example.
The first (outer) binding of the identifier answer to the value 42 is visible
throughout the body of c except where shadowed by the second (inner) binding of
answer within in the inner block, so the reference to answer in return answer
evaluates to 42.
The second (inner) binding of answer to 42 is visible throughout the inner
block, so the reference to answer in answer != 42 evaluates to 12.
In addition to having a scope, every binding also has a lifetime. For circuit and witness bindings, the lifetime is effectively permanent, i.e., the binding is always available for use whenever the program is run.
The lifetimes of ledger-field bindings begin when they are first initialized and are effectively permanant from that point on; although the value of a field can change over time, the association of the ledger-field name with the ledger field's location in the (replicated) public state of a contract never changes.
On the other hand, bindings for module names, type names, and generic parameters are live only when a program is compiled, i.e., they help determine the structure of the program and the shape of the data used by the program but are not needed once the program has been compiled. (TypeScript bindings for type names exported from the program's top level do live on, however, in the generated TypeScript definition file.)
Variable bindings, i.e., bindings of circuit parameters, constructor parameters,
and local variables bound by const statements and for loops, have dynamic
lifetimes.
The bindings of a circuit's or constructor's parameters start a new lifetime
when the circuit or constructor is called that ends when the circuit or constructor
exits.
A variable binding established by a const statement starts a new lifetime when
the const statement is evaluated that ends when the block containing the const
statement exits.
A const binding established by the const subform of a for-loop header starts
a new lifetime on each iteration of the loop that ends when that iteration ends.
Variable bindings can have multiple lifetimes, because a circuit might be called
multiple times, a block might be evaluated multiple times, and a for loop might
be evaluated multiple times and/or have multiple iterations.
Variables in Compact are immutable, however: they have the same value over the
entire lifetime of the variable's binding.
Thus, they are referred to as variables not because their values can vary over
any single lifetime but because they can have different values in different
lifetimes.
Generic parameters and arguments
Various entities, specifically module declarations, structure declarations, type-alias declarations, circuit definitions, and witness declarations, can have generic parameters, i.e., compile-time type and numeric (natural-number) parameters whose values are given at the use site rather than fixed at the point of declaration. This allows the same generic code to be used with different specific types, bounds, and sizes. Except where shadowed, generic parameters are visible thoughout the entire entity. In particular, the generic parameters of a module are visible within the program elements that appear within the body of the module.
When present, generic parameters are enclosed in angle brackets following the
name of the generic entity (module, structure, type alias, circuit, or witness).
Each parameter is either a type name (e.g., T) or a hash-prefixed natural-number name
(e.g., #N).
Generic natural-number parameters are prefixed by # to distingish them from
generic type parameters.
| gparams | → | < generic-param , ⋯ , generic-param ,opt > |
| generic-param | → | # tvar-name |
| | | tvar-name |
Generic entities must be specialized at the point of use to produce non-generic entities at compile time by supplying them with generic arguments. Any attempt to use a generic entity without specializing it is a static error. Generic arguments are also enclosed in angle brackets. Each generic argument must be a type, a natural number literal, or the type or numeric value of a generic parameter.
| gargs | → | < garg , ⋯ , garg ,opt > |
| garg | → | nat |
| | | type |
The syntax of types allows for type references, including references to generic parameters, so any generic argument can pass along the value of a generic type or natural-number parameter that is visible at the point of specialization.
The # used to distinguish generic natural-number parameters from generic type parameters
need not and must not used at the point of specialization.
It is a static error, however, if a generic argument supplied for a generic
parameter is not numeric when a numeric value is expected or is not a type when
a type is expected.
The example below demonstrates the use of two levels of generic parameterization, one at the module level and one at the circuit level.
module M<T, #N> {
export circuit foo<A>(x: T, v: Vector<N, A>): Vector<N, [A, T]> {
return map((y) => [y, x], v);
}
}
import M<Boolean, 3>;
export circuit bar1(): Vector<3, [Uint<8>, Boolean]> {
return foo<Uint<8>>(true, [101, 103, 107]);
}
export circuit bar2(): Vector<3, [Boolean, Boolean]> {
return foo<Boolean>(false, [true, false, true]);
}
The body of circuit foo is generic with respect to the module's type parameters
T and N as well as to the circuit's own parameter A.
The module is specialized at the point of import, while the circuit is specialized
at the point of call (in both bar1 and bar2).
Compact types
Compact is statically typed: every expression in a Compact program must have
a static type.
For named circuits and witnesses, the parameters and return types must be explicitly
declared.
For anonymous circuit expressions, the parameters and return types do not need to
be declared but can be.
The types of const bindings can also be declared or not.
The language is strongly typed: the compiler rejects programs that do not type check. For example, it rejects programs in which a circuit or witness with a parameter type annotation is called with an incorrectly typed argument for that parameter, and it rejects programs where a circuit with a return-type annotation returns an incorrectly typed value. If an optional type annotation is omitted, the compiler attempts to infer a type and it rejects the program if no such type can be inferred.
Types consist of built-in primitive types, ledger-state types, program-defined types,
and references to generic type parameters in scope.
When the term "type" occurs in this document without any other qualifier, it means
either a primitive type, ledger-state type, a program-defined type, or a generic type
parameter in scope.
The use of ledger-state types is, at present, limited to typing the result of
default<T> to obtain the default value of type T, and only constant bindings
can have a ledger-state type.
A generic type is not a valid type and so cannot, for example, be used as the type of a parameter or return value. Any attempt to do so is a static error. As with any other generic entity, it must be specialized at the point of use.
Primitive types
The following are the primitive types of Compact:
| type | → | tref |
| | | Boolean | |
| | | Field | |
| | | Uint < tsize > | |
| | | Uint < tsize .. tsize > | |
| | | Bytes < tsize > | |
| | | Opaque < str > | |
| | | Vector < tsize , type > | |
| | | [ type , ⋯ , type ,opt ] | |
| tref | → | id gargsopt |
| tsize | → | nat |
| | | id |
-
Booleanis the type of Boolean values. There are only two values ofBooleantype. They are the values of the expressionstrueandfalse. -
Uint<m..n>, wheremis the literal0or generic natural-number parameter bound to0, and wherenis a non-zero natural number literal or a generic natural-number parameter bound to a non-zero natural number, is the type of bounded unsigned integer values between0(inclusive) andn(exclusive). (While the lower bound is currently required to be0, this restriction might be lifted at some point.)Uinttypes with different upper bounds are different types, although the one with the lower bound is a subtype of the other. The compiler and run-time system might impose a limit on the range of supported unsigned integer values. If so, it is a static error whenever aUinttype includes values that exceed this limit. The current limit, if any, is given in Implementation-specific limits. -
Uint<n>, wherenis a non-zero natural number literal or generic natural-number parameter bound to a non-zero natural number, is the type of sized unsigned integer values with binary representations using up tonbits. This is the same type asUint<0..m>wheremis equal to2^n. Sized integer types can be seen as a convenience for programmers.Uint<32>, for example, can be more obvious and less error-prone than the equivalentUint<0..4294967295>. Any Compact program that uses sized integer types can be rewritten to one that uses only bounded integer types, but the converse is not true. -
Fieldrepresents the set of unsigned integer values up to the maximum value of the scalar prime field of the ZK proving system. The current maximum field value is given in Implementation-specific limits. -
[T,⋯], whereT, ⋯ are zero or more comma-separated types, is the type of tuple values with element typesT, ⋯. Tuples are heterogeneous: any element type can differ from any of the others. The length of a tuple type is the number of element types. Two tuple types with different lengths are different types. Two tuple types where any element type of one differs from the corresponding element type of the other are also different types, though one of the tuple types might be a subtype of the other. -
Vector<n, T>, wherenis a natural number literal or generic natural-number parameter andTis a type, is a shorthand notation for the tuple type[T,⋯]withnoccurrences of the typeT. Note that a vector type and the corresponding tuple type are two different ways of writing exactly the same type. Unless otherwise specified, type rules for vector types are derived from the rules for the corresponding tuple type. -
Bytes<n>, wherenis a natural number literal or a generic natural-number parameter, is the type of byte vectors of lengthn.Bytestypes with different lengths are different types.Bytestypes are used in the Compact standard library for hashing. String literals in Compact also have aBytestype, wherenis the number of bytes in the UTF-8 encoding of the string. -
Opaque<s>, wheresis a string literal, is the type of opaque values with tags.Opaquetypes with different tags are different types. Opaque values can be manipulated in witnesses but they are opaque to circuits. They are represented in circuits as their hash. The only tags currently allowed are"string"and"Uint8Array".
Program-defined types
Programs can define three kinds of new types: structures, enumerations, and contracts. They can also define structural and nominal aliases for existing types.
Structure types
Structure types are defined via struct declarations with the following form:
| struct-declaration | → | exportopt struct struct-name gparamsopt { typed-identifier ; ⋯ ; typed-identifier ;opt } ;opt |
| | | exportopt struct struct-name gparamsopt { typed-identifier , ⋯ , typed-identifier ,opt } ;opt | |
| typed-identifier | → | id : type |
A structure declaration has a sequence of named fields which must be separated either by commas or by semicolons. Comma and semicolon separators cannot be mixed within a single structure declaration. A trailing separator is allowed, but not required.
Each structure field must have a type annotation. Here are a couple of examples:
struct Thing {
triple: Vector<3, Field>,
flag: Boolean,
}
struct NumberAnd<T> {
num: Uint<32>;
item: T
}
The first declaration introduces a structure type named Thing with two fields:
triple (a vector with 3 field elements) and flag (a Boolean).
The second introduces a generic structure type named NumberAnd with generic
parameter T and two fields: num (a 32-bit unsigned integer) and item
(a value of type T).
Generic structure types are not fixed types and must eventually be specialized
by supplying generic arguments, e.g., NumberAnd<Uint<8>>.
When any generic structure type is specialized, it must be fully specialized:
the number of supplied generic arguments must match the number of declared
generic parameters.
The effect of specializing a generic structure type is to produce the same type
as one in which the generic parameters are replaced by the generic argument
values.
For example, NumberAnd<Uint<8>> is equivalent to NumberAnd if NumberAnd
had been defined by:
struct NumberAnd {
num: Uint<32>;
item: Uint<8>
}
It is possible and common for a generic structure type to be specialized via different generic arguments to produce different specialized structure types in different parts of a program.
Structure typing is always nominal: two types are equivalent only if they have
the same names and same fields.
They are distinct if they have different names even if they have the same fields.
More precisely: each structure type is the same as any other structure type
that has the same name, same element names (in the same order), and same element
types (in the same order).
It is distinct from every other type.
This means, for example, that the following program is well-typed:
module M {
struct NumberAnd {
num: Uint<32>;
item: Uint<8>
}
export circuit bar(x: NumberAnd): NumberAnd {
return x;
}
}
import M;
struct NumberAnd<T> {
num: Uint<32>;
item: T
}
export circuit foo(x: NumberAnd<Uint<8>>): NumberAnd<Uint<8>> {
return bar(x);
}
Structure types must not be recursive, i.e., they cannot contain elements of the same type as the structure, either directly or indirectly. An attempt to define a recursive structure type is a static error. For example, it is a static error to use the following pair of declarations:
struct Even {
predecessor: Odd
}
struct Odd {
predecessor: Even
}
export circuit doesntWork(s: Even): Odd {
return s.predecessor;
}
Values of structure types are created with structure-creation expressions and accessed via structure-member-access expressions.
Enumeration types
Enumeration types are defined via enum declarations with the following form:
| enum-declaration | → | exportopt enum enum-name { id , ⋯¹ , id ,opt } ;opt |
An enumeration declaration has a non-empty sequence of named elements separated by commas. A trailing comma is allowed but not required.
An enumeration declaration introduces a named enumeration type, such as Arrow
in the example below:
enum Arrow { up, down, left, right };
Within the scope of this declaration, a value of type Arrow can have one of
three values, selected via Arrow.up, Arrow.down, Arrow.left, and Arrow.right.
Two enumeration types are the same if they have the same name and the same element names (in the same order) and distinct otherwise.
Contract types
Compact 1.0 does not fully implement declarations of contracts and the
cross-contract calls they support, but the keyword contract used to declare
contracts is reserved for this use.
Type aliases
Type aliases can be created via type declarations of the form:
| type-alias-declaration | → | exportopt newopt type type-name gparamsopt = type ; |
Within the scope of a type-alias declaration of type-name for type, type-name
is itself a type.
Type aliases are either structural or nominal, depending on whether the optional
new modifier is present:
- A type alias type-name for type declared without the optional
newmodifier is a structural type alias, i.e., type-name is the same type and is fully interchangeable with type. - A type alias type-name for type declared with the optional
newmodifier is a nominal type alias, i.e., type-name is a distinct type compatible with type but neither a subtype of nor a supertype of type (or any other type).
Any nominal type alias type-name for some type type is compatible with type in the following senses:
- values of type type-name have the same representation as values of type type
- values of type type-name can be used by primitive operations that require a value of type type
- values of type type-name can be cast explicitly to type, and
- values of type type can be cast explicitly to type type-name.
For example, within the scope of
new type V3U16 = Vector<3, Uint<16>>
a value of type V3U16 can be referenced or sliced just like a vector
of type Vector<3, Uint<16>>, but it cannot, for example, be passed to a function
that expects a value of type Vector<3, Uint<16>> without an explicit cast.
When one operand of an arithmetic operation (e.g., +) receives a value of some
nominal type alias type-name, the other operand must also be of type type-name,
and the result of performing the operation is cast to type type-name.
This might cause a run-time error if the result cannot be represented by type
type-name.
Values of any nominal type alias type-name cannot be compared directly using,
e.g., <, or ==, with values of any other type, including with values of
type type.
Such comparisons require one of the operands to be cast to the type of the other.
Both structural and nominal type aliases can take generic parameters, e.g.:
type V3<T> = Vector<3, T>;
and
new type VField<#N> = Vector<N, Field>;
When a generic nominal type is specialized, the specialized type is a nominal type.
Subtyping and least upper bounds
Some Compact types are related to other types via subtyping.
Informally, if a type T is a subtype of a type S (equivalently, S is a
supertype of type T), then every value of type T is also a value of type S,
i.e., any value of type T can be used where a value of type S is expected without
the need for an explicit cast.
For example, a circuit or witness can be called with argument expressions whose
types are subtypes of the corresponding parameter type annotations, and
a constant binding statement with a type annotation can be initialized with an
expression whose type is a subtype of the type annotation.
Subtyping is defined by the following rules:
- Any type
Tis a subtype of itself (subtyping is reflexive) Uint<0..n>is a subtype ofUint<0..m>ifnis less than (or equal to)mUint<0..n>is a subtype ofFieldfor alln- The tuple type
[T,⋯]is a subtype of the tuple type[S,⋯]if they have the same length and each typeTis a subtype of the corresponding typeS.
The least upper bound (with respect to subtyping) of a non-empty set of types
{T1, ⋯, Tn} is a type S such that:
Sis an upper bound:Tiis a subtype ofSfor alliin the range 1..n, andSis the least upper bound: for all upper boundsRof the set of types {T1, ⋯,Tn},Sis a subtype ofR.
Note that least upper bounds do not exist for all sets of types, because
some types (such as Boolean and Field) are not related.
Tuple and vector types: Every vector type has an equivalent tuple type:
Vector<n, T> is equivalent to (in fact, the same type as) [T1, ⋯, Tn].
The converse is not always true.
For example, neither [Boolean, Field] nor [Uint<8>, Uint<16>] have an
equivalent vector type.
We say, however, that a tuple type [T1, ⋯, Tn] with possibly distinct types T1, ⋯, Tn
"has a vector type" if the least upper bound S of the set of types {T1, ⋯, Tn}
exists.
In that case, the tuple type has the vector type Vector<n, S>.
Some operations over tuples (such as mapping and folding) require the tuple type
to have a vector type.
When a tuple type has a vector type, the tuple type is a subtype of the vector
type, but it might not be the same as the vector type.
For example, [Uint<16>, Uint<16>] has the vector type Vector<2, Uint<16>>,
and the two types are equivalent, whereas [Uint<8>, Uint<16>] also has the
vector type Vector<2, Uint<16>>, but the types are not equivalent.
Every vector type is a subtype of the equivalent tuple type and possibly of some
other tuple types.
In general, a vector type Vector<n, T> is a subtype of a tuple type [S1, ⋯, Sn]
if T is a subtype of each of the types S1, ⋯, Sn.
The means, for instance, that a vector can often be passed to a circuit where a
tuple is expected.
Patterns and destructuring
The parameters of a circuit or constructor and the target of a const binding
are specified via patterns:
| pattern | → | id |
| | | [ patternopt , ⋯ , patternopt ,opt ] | |
| | | { pattern-struct-elt , ⋯ , pattern-struct-elt ,opt } | |
| pattern-struct-elt | → | id |
| | | id : pattern |
In its simplest form, a pattern is just an identifier.
For example, in the code below, the parameter of sumTuple is the identifier
x and the targets of the two const bindings are the identifiers a and b.
circuit sumTuple(x: [Field, Field]): Field {
const a = x[0], b = x[1];
return a + b;
}
When the parameter type is a tuple, vector, or struct, it is often convenient to use one of the destructuring forms of patterns to name individual pieces of the tuple or struct at the point of binding rather than extracting them at each point of use. For example, one could replace the above with:
circuit sumTuple(x: [Field, Field]): Field {
const [a, b] = x;
return a + b;
}
or more simply with:
circuit sumTuple([a, b]: [Field, Field]): Field {
return a + b;
}
Here is a similar example that destructures a struct instead of a tuple:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: S): Field {
return x + y;
}
Whereas the elements of a tuple pattern are necessarily given in order, the
elements of a struct pattern need not be consistent with the order of the fields
in the declaration.
For example, the definition of sumStruct below is equivalent to the one above,
even though the order of the pattern elements has been swapped:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({y, x}: S): Field {
return x + y;
}
By default, the names bound by the pattern are the same as the names of the structure elements. When this is not convenient, it is possible to choose different names for the structure elements:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x: a, y}: S): Field {
return a + y;
}
While x: a looks like an identifier with a type annotation, in this context
it simply indicates that a rather than x is bound to the value in the
x field.
Patterns can be arbitrarily nested, e.g.:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumTupleStruct([{x: a1, y: b1}, {x: a2, y: b2}]: [S, S]): Field {
return a1 + b1 + a2 + b2;
}
It is permissible and sometimes useful to not name certain parts of the tuple or struct.
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , {y: b3}]: [S, S, S]): Field {
return b1 + b3;
}
Here the input is a tuple with three elements, but the pattern skips the
second by putting two commas between the first and third.
Similarly, while each element of the tuple is a struct with both x and y
fields, the pattern ignores the x fields simply by failing to mention them.
It is a static error if a pattern implies a different shape from the declared or inferred type of value to be destructured. For example:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: [Field, Field]): Field {
return x + y;
}
fails because it tries to treat a tuple as a struct, while:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , , {y: b3}]: [S, S, S]): Field {
return b1 + b3;
}
fails because it implies that the input tuple has four elements (including two skipped elements) when it actually has only three, and:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1}, , {z: b3}]: [S, S, S]): Field {
return b1 + b3;
}
fails because it tries to name a nonexistent z field in one of the structs.
Trailing commas in a pattern imply nothing about the structure of the input and are ignored:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumSomeYs([{y: b1,}, , {y: b3,},]: [S, S, S]): Field {
return b1 + b3;
}
Programs
A compact program is a sequence of zero or more program elements.
| program | → | pelt ⋯ pelt |
| pelt | → | pragma-form |
| | | module-definition | |
| | | export-form | |
| | | import-form | |
| | | include-form | |
| | | struct-declaration | |
| | | enum-declaration | |
| | | contract-declaration | |
| | | type-alias-declaration | |
| | | witness-declaration | |
| | | ledger-declaration | |
| | | constructor-definition | |
| | | circuit-definition |
Briefly:
- A pragma form allows the program to declare the version of the compiler and/or the language that it requires
- A module definition defines a Compact module, which also contains a sequence of program elements in its own nested scope.
- An export form exports bindings from a module or from the program itself.
- An import form imports bindings from a Compact module.
- An include form allows program elements to be included from other Compact programs
- A structure definition defines a structure type.
- An enumeration definition define an enumeration type.
- A contract declaration declares a contract type.
- A type alias definition defines a type alias, possibly creating a distinct type.
- A witness declaration declare a witness, which is a function provided by the Dapp.
- A ledger declaration declares one field of the contract's public state.
- A constructor definition defines the contract's constructor, if any.
- A circuit definition defines a circuit.
The order of program elements in a program or module is unimportant, except that any module must be defined before any import of the module, and any program-defined types used as generic parameters by an import form must be defined before the import form.
Detailed descriptions of struct, enum, contract, and type-alias declarations appear in Compact types above. Detailed descriptions of the remaining program elements are described in the following section.
Pragmas
A pragma declares a constraint on either the compiler version or the language
version. The valid identifiers for the language and compiler versions are
language_version and compiler_version, respectively.
| pragma-form | → | pragma id version-expr ; |
| version-expr | → | version-expr || version-expr0 |
| | | version-expr0 | |
| version-expr0 | → | version-expr0 && version-term |
| | | version-term | |
| version-term | → | version-atom |
| | | ! version-atom | |
| | | < version-atom | |
| | | <= version-atom | |
| | | >= version-atom | |
| | | > version-atom | |
| | | ( version-expr ) | |
| version-atom | → | nat |
| | | version |
Modules, exports, and imports
Modules in Compact are used for namespace management and also possibly to split
programs into multiple files.
A module is a named collection of program elements created via a module definition:
| module-definition | → | exportopt module module-name gparamsopt { pelt … pelt } |
A module definition makes a binding from module-name to the module visible in
the program or module containing the module definition.
Any bindings established by program elements within the module are not made
visible, at least not until the module is imported.
A module can have generic parameters, in which case it is a generic module and must be specialized with generic arguments at the point of import.
Exports
By default, identifier bindings created by the program elements within the body
of a module are visible only within the module, i.e., they are not exported from
the module.
Any identifier defined at or imported into the top level of a module can be exported
from the module in one of two ways: (1) by prefixing the definition with the
export keyword, or by listing the identifier in a separate export declaration:
| export-form | → | export { id , ⋯ , id ,opt } ;opt |
For example, the following module exports G and S but not F.
module M {
export { G };
export struct S { x: Uint<16>, y: Boolean }
circuit F(s: S): Boolean {
return s.y;
}
circuit G(s: S): Uint<16> {
return F(s) ? s.x : 0;
}
}
Exporting a binding from a module has no effect unless the module is imported.
Imports
A module can be imported into another module or into the program top level, making some or all of its exported bindings visible there, potentially with a prefix.
| import-form | → | import import-selectionopt import-name gargsopt import-prefixopt ; |
| import-selection | → | { import-element , ⋯ , import-element ,opt } from |
| import-element | → | id |
| | | id as id | |
| import-name | → | id |
| | | file | |
| import-prefix | → | prefix id |
For example:
module Runner {
export circuit start(): [] {}
export circuit stop(): [] {}
}
module UseRunner1 {
import Runner;
// start and stop are now in scope
}
module UseRunner2 {
import { start } from Runner;
// start is now in scope, but not stop
}
module UseRunner3 {
import Runner prefix Runner$;
// Runner$start and Runner$stop are now in scope, but not stop or run
}
and
module Identity<T> {
export { id }
circuit id(x: T): T {
return x;
}
}
import Identity<Field>;
// id is now in scope, specialized to type Field
When an import for some module M appears before any visible definition
of M, the module is assumed to reside in the filesystem, and it is imported
directly from there.
If M is an identifier M, a definition for module M must be contained within
the file M.compact in the same directory as the importing file or in one of the
directories in the compact path.
If M is a string "{prefix/}M", where {prefix/} is either empty or is
a pathname ending in a directory separator, a definition for a module named
M must be contained within a file M.compact that is:
- (a) if
{prefix/}M.compactis an absolute pathname, then exactly at{prefix/}M.compact, otherwise - (b) at
{prefix/}M.compactrelative to the directory of the importing file or to one of the directories in the compact path. Details on the search order and the mechanism for setting the compact path are given in Compiler Usage.
In any of these cases, it is a static error if M.compact is not found, if it
does not contain a definition for a module named M, or if it contains anything
else other than comments and whitespace.
Several examples follow.
Example 1: The file M.compact below contains a single module definition:
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
Then, test1.compact import M from M.compact:
import M;
export { F };
whereas test2.compact uses its own definition of M:
module M {
export { G };
export struct S { x: Uint<16>, y: Boolean }
circuit G(x: S): Boolean {
return x.y;
}
}
import M;
export { G };
Importing by a pathname allows multiple modules with the same name to be imported into the same scope. For example:
The file M.compact below contains a single module definition, as before:
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
and A/M.compact contains a different module definition:
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
Then the program test.compact can define M and import all three of
M, "M", and "A/M":
module M {
export { F };
export struct S { x: Uint<16>, y: Boolean }
circuit F(x: S): Boolean {
return x.y;
}
}
import M prefix M1$;
import "M" prefix M2$;
import "A/M" prefix M3$;
export { M1$F, M2$F, M3$F };
The compact standard library
Compact's standard library can be imported by import CompactStandardLibrary.
The standard library defines a number of useful types and circuits along with
ledger-state types such as Counter, Map, and MerkleTree.
Top-level exports
Certain kinds of program elements can be exported from a contract's top level, namely circuits, program-defined types, and ledger fields. Exporting them makes them visible outside of the contract, i.e., to the TypeScript driver for the smart contract.
The circuits exported at the top level of a contract (i.e., not merely exported from a module) are the entry points of the contract. (A Compact program has no "main" entry point, but is more similar to a library containing multiple entry points that share a common store.) Although multiple circuits with the same name are allowed generally to support function overloading, it is a static error if more than one circuit with the same name is exported from the top level. It is also a static error for a generic circuit, i.e., one with generic parameters, to be exported from the top level.
Program-defined types exported from the top level of the main file can be used to describe the argument and return types of witnesses and exported circuits; these may accept generic arguments, but generic arguments marked as sizes rather than types are dropped in the exported type. For example:
export struct S<#n, T> { v: Vector<n, T>; curidx: Uint<0..n> }
is exported for use as a TypeScript type with the T parameter but not the n
parameter, i.e.,:
export type S<T> = { v: T[]; curidx: bigint }
Ledger field names exported from the top level are visible for direct inspection
by code outside of the contract via the generated TypeScript ledger() function.
It is a static error to export any other kind of binding from the top level.
Include files
Compact allows programs and modules to be split into multiple files and spliced
togther via include forms, which have the following syntax, where file is a
string literal specifying a filesystem pathname for file to be included:
| include-form | → | include file ; |
file can be an absolute pathname, one that is relative to the directory
of the including file, or one that is relative to one of the directories in the
compact path.
Details on the search order and the mechanism for setting the compact path are
given in Compiler Usage.
It is a static error if the file is not present or cannot be read.
If present and readable, the file must contain a sequence of syntactically
valid program elements, and these elements are treated as if they had
been present in the including file in place of the include form.
Declaring witnesses for private state management
A user's private state should be maintained in some secure way by the TypeScript driver of a smart contract and never stored directly in the public state of the contract. A contract must sometimes prove something about some piece of private state, however, as well as cause an update to the private state.
The TypeScript driver of the smart contract can provide pieces of private state to the contract via the arguments of some exported circuit, and it can update the private state based on the return values of the exported circuit.
A circuit can also access or update private state as it operates via witnesses. Witnesses are callback functions provided by the TypeScript driver.
Witnesses must be declared to make them visible to the circuits of a contract. A witness declaration does not include a body, because the implementation is provided by the TypeScript driver.
| witness-declaration | → | exportopt witness id gparamsopt simple-parameter-list : type ; |
| simple-parameter-list | → | ( typed-identifier , ⋯ , typed-identifier ,opt ) |
| typed-identifier | → | id : type |
Witness declarations can appear anywhere among the program elements of a module or the contract's top level.
For instance:
witness W(x: Uint<16>): Bytes<32>;
defines a witness W, to which the contract provides a 16-bit unsigned value
and from which the contract receives 32 bytes of some presumably private data.
Do not assume in your contract that the code of any witness function
is the code that you wrote in your own implementation. Any DApp may
provide any implementation that it wants for your witness functions.
Results from them should be treated as untrusted input.
Declaring and maintaining public state
A contract declares the shape of its public state through ledger declarations.
Each ledger declaration defines one piece of information that the contract might store in the public ledger. Multiple ledger declarations can appear in a program, or none. They can appear anywhere among the program elements of a module or the contract's top level.
| ledger-declaration | → | exportopt sealedopt ledger id : type ; |
A ledger declaration binds a ledger field name to one of a set of predefined ledger-state types. For instance:
import CompactStandardLibrary;
ledger val: Field;
export ledger cnt: Counter;
sealed ledger u8list: List<Uint<8>>;
export sealed ledger mapping: Map<Boolean, Field>;
Ledger-state types
The following ledger-state types are supported.
T, for any regular Compact typeTCounterSet<T>, for any Compact typeTMap<K, T>, for any Compact typesKandTMap<K, V>, for any Compact typeKand ledger-state typeV(see the following section)List<T>, for any Compact typeTMerkleTree<n, T>, for a compile time integer1 < n <= 32, and any Compact typeTHistoricMerkleTree<n, T>, for a compile time integer1 < n <= 32, and any Compact typeT
Each ledger type supports a set of operations, which can be invoked with
<field name>.<operation>(<arguments ⋯>)
A ledger field that is declared with a Compact type T implicitly has the type
Cell<T>, which supports several operations, including read, write, and
reset_to_default.
For example:
ledger F: Uint<16>;
export circuit putF(x: Uint<16>): [] {
F.write(disclose(x));
}
export circuit getF(): Uint<16> {
return F.read();
}
The read operation of the Cell ledger-state type can be abbreviated to,
simply, a reference to the field name, and the write operation can be abbreviated
to an assignment of the form <field name> = expr.
So the above can be written more simply as:
ledger F: Uint<16>;
export circuit putF(x: Uint<16>): [] {
F = disclose(x);
}
export circuit getF(): Uint<16> {
return F;
}
The read operation of the Counter type can be abbreviated in the same way,
and its increment and decrement operations can be abbreviated to assignments
of the form <field name> += expr and <field name> -= expr.
For example:
import CompactStandardLibrary;
ledger F: Counter;
export circuit incrF(): [] {
F += 1;
}
export circuit decrF(): [] {
F -= 1;
}
export circuit getF(): Uint<64> {
return F;
}
The ledger-state types are designed in to minimize the extent to which a circuit
proof depends upon the exact value of a ledger-state field as of when a transaction
is created to minimize the chance of the transaction being rejected when the
proof is checked on chain.
For example, Counter increment and decrement do not commit to the current value
of the counter.
While the preceding example could be written a Cell instead:
ledger F: Uint<64>;
export circuit incrF(): [] {
F = F + 1 as Uint<64>;
}
export circuit decrF(): [] {
F = F - 1;
}
export circuit getF(): Uint<64> {
return F;
}
A call to incrF or decrF involves an explicit read of F, and if F does
not have the same value when the proof is checked as when the transaction is
created, the proof fails and the transaction is rejected.
A comprehensive description of ledger-state types and operations can be found in the Compact ledger data type documentation.
Nested state types in the Map type
The only ledger-state type in which values of other state types may be held is
Map.
The key values in a Map must be regular Compact types, but the mapped values
may be counters, sets, lists, other maps, and so on.
Here is a small example:
import CompactStandardLibrary;
ledger fld: Map<Boolean, Map<Field, Counter>>;
export circuit initNestedMap(b: Boolean): [] {
fld.insert(disclose(b), default<Map<Field, Counter>>);
}
export circuit initNestedCounter(b: Boolean, n: Field): [] {
fld.lookup(b).insert(disclose(n), default<Counter>);
}
export circuit incrementNestedCounter(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b).lookup(n).increment(disclose(k));
}
export circuit readNestedCounter1(b: Boolean, n: Field): Uint<64> {
return fld.lookup(b).lookup(n).read();
}
export circuit readNestedCounter2(b: Boolean, n: Field): Uint<64> {
return fld.lookup(b).lookup(n);
}
In this example,
fldis bound to aMapfromBooleanvalues toMaps fromFieldvalues toCountersinitNestedMapcan be used to create the innerMapfor a particular outer-MapkeyinitNestedCountercan be used to create aCounterfor a given outer-Mapkey and a given inner-MapkeyincrementNestedCountercan be used to increment an existingCounterfor a given outer-Mapkey and a given inner-Mapkey- either
readNestedCounter1orreadNestedCounter2can be used to read the value of an existingCounterfor a given outer-Mapkey and a given inner-Mapkey.
Notes:
-
Nesting is permitted only within
Mapvalues. That is, nesting is not permitted inMapkeys or within any ledger-state type other thanMap. -
Nested values must be initialized before first use. The syntax
default<T>is used to create default ledger-state type values, just as it can be used to create default Compact type values. -
Ledger state type values are not first-class objects, so when accessing a nested value, the entire indirection chain must be used. For example, the following results in a compiler error:
export circuit incrementNestedCounter(b: Boolean, n: Field, k: Uint<16>): [] {
fld.lookup(b); // ERROR: incomplete chain of indirects
} -
When the last lookup is a read of a base type one can omit the explicit
read()indirect, as illustrated by the definitions ofreadNestedCounter1andreadNestedCounter2above, which have the same behavior. -
For convenience, local variables can hold default values of ledger-state types, so the following definition of
initNestedMapis equivalent to the one above.export circuit initNestedMap(b: Boolean): [] {
const t = default<Map<Field, Counter>>;
fld.insert(disclose(b), t);
}
Sealed and unsealed ledger fields
Any ledger field can be marked sealed by prefixing the ledger field declaration
with the optional modifier sealed.
A sealed field cannot be set except during contract initialization. That is, its
value can be modified only by the contract constructor (if any),
either directly within the body of the constructor or via helper
circuits called by the constructor.
The sealed keyword must come after the export keyword (if present) and before
the ledger keyword, as in the following example:
sealed ledger field1: Uint<32>;
export sealed ledger field2: Uint<32>;
circuit init(x: Uint<32>): [] {
field2 = disclose(x);
}
constructor(x: Uint<16>) {
field1 = 2 * disclose(x);
init(x);
}
It is a static error if a sealed ledger field is updated by any code that is reachable from an exported circuit.
Contract constructor
A contract can be initialized via a contract constructor defined at the program's top level.
| constructor-definition | → | constructor pattern-parameter-list block |
| pattern-parameter-list | → | ( typed-pattern , ⋯ , typed-pattern ,opt ) |
| typed-pattern | → | pattern : type |
The constructor, if any, is typically used to initialize public state and can also be used to initialize private state through witness calls. At most one contract constructor can be defined for a contract, and it must appear only at the program top level, i.e., it cannot be defined in a module. To initialize ledger fields that are visible only within a module, the constructor can call a circuit that is exported from the module. For example:
module PublicState {
enum STATE { unset, set }
ledger state: STATE;
ledger value: Field;
export circuit init(v: Field): [] {
value = disclose(v);
state = STATE.set;
}
}
import PublicState;
constructor(v: Field) {
init(v);
}
Each constructor parameter must be explicitly typed. The type of each variable binding arising from the binding of identifiers in each parameter pattern to the corresponding pieces of the input is the type of the corresonding part of the declared type's structure.
The return type of the constructor is always [].
Any attempt to return another type of value using return expression; where the
type of expression is something other than [], is a static error.
Circuit definitions
The basic operational element in Compact is the circuit.
This corresponds closely to a function in most languages but is designed
to be compilable into a zero-knowledge circuit.
The key limitation of circuits relative to functions in most languages is
that circuits cannot be recursive, either directly or indirectly.
Compact supports two kinds of circuits: named circuits and anonymous circuits. Named circuits are described here, and anonymous circuits are described in Circuit and witness calls.
Named circuit definitions have the following syntax:
| circuit-definition | → | exportopt pureopt circuit function-name gparamsopt pattern-parameter-list : type block |
| pattern-parameter-list | → | ( typed-pattern , ⋯ , typed-pattern ,opt ) |
| typed-pattern | → | pattern : type |
A circuit definition binds function-name to a circuit with the given parameters,
type, and body.
The optional export modifier indicates that the circuit binding should
be exported from the enclosing module or the program itself, if the circuit
is defined outside of any module.
The optional pure modifier indicates that the
circuit is pure.
If any generic parameters are present (gparams is present and is nonempty),
the circuit is [generic][#generic-parameters-and-arguments] and must be specialized
(provided with generic arguments) at the point of call.
Circuits can take zero or more parameters.
The parameters are all patterns containing identifiers
to be bound to selected pieces of the argument values.
In the simplest case, a pattern is just an identifier and is bound to the argument
value as a whole.
The bindings established by the parameters are visible within (and only within)
the block that constitutes the body of the circuit.
Each parameter must be explicitly typed, and at the point of every call to the
circuit, the type of the corresponding argument expression must be a subtype of
that type.
The type of each variable binding arising from the binding of identifiers in
each parameter pattern to the corresponding pieces of the input is the type of
the corresonding part of the declared type's structure.
For example, in the body of sumStruct below:
struct S { x: Uint<16>, y: Uint<32> }
circuit sumStruct({x, y}: S): Field {
return x + y;
}
The variable binding established for x by the pattern {x, y} has type
Uint<16>, and the variable binding for y has type Uint<32>.
Every named circuit's return type must be explicitly declared, and it is a static error if the circuit can return a value that is not a subtype of that type.
The body is evaluated each time the circuit it is called.
Pure and impure circuits
A Compact circuit is considered pure if it computes its outputs from its inputs without reference to or modification of public state (via the ledger) or private state (via witnesses). In practice, the compiler considers a circuit to be impure if the body of the circuit contains a ledger operation, a call to any impure circuit, or a call to a witness.
Some external circuits defined in CompactStandardLibrary are witnesses; calls
to these make the caller impure. The remainder are considered pure, so calls to
those do not make the caller impure.
A Compact program can declare a circuit to be pure by prefixing the circuit
definition with the pure modifier, which must follow the export modifier,
if present, e.g.:
pure circuit c(a: Field): Field { ⋯ }
export pure circuit c(a: Field): Field { ⋯ }
The only effect of the pure modifier is that the compiler flags the
declaration as an error if its own analysis determines that the circuit is
actually impure. The pure modifier allows an application to ensure that the
circuit is present in the PureCircuits type declaration and via the
pureCircuits constant in the TypeScript module produced for a (correct)
Compact program by the Compact compiler.
Blocks
A block is a group of statements enclosed in braces:
| block | → | { stmt ⋯ stmt } |
A block can be used in place of a single statement anywhere a single statement
is allowed, which is useful for allowing multiple statements to be evaluated by the
then and else parts of an if statement or the body of a for.
The body of every circuit definition and the body of the constructor, if any, is always a block. The right-hand side of the arrow in an anonymous circuit can be either a block or an expression.
The statements within a block occupy a nested scope: const bindings created by
const statements within the block are not visible outside the block, and they can
shadow identifier bindings with the same names that exist outside the block.
A block is well-typed if the statements within it are well-typed.
A block is evaluated by evaluating the statements in sequence.
Statements
The syntax of Compact statements is summarized by the following grammar snippet:
| stmt | → | if ( expr-seq ) stmt |
| | | stmt0 | |
| stmt0 | → | expr-seq ; |
| | | const cbinding , ⋯¹ , cbinding ; | |
| | | if ( expr-seq ) stmt0 else stmt | |
| | | for ( const id of nat .. nat ) stmt | |
| | | for ( const id of expr-seq ) stmt | |
| | | return expr-seq ; | |
| | | return ; | |
| | | block |
The snippet shows that a statement (stmt) is either a one-armed if expression
or some other kinf of statement (stmt0).
This structure is used to enforce the restriction that the then part of a
two-armed cannot be a one-armed if expression.
This is often left ambiguous in the grammar, with a separate note to say
that the ambiguity is resolved by assocating an else part with the closest
enclosing if expression, but we prefer to make the constraint explicit in
the grammar and avoid any ambiguity.
The first kind of stmt0 is expr-seq.
An expr-seq, or expression sequence, is a comma separated sequence of one
or more expressions to be evaluated in sequence.
This simply means that any expression or comma-separated sequence also qualifies
as a statement.
Expression sequences are described more fully in
(their own section)[expression-sequences].
The remaining subsections of this section describe each of the other kinds of statement.
Statements either do not have a value or (in the case of expression sequences
serving as statements) their value is ignored.
Thus it is not necessary to talk about the type of a statement.
Nevertheless, each statement has typing rules that must be followed, such as that
the type of the test expression of an if expression must be Boolean.
const statements
const statements create local variable bindings.
Every const statement takes the following form:
const cbindings
where cbindings is a comma-separated sequence of one or more cbinding subforms
in the syntax specified by the following grammar snippet:
| cbinding | → | optionally-typed-pattern = expr |
| optionally-typed-pattern | → | pattern |
| | | typed-pattern | |
| pattern | → | id |
| | | [ patternopt , ⋯ , patternopt ,opt ] | |
| | | { pattern-struct-elt , ⋯ , pattern-struct-elt ,opt } | |
| pattern-struct-elt | → | id |
| | | id : pattern | |
| typed-pattern | → | pattern : type |
A const statement is typed by typing each of its cbinding subforms.
A cbinding subform is typed by typing the expression on the right-hand side
of the =.
If a type T is included on the left-hand side, the type of the expression
must be a subtype of T, otherwise it is a static error.
It is also a static error if the pattern implies a different structure from
the type of the expression.
For example, it is a static error if the pattern implies that the expression's
value is a tuple when the type is actually, say, Field.
The type of each variable binding arising from the binding of identifiers in each
pattern to the corresponding pieces of the input is the type of the corresonding
part of the structure of the declared type, if present, otherwise of the inferred
type.
For example, in the following code, the binding for x has type Boolean, and
the binding for y has type [Uint<64>, Uint<64>]:
witness w(): [Boolean, [Uint<16>, Uint<32>]];
circuit foo(): [Uint<64>, Uint<64>] {
const [x, y]: [Boolean, [Uint<64>, Uint<64>]] = w();
return x ? y : [0, 0];
}
while in the following, x still has type Boolean but y has (the inferred)
type [Uint<16>, Uint<64>].
witness w(): [Boolean, [Uint<16>, Uint<32>]];
circuit foo(): [Uint<64>, Uint<64>] {
const [x, y] = w();
return x ? y : [0, 0];
}
Except where shadowed, the scope of each variable bound by a const statement
is the entirety of the innermost block containing the const statement.
It must not be referenced, however, before it is given a binding.
Any attempt to do so is a static error.
For example, the reference to x on the first line of the body in the
definition of foo below is a static error:
circuit foo(a: Uint<16>): Field {
const y = x + a;
const x = 7;
return y;
}
Similarly, the reference to x in the first cbinding of the const
statement below is also a static error.
const y = x, x = 7;
}
A const statement is evaluated by evaluating the cbinding subforms in order
so that the variables given values by each cbinding is available to the
cbindings that follow.
The evaluation of each cbinding involves evaluating the expression
on the right-hand side of the = and then giving values to identifiers
in the pattern p on the left-hand side to the value v of the
expression as described earlier in
(Patterns and destructuring)[patterns-and-destructuring].
Any variable bound by const may not be reused within a block, although a
const binding in a nested block might shadow it.
Variables are immutable, although the same variable might take on different values
at different times if is contined within a block of code that is evaluated more
than once, such as would be the case for the body of a circuit that is called
more than once.
if statements
An if statement is used to determine the flow of control through the
statements of a circuit or constructor body.
A one-armed if expression has a test part (an expr-seq enclosed
in parentheses) and a then part (then-statement):
if (expr-seq) then-statement
A two-armed if expression has a test part (an expr-seq enclosed
in parentheses), a then part (then-statement), and an else part
(else-statement):
if (expr-seq)
then-statement
else
else-statement
The typing of an if expression requires only that the type of expr-seq must be
Boolean.
Evaluating an if expression involves first evaluating the expr-seq.
If it evaluates to true, then-statement is evaluated.
Otherwise, it must evaluate to false, in which case else-statement
(if present), is evaluated.
for statements
for statements are used to iterate over a sequence of values.
In Compact, in constrast to most languages, the number of iterations can
always be determined at compile time.
That is, the number of iterations is bounded either by constant numeric
bounds or by size of an object of constant size.
This restriction is motivated by the need for the compiler to generate
finite proving circuits.
Compact supports two kinds of for statements.
The first iterates over vectors, tuples, and byte-vectors and takes the
following form:
for (const x of expression) statement
This kind of for statement is typed by typing expression and verifying that
it is a Vector type, a tuple type that (has a vector type)[Subtyping and least
upper bounds], or a Bytes type.
Evaluating this kind of for requires evaluating expression then evaluating
statement once for each element with x bound to the value of each element
in turn.
The second form iterates over a range of unsigned integer values and takes the following form:
for (const i of start..end) statement
In this form, each of start and end must be a literal unsigned integer
or a generic natural-number parameter, and end must be greater than or
equal to start.
Otherwise, it is a static error.
This form is always well-typed.
Evaluating this kind of for requires evaluating statement with i bound
to k for each k in the range start (inclusive) to end (exclusive).
return statements cannot be used to return from within for statements.
It is therefore a static error for statement to be a return statement
or for a return statement to appear within statement (except where it
appears nested within an anoymous circuit).
Iteration can also be accomplished via
(map and fold expressions)[map-and-fold-expressions].
return statements
A return statement can be used to exit from the closest enclosing
anonymous circuit circuit, if any, or otherwise from
the enclosing constructor or named circuit, and to return to the caller either
an explicit return value, i.e., the value of (expr-seq)[expression-sequences]
in this form of return:
return expr-seq;
or the default value [] in this form of return:
return;
A circuit or constructor body can also exit without an explicit return statement:
any path through the body that does not end in an explicit return statement is
treated as if ended with return;.
A return statement is always well-typed if it exits from an anonymous circuit
without a declared return type.
Otherwise, a return statement is well-typed if the type of expr-seq, or
[] if no expr-seq is present, is a subtype the expected return type.
If the return form exits from a named circuit or an anonymous circuit
with a declared return type, the expected type is the declared return type,
while if it exits from the constructor, the expected return type is [].
An implication of these rules is that it is a static error to exit without an
explicit return value from a circuit with a declared return type other than [].
When a return statement is evaluated, the expr-seq, if present, is evaluated,
the circuit or constructor exits immediately without evaluating any
subsequent statements, and it returns to the caller the value of expr-seq or
[] if no expr-seq is present.
Expressions
The syntax of Compact expressions is summarized by the following grammar snippet:
| expr-seq | → | expr |
| | | expr , ⋯¹ , expr , expr | |
| expr | → | expr0 ? expr : expr |
| | | expr0 = expr | |
| | | expr0 += expr | |
| | | expr0 -= expr | |
| | | expr0 | |
| expr0 | → | expr0 || expr1 |
| | | expr1 | |
| expr1 | → | expr1 && expr2 |
| | | expr2 | |
| expr2 | → | expr2 == expr3 |
| | | expr2 != expr3 | |
| | | expr3 | |
| expr3 | → | expr4 < expr4 |
| | | expr4 <= expr4 | |
| | | expr4 >= expr4 | |
| | | expr4 |