Zig is a general-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
Robust
Behavior is correct even for edge cases such as out of memory.
Optimal
Write programs the best way they can behave and perform.
Reusable
The same code works in many environments which have different constraints.
Maintainable
Precisely communicate intent to the compiler and other programmers. The language imposes a low overhead to reading code and is resilient to changing requirements and environments.
Often the most efficient way to learn something new is to see examples, so this documentation shows how to use each of Zig's features. It is all on one page so you can search with your browser's search tool.
The code samples in this document are compiled and tested as part of the main test suite of Zig.
This HTML document depends on no external files, so you can use it offline.
Zig's Standard Library contains commonly used algorithms, data structures, and definitions to help you build programs or libraries. You will see many examples of Zig's Standard Library used in this documentation. To learn more about the Zig Standard Library, visit the link above.
Hello World
The Zig code sample above demonstrates one way to create a program that will output: Hello, world!.
The code sample shows the contents of a file named hello.zig. Files storing Zig source code are UTF-8 encoded text files, usually named with the .zig extension.
Following the hello.zig Zig code sample, the Zig Build System is used to build an executable program from the hello.zig source code. Then, the hello program is executed showing its output Hello, world!. The lines beginning with $ represent command line prompts and a command. Everything else is program output.
The code sample begins by adding the Zig Standard Library to the build using the @import builtin function. The @import("std") function call creates a structure that represents the Zig Standard Library. The code then declares a constant identifier, named std, that gives access to the features of the Zig Standard Library.
Next, a public function, pub fn, named main is declared. The main function is necessary because it tells the Zig compiler where the program starts. Programs designed to be executed need a pub fn main function.
A function is a block of any number of statements and expressions that, as a whole, perform a task. Functions may or may not return data after they are done performing their task. If a function cannot perform its task, it might return an error. Zig makes all of this explicit.
In the hello.zig code sample, the main function is declared with the !void return type. This return type is known as an Error Union Type. This syntax tells the Zig compiler that the function will either return an error or a value. An error union type combines an Error Set Type and any other data type (e.g. a Primitive Type or a user-defined type such as a struct, enum, or union). The full form of an error union type is <error set type>!<any data type>. In the code sample, the error set type is not explicitly written on the left side of the ! operator. When written this way, the error set type is an inferred error set type. The void after the ! operator tells the compiler that the function will not return a value under normal circumstances (i.e. when no errors occur).
In Zig, a function's block of statements and expressions is surrounded by an open curly-brace { and a close curly-brace }. Inside the main function are expressions that perform the task of outputting Hello, world! to standard output.
First, a constant identifier, stdout, is initialized to represent standard output's writer. Then, the program tries to print the Hello, world! message to standard output.
Functions sometimes need information to perform their task. In Zig, information is passed to functions between an open parenthesis ( and a close parenthesis ) placed after the function's name. This information is also known as arguments. When there are multiple arguments passed to a function, they are separated by commas ,.
The two arguments passed to the stdout.print() function, "Hello, {s}!\n" and .{"world"}, are evaluated at compile-time. The code sample is purposely written to show how to perform string substitution in the print function. The curly-braces inside of the first argument are substituted with the compile-time known value inside of the second argument (known as an anonymous struct literal). The \n inside of the double-quotes of the first argument is the escape sequence for the newline character. The try expression evaluates the result of stdout.print. If the result is an error, then the try expression will return from main with the error. Otherwise, the program will continue. In this case, there are no more statements or expressions left to execute in the main function, so the program exits.
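The substitution and try behavior described above can also be observed without writing to standard output. The following is a minimal sketch, assuming std.fmt.bufPrint from the Zig Standard Library; greet is a hypothetical helper and is not part of hello.zig.

```zig
const std = @import("std");

// Hypothetical helper: formats a greeting into `buf` using the same
// "{s}" substitution that hello.zig passes to stdout.print.
fn greet(buf: []u8, name: []const u8) ![]u8 {
    // bufPrint returns error.NoSpaceLeft when `buf` is too small;
    // `try` returns that error from greet instead of continuing.
    const text = try std.fmt.bufPrint(buf, "Hello, {s}!\n", .{name});
    return text;
}

test "format a greeting" {
    var buf: [32]u8 = undefined;
    const text = try greet(&buf, "world");
    try std.testing.expectEqualStrings("Hello, world!\n", text);
}
```

As in main, the ! in the return type is what allows try to forward the error to the caller.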
In Zig, the standard output writer's print function is allowed to fail because it is actually a function defined as part of a generic Writer. Consider a generic Writer that represents writing data to a file. When the disk is full, a write to the file will fail. However, we typically do not expect writing text to the standard output to fail. To avoid having to handle the failure case of printing to standard output, you can use alternate functions: the functions in std.log for proper logging or the std.debug.print function. This documentation will use the latter option to print to standard error (stderr) and silently return on failure. The next code sample, hello_again.zig demonstrates the use of std.debug.print.
Note that you can leave off the ! from the return type because std.debug.print cannot fail.
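A version of hello_again.zig along these lines might look as follows. This is a sketch of the described sample, not necessarily the original file.

```zig
const std = @import("std");

// std.debug.print writes to standard error and cannot fail,
// so main can return plain void instead of !void.
pub fn main() void {
    std.debug.print("Hello, world!\n", .{});
}
```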
There are no multiline comments in Zig (e.g. like /* */ comments in C). This helps allow Zig to have the property that each line of code can be tokenized out of context.
Doc comments
A doc comment is one that begins with exactly three slashes (i.e. /// but not ////); multiple doc comments in a row are merged together to form a multiline doc comment. The doc comment documents whatever immediately follows it.
Doc comments are only allowed in certain places; eventually, it will become a compile error to have a doc comment in an unexpected place, such as in the middle of an expression, or just before a non-doc comment.
Top-Level Doc Comments
User documentation that doesn't belong to whatever immediately follows it, like package-level documentation, goes in top-level doc comments. A top-level doc comment is one that begins with two slashes and an exclamation point: //!.
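The three comment forms can be sketched in one small file. Timestamp and its members are hypothetical names chosen for illustration; the top-level doc comment is placed at the very start of the file.

```zig
//! Top-level doc comment: documents this file as a whole,
//! not the declaration that happens to follow it.

const std = @import("std");

/// A doc comment documents the declaration that immediately follows it.
/// Consecutive doc comment lines are merged into one multiline doc comment.
const Timestamp = struct {
    /// Seconds since the epoch. Fields can carry doc comments too.
    seconds: i64,

    // A normal comment: it runs to the end of the line and is not documentation.
    pub fn zero() Timestamp {
        return Timestamp{ .seconds = 0 };
    }
};
```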
Values
Primitive Types
| Type | C Equivalent | Description |
| --- | --- | --- |
| i8 | int8_t | signed 8-bit integer |
| u8 | uint8_t | unsigned 8-bit integer |
| i16 | int16_t | signed 16-bit integer |
| u16 | uint16_t | unsigned 16-bit integer |
| i32 | int32_t | signed 32-bit integer |
| u32 | uint32_t | unsigned 32-bit integer |
| i64 | int64_t | signed 64-bit integer |
| u64 | uint64_t | unsigned 64-bit integer |
| i128 | __int128 | signed 128-bit integer |
| u128 | unsigned __int128 | unsigned 128-bit integer |
| isize | intptr_t | signed pointer sized integer |
| usize | uintptr_t | unsigned pointer sized integer |
| c_short | short | for ABI compatibility with C |
| c_ushort | unsigned short | for ABI compatibility with C |
| c_int | int | for ABI compatibility with C |
| c_uint | unsigned int | for ABI compatibility with C |
| c_long | long | for ABI compatibility with C |
| c_ulong | unsigned long | for ABI compatibility with C |
| c_longlong | long long | for ABI compatibility with C |
| c_ulonglong | unsigned long long | for ABI compatibility with C |
| c_longdouble | long double | for ABI compatibility with C |
| f16 | _Float16 | 16-bit floating point (10-bit mantissa) IEEE-754-2008 binary16 |
| f32 | float | 32-bit floating point (23-bit mantissa) IEEE-754-2008 binary32 |
| f64 | double | 64-bit floating point (52-bit mantissa) IEEE-754-2008 binary64 |
| f128 | _Float128 | 128-bit floating point (112-bit mantissa) IEEE-754-2008 binary128 |
| bool | bool | true or false |
| anyopaque | void | Used for type-erased pointers. |
| void | (none) | 0 bit type |
| noreturn | (none) | the type of break, continue, return, unreachable, and while (true) {} |
| type | (none) | the type of types |
| anyerror | (none) | an error code |
| comptime_int | (none) | Only allowed for comptime-known values. The type of integer literals. |
| comptime_float | (none) | Only allowed for comptime-known values. The type of float literals. |
In addition to the integer types above, arbitrary bit-width integers can be referenced by using an identifier of i or u followed by digits. For example, the identifier i7 refers to a signed 7-bit integer. The maximum allowed bit-width of an integer type is 65535.
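As a small illustration, the following sketch checks the value ranges of two such types using std.math.minInt and std.math.maxInt. The constant names are chosen for this example only.

```zig
const std = @import("std");

// i7 and u3 behave like any other integer type, just with unusual widths.
const lowest: i7 = std.math.minInt(i7); // -64
const highest: i7 = std.math.maxInt(i7); // 63
const top_u3: u3 = std.math.maxInt(u3); // 7

test "arbitrary bit-width integers" {
    try std.testing.expect(lowest == -64);
    try std.testing.expect(highest == 63);
    try std.testing.expect(top_u3 == 7);
    try std.testing.expect(@bitSizeOf(i7) == 7);
}
```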
String literals are constant single-item Pointers to null-terminated byte arrays. The type of string literals encodes both the length, and the fact that they are null-terminated, and thus they can be coerced to both Slices and Null-Terminated Pointers. Dereferencing string literals converts them to Arrays.
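The coercions described above can be sketched as a test. The variable names are illustrative.

```zig
const std = @import("std");
const expect = std.testing.expect;

const bytes = "hello";

test "string literal coercions" {
    // The literal is a pointer to a null-terminated array; the length is part of the type.
    try expect(@TypeOf(bytes) == *const [5:0]u8);
    try expect(bytes.len == 5);
    try expect(bytes[5] == 0); // the null terminator sits at index len

    const slice: []const u8 = bytes; // coerces to a slice
    const terminated: [*:0]const u8 = bytes; // coerces to a null-terminated pointer
    try expect(slice.len == 5);
    try expect(terminated[0] == 'h');

    const array = bytes.*; // dereferencing yields an array value
    try expect(array.len == 5);
}
```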
The encoding of a string in Zig is de-facto assumed to be UTF-8. Because Zig source code is UTF-8 encoded, any non-ASCII bytes appearing within a string literal in source code carry their UTF-8 meaning into the content of the string in the Zig program; the bytes are not modified by the compiler. However, it is possible to embed non-UTF-8 bytes into a string literal using \xNN notation.
Unicode code point literals have type comptime_int, the same as Integer Literals. All Escape Sequences are valid in both string literals and Unicode code point literals.
In many other programming languages, a Unicode code point literal is called a "character literal". However, there is no precise technical definition of a "character" in recent versions of the Unicode specification (as of Unicode 13.0). In Zig, a Unicode code point literal corresponds to the Unicode definition of a code point.
Multiline string literals have no escapes and can span across multiple lines. To start a multiline string literal, use the \\ token. Just like a comment, the string literal goes until the end of the line. The end of the line is not included in the string literal. However, if the next line begins with \\ then a newline is appended and the string literal continues.
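A short sketch of the rules above: two \\ lines are joined by a single newline, and a backslash sequence inside the literal is kept verbatim. The name text is chosen for this example.

```zig
const std = @import("std");

// No escapes are processed: the backslash-n on the second line stays as two characters.
const text =
    \\first line
    \\second line with \n left as-is
;

test "multiline string literal" {
    try std.testing.expectEqualStrings(
        "first line\nsecond line with \\n left as-is",
        text,
    );
}
```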
undefined can be coerced to any type. Once this happens, it is no longer possible to detect that the value is undefined. undefined means the value could be anything, even something that is nonsense according to the type. Translated into English, undefined means "Not a meaningful value. Using this value would be a bug. The value will be unused, or overwritten before being used."
In Debug mode, Zig writes 0xaa bytes to undefined memory. This is to catch bugs early, and to help detect use of undefined memory in a debugger.
Zig Test
Code written within one or more test declarations can be used to ensure behavior meets expectations:
The introducing_zig_test.zig code sample tests the function addOne to ensure that it returns 42 given the input 41. From this test's perspective, the addOne function is said to be code under test.
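A file matching this description might look as follows. This is a sketch reconstructed from the description and the test name that appears in the runner output, not necessarily the original file.

```zig
const std = @import("std");

// Code under test.
fn addOne(number: i32) i32 {
    return number + 1;
}

test "expect addOne adds one to 41" {
    // expect returns an error when the condition is false;
    // `try` makes the test fail with that error.
    try std.testing.expect(addOne(41) == 42);
}
```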
zig test is a tool that creates and runs a test build. By default, it builds and runs an executable program using the default test runner provided by the Zig Standard Library as its main entry point. During the build, test declarations found while resolving the given Zig source file are included for the default test runner to run and report on.
The shell output shown above displays two lines after the zig test command. These lines are printed to standard error by the default test runner:
Test [1/1] test "expect addOne adds one to 41"...
Lines like this indicate which test, out of the total number of tests, is being run. In this case, [1/1] indicates that the first test, out of a total of one test, is being run. Note that, when the test runner program's standard error is output to the terminal, these lines are cleared when a test succeeds.
All 1 tests passed.
This line indicates the total number of tests that have passed.
Test Declarations
Test declarations contain the keyword test, followed by an optional name written as a string literal, followed by a block containing any valid Zig code that is allowed in a function.
Test declarations are similar to Functions: they have a return type and a block of code. The implicit return type of test is the Error Union Type anyerror!void, and it cannot be changed. When a Zig source file is not built using the zig test tool, the test declarations are omitted from the build.
Test declarations can be written in the same file, where code under test is written, or in a separate Zig source file. Since test declarations are top-level declarations, they are order-independent and can be written before or after the code under test.
When the zig test tool is building a test runner, only resolved test declarations are included in the build. Initially, only the given Zig source file's top-level declarations are resolved. Unless nested containers are referenced from a top-level test declaration, nested container tests will not be resolved.
The code sample below uses the std.testing.refAllDecls(@This()) function call to reference all of the containers that are in the file including the imported Zig source file. The code sample also shows an alternative way to reference containers using the _ = C; syntax. This syntax tells the compiler to ignore the result of the expression on the right side of the assignment operator.
Test Failure
The default test runner checks for an error returned from a test. When a test returns an error, the test is considered a failure and its error return trace is output to standard error. The total number of failures will be reported after all tests have run.
Skip Tests
One way to skip tests is to filter them out by using the zig test command line parameter --test-filter [text]. This makes the test build only include tests whose name contains the supplied filter text. Note that non-named tests are run even when using the --test-filter [text] command line parameter.
To programmatically skip a test, make a test return the error error.SkipZigTest and the default test runner will consider the test as being skipped. The total number of skipped tests will be reported after all tests have run.
The default test runner skips tests containing a suspend point while the test is running using the default, blocking IO mode. (The evented IO mode is enabled using the --test-evented-io command line parameter.)
In the code sample above, the test would not be skipped in blocking IO mode if the nosuspend keyword was used (see Async and Await).
Report Memory Leaks
When code allocates Memory using the Zig Standard Library's testing allocator, std.testing.allocator, the default test runner will report any leaks that are found from using the testing allocator:
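The sketch below shows the passing case: memory obtained from std.testing.allocator is released with defer, so no leak is reported. makeZeroed is a hypothetical helper, and the std.mem.Allocator parameter type assumes a standard library version in which allocators are passed by value.

```zig
const std = @import("std");

// Hypothetical helper: returns `n` zeroed bytes that the caller must free.
fn makeZeroed(allocator: std.mem.Allocator, n: usize) ![]u8 {
    const buf = try allocator.alloc(u8, n);
    for (buf) |*byte| byte.* = 0;
    return buf;
}

test "no leak when memory is freed" {
    const allocator = std.testing.allocator;
    const buf = try makeZeroed(allocator, 16);
    // Without this line the default test runner would report a leak.
    defer allocator.free(buf);
    try std.testing.expect(buf.len == 16);
}
```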
Use the compile variable @import("builtin").is_test to detect a test build:
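A minimal sketch; isATest is a name chosen for this example.

```zig
const std = @import("std");
const builtin = @import("builtin");

fn isATest() bool {
    return builtin.is_test;
}

test "builtin.is_test" {
    // Under `zig test` this is true; in a normal build it would be false.
    try std.testing.expect(isATest());
}
```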
Test Output and Logging
The default test runner and the Zig Standard Library's testing namespace output messages to standard error.
The Testing Namespace
The Zig Standard Library's testing namespace contains useful functions to help you create tests. In addition to the expect function, this document uses a couple more functions, as exemplified here:
The Zig Standard Library also contains functions to compare Slices, strings, and more. See the rest of the std.testing namespace in the Zig Standard Library for more available functions.
Test Tool Documentation
zig test has a few command line parameters which affect the compilation. See zig test --help for a full list.
Variables are never allowed to shadow identifiers from an outer scope.
It is generally preferable to use const rather than var when declaring a variable. This causes less work for both humans and computers to do when reading code, and creates more optimization opportunities.
Container Level Variables
Container level variables have static lifetime and are order-independent and lazily analyzed. The initialization value of container level variables is implicitly comptime. If a container level variable is const then its value is comptime-known, otherwise it is runtime-known.
Container level variables may be declared inside a struct, union, or enum:
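A sketch of both properties: order independence at file scope, and a variable namespaced inside a struct. Counter is a hypothetical name.

```zig
const std = @import("std");

// Declarations are order-independent: y may refer to x before x is declared.
var y: i32 = add(10, x);
const x: i32 = add(12, 34);

fn add(a: i32, b: i32) i32 {
    return a + b;
}

const Counter = struct {
    // A container level variable inside a struct acts like a namespaced global.
    var count: i32 = 0;
};

test "container level variables" {
    try std.testing.expect(x == 46);
    try std.testing.expect(y == 56);
    Counter.count += 1;
    try std.testing.expect(Counter.count == 1);
}
```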
Static Local Variables
It is also possible to have local variables with static lifetime by using containers inside functions.
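For example, a struct declared inside a function can hold state that survives between calls. nextId is a hypothetical function used only for this sketch.

```zig
const std = @import("std");

fn nextId() i32 {
    // S.x has static lifetime even though S is declared inside the function.
    const S = struct {
        var x: i32 = 0;
    };
    S.x += 1;
    return S.x;
}

test "static local variable" {
    const first = nextId();
    const second = nextId();
    try std.testing.expect(second == first + 1);
}
```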
The extern keyword or @extern builtin function can be used to link against a variable that is exported from another object. The export keyword or @export builtin function can be used to make a variable available to other objects at link time. In both cases, the type of the variable must be C ABI compatible.
When a local variable is const, it means that after initialization, the variable's value will not change. If the initialization value of a const variable is comptime-known, then the variable is also comptime-known.
A local variable may be qualified with the comptime keyword. This causes the variable's value to be comptime-known, and all loads and stores of the variable to happen during semantic analysis of the program, rather than at runtime. All variables declared in a comptime expression are implicitly comptime variables.
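As a sketch, the function below accumulates a sum in comptime variables, so the result is known during compilation and can be used as an array length. sumTo is a hypothetical name.

```zig
const std = @import("std");

fn sumTo(comptime n: u32) u32 {
    // Loads and stores of these variables happen during semantic analysis.
    comptime var total: u32 = 0;
    comptime var i: u32 = 1;
    inline while (i <= n) : (i += 1) {
        total += i;
    }
    return total;
}

test "comptime variables" {
    try std.testing.expect(sumTo(4) == 10);
    // An array length must be comptime-known, so this only compiles
    // because the call can be evaluated at compile time.
    const Buffer = [sumTo(3)]u8;
    try std.testing.expect(@sizeOf(Buffer) == 6);
}
```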
Integers
Integer Literals
Runtime Integer Values
Integer literals have no size limitation, and if any undefined behavior occurs, the compiler catches it.
However, once an integer value is no longer known at compile-time, it must have a known size, and is vulnerable to undefined behavior.
In this function, values a and b are known only at runtime, and thus this division operation is vulnerable to both Integer Overflow and Division by Zero.
Operators such as + and - cause undefined behavior on integer overflow. Alternative operators are provided for wrapping and saturating arithmetic on all targets. +% and -% perform wrapping arithmetic while +| and -| perform saturating arithmetic.
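The difference is easiest to see at the edge of a type's range. The helper names below are chosen for this sketch.

```zig
const std = @import("std");

// Wrapping addition: 255 +% 1 wraps around to 0 for u8.
fn wrapInc(x: u8) u8 {
    return x +% 1;
}

// Saturating addition: 255 +| 1 stays at the maximum value of u8.
fn satInc(x: u8) u8 {
    return x +| 1;
}

test "wrapping and saturating arithmetic" {
    try std.testing.expect(wrapInc(255) == 0);
    try std.testing.expect(satInc(255) == 255);
    // Away from the boundary, both behave like ordinary addition.
    try std.testing.expect(wrapInc(1) == 2);
    try std.testing.expect(satInc(1) == 2);
}
```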
Zig supports arbitrary bit-width integers, referenced by using an identifier of i or u followed by digits. For example, the identifier i7 refers to a signed 7-bit integer. The maximum allowed bit-width of an integer type is 65535.
c_longdouble - matches long double for the target C ABI
Float Literals
Float literals have type comptime_float which is guaranteed to have the same precision and operations of the largest other floating point type, which is f128.
Float literals coerce to any floating point type, and to any integer type when there is no fractional component.
There is no syntax for NaN, infinity, or negative infinity. For these special values, one must use the standard library:
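A sketch using std.math.inf and std.math.nan, with std.math.isInf and std.math.isNan to check the results:

```zig
const std = @import("std");

const inf = std.math.inf(f32);
const negative_inf = -std.math.inf(f64);
const nan = std.math.nan(f64);

test "special float values" {
    try std.testing.expect(std.math.isInf(inf));
    try std.testing.expect(negative_inf < 0);
    try std.testing.expect(std.math.isNan(nan));
    // NaN never compares equal, not even to itself.
    try std.testing.expect(nan != nan);
}
```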
Floating Point Operations
By default floating point operations use Strict mode, but you can switch to Optimized mode on a per-block basis:
For this test we have to separate code into two object files - otherwise the optimizer figures out all the values at compile-time, which operates in strict mode.
a catch b, a catch |err| b: If a is an error, returns b ("default value"), otherwise returns the unwrapped value of a. Note that b may be a value of type noreturn. err is the error and is in scope of the expression b.
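This behavior belongs to the catch operator. A minimal sketch of both forms, using std.fmt.parseInt as the fallible expression; parseOrZero and parseOrMax are hypothetical helpers.

```zig
const std = @import("std");

// Plain form: any error is replaced by the default value 0.
fn parseOrZero(text: []const u8) u32 {
    return std.fmt.parseInt(u32, text, 10) catch 0;
}

// Capture form: `err` is in scope of the expression after `catch`.
fn parseOrMax(text: []const u8) u32 {
    return std.fmt.parseInt(u32, text, 10) catch |err| switch (err) {
        error.Overflow => std.math.maxInt(u32),
        error.InvalidCharacter => 0,
    };
}

test "catch supplies a default value" {
    try std.testing.expect(parseOrZero("42") == 42);
    try std.testing.expect(parseOrZero("abc") == 0);
    try std.testing.expect(parseOrMax("99999999999") == std.math.maxInt(u32));
}
```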
A vector is a group of booleans, Integers, Floats, or Pointers which are operated on in parallel using SIMD instructions. Vector types are created with the builtin function @Type, or using the shorthand function std.meta.Vector.
Vectors support the same builtin operators as their underlying base types. These operations are performed element-wise, and return a vector of the same length as the input vectors. This includes:
It is prohibited to use a math operator on a mixture of scalars (individual numbers) and vectors. Zig provides the @splat builtin to easily convert from scalars to vectors, and it supports @reduce and array indexing syntax to convert from vectors to scalars. Vectors also support assignment to and from fixed-length arrays with comptime known length.
For rearranging elements within and between vectors, Zig provides the @shuffle and @select functions.
Operations on vectors shorter than the target machine's native SIMD size will typically compile to single SIMD instructions, while vectors longer than the target machine's native SIMD size will compile to multiple SIMD instructions. If a given operation doesn't have SIMD support on the target architecture, the compiler will default to operating on each vector element one at a time. Zig supports any comptime-known vector length up to 2^32-1, although small powers of two (2-64) are most typical. Note that excessively long vector lengths (e.g. 2^20) may result in compiler crashes on current versions of Zig.
TODO talk about C ABI interop TODO consider suggesting std.MultiArrayList
Loads and stores are assumed to not have side effects. If a given load or store should have side effects, such as Memory Mapped Input/Output (MMIO), use volatile. In the following code, loads and stores with mmio_ptr are guaranteed to all happen and in the same order as in source code:
Note that volatile is unrelated to concurrency and Atomics. If you see code that is using volatile for something other than Memory Mapped Input/Output, it is probably a bug.
To convert one pointer type to another, use @ptrCast. This is an unsafe operation that Zig cannot protect you against. Use @ptrCast only when other conversions are not possible.
Alignment
Each type has an alignment - a number of bytes such that, when a value of the type is loaded from or stored to memory, the memory address must be evenly divisible by this number. You can use @alignOf to find out this value for any type.
Alignment depends on the CPU architecture, but is always a power of two, and less than 1 << 29.
In Zig, a pointer type has an alignment value. If the value is equal to the alignment of the underlying type, it can be omitted from the type:
In the same way that a *i32 can be coerced to a *const i32, a pointer with a larger alignment can be implicitly cast to a pointer with a smaller alignment, but not vice versa.
You can specify alignment on variables and functions. If you do this, then pointers to them get the specified alignment:
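A sketch covering default alignment, explicit alignment on a variable, and coercion to a less-aligned pointer. The variables are incremented only so that they are genuinely mutable.

```zig
const std = @import("std");
const expect = std.testing.expect;

test "pointer alignment" {
    var x: i32 = 1234;
    x += 1;
    // The default alignment can be omitted from the pointer type.
    try expect(@TypeOf(&x) == *i32);
    try expect(*i32 == *align(@alignOf(i32)) i32);

    var foo: u8 align(4) = 100;
    foo += 1;
    // Pointers to an over-aligned variable carry the specified alignment.
    try expect(@TypeOf(&foo) == *align(4) u8);

    // A more-aligned pointer coerces to a less-aligned one.
    const less_aligned: *u8 = &foo;
    try expect(less_aligned.* == 101);
}
```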
If you have a pointer or a slice that has a small alignment, but you know that it actually has a bigger alignment, use @alignCast to change the pointer into a more aligned pointer. This is a no-op at runtime, but inserts a safety check:
allowzero
This pointer attribute allows a pointer to have address zero. This is only ever needed on the freestanding OS target, where the address zero is mappable. If you want to represent null pointers, use Optional Pointers instead. Optional Pointers with allowzero are not the same size as pointers. In this code example, if the pointer did not have the allowzero attribute, this would be a Pointer Cast Invalid Null panic:
Sentinel-Terminated Pointers
The syntax [*:x]T describes a pointer that has a length determined by a sentinel value. This provides protection against buffer overflow and overreads.
The syntax [:x]T is a slice which has a runtime known length and also guarantees a sentinel value at the element indexed by the length. The type does not guarantee that there are no sentinel elements before that. Sentinel-terminated slices allow element access to the len index.
Sentinel-terminated slices can also be created using a variation of the slice syntax data[start..end :x], where data is a many-item pointer, array or slice and x is the sentinel value.
Sentinel-terminated slicing asserts that the element in the sentinel position of the backing data is actually the sentinel value. If this is not the case, safety-protected Undefined Behavior results.
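The following sketch shows element access at the len index and the data[start..end :x] slicing form. firstN is a hypothetical helper; calling it with a position that does not hold the sentinel would trip the safety check described above.

```zig
const std = @import("std");
const expect = std.testing.expect;

// Returns the first `n` items as a 0-terminated slice.
// Safety-checked: data[n] must already be 0.
fn firstN(data: []u8, n: usize) [:0]u8 {
    return data[0..n :0];
}

test "sentinel-terminated slices" {
    const text: [:0]const u8 = "hello";
    try expect(text.len == 5);
    try expect(text[text.len] == 0); // access at the len index is allowed

    var array = [_]u8{ 3, 2, 1, 0, 3, 2, 1, 0 };
    const slice = firstN(&array, 3);
    try expect(slice.len == 3);
    try expect(slice[3] == 0);
}
```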
Each struct field may have an expression indicating the default field value. Such expressions are executed at comptime, and allow the field to be omitted in a struct literal expression:
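For instance, in this sketch the field a has a default value and may be left out of the literal, while b must always be supplied. Foo is an illustrative name.

```zig
const std = @import("std");

const Foo = struct {
    a: i32 = 1234, // default value, evaluated at comptime
    b: i32,
};

test "default struct field values" {
    const x = Foo{ .b = 5 };
    try std.testing.expect(x.a == 1234);
    try std.testing.expect(x.b == 5);
}
```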
extern struct
An extern struct has in-memory layout guaranteed to match the C ABI for the target.
This kind of struct should only be used for compatibility with the C ABI. Every other use case should be solved with packed struct or normal struct.
Unlike normal structs, packed structs have guaranteed in-memory layout:
Fields remain in the order declared.
There is no padding between fields.
Zig supports arbitrary width Integers and although normally, integers with fewer than 8 bits will still use 1 byte of memory, in packed structs, they use exactly their bit width.
bool fields use exactly 1 bit.
An enum field uses exactly the bit width of its integer tag type.
A packed union field uses exactly the bit width of the union field with the largest bit width.
Non-ABI-aligned fields are packed into the smallest possible ABI-aligned integers in accordance with the target endianness.
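Some of the layout rules in the list above can be checked with @bitSizeOf and @sizeOf. This is a small sketch with hypothetical types, not an exhaustive demonstration of every rule.

```zig
const std = @import("std");
const expect = std.testing.expect;

const Mode = enum(u2) { idle, read, write };

const Flags = packed struct {
    a: bool, // 1 bit
    b: bool, // 1 bit
    c: u6, // 6 bits
};

const Control = packed struct {
    mode: Mode, // 2 bits, the width of the enum's tag type
    ready: bool, // 1 bit
    channel: u5, // 5 bits
};

test "packed struct bit widths" {
    try expect(@bitSizeOf(Flags) == 8);
    try expect(@sizeOf(Flags) == 1);
    try expect(@bitSizeOf(Control) == 8);
}
```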
This means that a packed struct can participate in a @bitCast or a @ptrCast to reinterpret memory. This even works at comptime:
Zig allows the address to be taken of a non-byte-aligned field:
However, the pointer to a non-byte-aligned field has special properties and cannot be passed when a normal pointer is expected:
In this case, the function bar cannot be called because the pointer to the non-ABI-aligned field mentions the bit offset, but the function expects an ABI-aligned pointer.
Pointers to non-ABI-aligned fields share the same address as the other fields within their host integer:
Packed structs have 1-byte alignment. However if you have an overaligned pointer to a packed struct, Zig should correctly understand the alignment of fields. However there is a bug:
When this bug is fixed, the above test in the documentation will unexpectedly pass, which will cause the test suite to fail, notifying the bug fixer to update these docs.
It's also possible to set alignment of struct fields:
Using packed structs with volatile is problematic, and may be a compile error in the future. For details on this subscribe to this issue. TODO update these docs with a recommendation on how to use packed structs with MMIO (the use case for volatile packed structs) once this issue is resolved. Don't worry, there will be a good solution for this use case in zig.
Struct Naming
Since all structs are anonymous, Zig infers the type name based on a few rules.
If the struct is in the initialization expression of a variable, it gets named after that variable.
If the struct is in the return expression, it gets named after the function it is returning from, with the parameter values serialized.
Otherwise, the struct gets a name such as (anonymous struct at file.zig:7:38).
Anonymous Struct Literals
Zig allows omitting the struct type of a literal. When the result is coerced, the struct literal will directly instantiate the result location, with no copy:
The struct type can be inferred. Here the result location does not include a type, and so Zig infers the type:
Anonymous structs can be created without specifying field names, and are referred to as "tuples".
The fields are implicitly named using numbers starting from 0. Because their names are integers, the @"0" syntax must be used to access them. Names inside @"" are always recognised as identifiers.
Like arrays, tuples have a .len field, can be indexed and work with the ++ and ** operators. They can also be iterated over with inline for.
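A sketch of an anonymous struct literal coerced to a named type, followed by a tuple accessed by field name, by index, and through inline for. Point is a hypothetical type.

```zig
const std = @import("std");
const expect = std.testing.expect;

const Point = struct { x: i32, y: i32 };

test "anonymous struct literal" {
    // The literal takes its type from the result location.
    const pt: Point = .{ .x = 13, .y = 67 };
    try expect(pt.x == 13 and pt.y == 67);
}

test "tuple" {
    const values = .{ @as(u32, 1234), @as(f64, 12.34), true };
    try expect(values.len == 3);
    try expect(values.@"0" == 1234); // fields are named "0", "1", ...
    try expect(values[0] == 1234); // indexing works as with arrays

    var bool_count: usize = 0;
    inline for (values) |v| {
        // Each iteration sees a different element type.
        if (@TypeOf(v) == bool) bool_count += 1;
    }
    try expect(bool_count == 1);
}
```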
By default, enums are not guaranteed to be compatible with the C ABI:
For a C-ABI-compatible enum, provide an explicit tag type to the enum:
Enum Literals
Enum literals allow specifying the name of an enum field without specifying the enum type:
Non-exhaustive enum
A Non-exhaustive enum can be created by adding a trailing '_' field. It must specify a tag type and cannot consume every enumeration value.
@intToEnum on a non-exhaustive enum involves the safety semantics of @intCast to the integer tag type, but beyond that always results in a well-defined enum value.
A switch on a non-exhaustive enum can include a '_' prong as an alternative to an else prong with the difference being that it makes it a compile error if all the known tag names are not handled by the switch.
union
A bare union defines a set of possible types that a value can be as a list of fields. Only one field can be active at a time. The in-memory representation of bare unions is not guaranteed. Bare unions cannot be used to reinterpret memory. For that, use @ptrCast, or use an extern union or a packed union which have guaranteed in-memory layout. Accessing the non-active field is safety-checked Undefined Behavior:
You can activate another field by assigning the entire union:
To initialize a union when the tag is a comptime-known name, see @unionInit.
Tagged union
Unions can be declared with an enum tag type. This turns the union into a tagged union, which makes it eligible to use with switch expressions. Tagged unions coerce to their tag type: Type Coercion: unions and enums.
In order to modify the payload of a tagged union in a switch expression, place a * before the variable name to make it a pointer:
Unions can be made to infer the enum tag type. Further, unions can have methods just like structs and enums.
@tagName can be used to return a comptime [:0]const u8 value representing the field name:
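The sketch below combines an inferred tag type, a method, a pointer capture in switch, and @tagName. Value is a hypothetical type, and std.meta.activeTag is used to read the active tag.

```zig
const std = @import("std");
const expect = std.testing.expect;

const Value = union(enum) {
    int: i32,
    boolean: bool,
    none,

    // Unions can have methods just like structs and enums.
    fn isTruthy(self: Value) bool {
        return switch (self) {
            .int => |n| n != 0,
            .boolean => |b| b,
            .none => false,
        };
    }
};

test "tagged union" {
    var v = Value{ .int = 41 };
    switch (v) {
        // The * makes the capture a pointer, so the payload is modified in place.
        .int => |*n| n.* += 1,
        else => unreachable,
    }
    try expect(v.int == 42);
    try expect(std.meta.activeTag(v) == .int);
    try expect(std.mem.eql(u8, @tagName(v), "int"));
    try expect(v.isTruthy());
}
```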
extern union
An extern union has memory layout guaranteed to be compatible with the target C ABI.
Identifiers are never allowed to "hide" other identifiers by using the same name:
Because of this, when you read Zig code you can always rely on an identifier to consistently mean the same thing within the scope it is defined. Note that you can, however, use the same name if the scopes are separate:
switch
switch can be used to capture the field values of a Tagged union. Modifications to the field values can be done by placing a * before the capture variable name, turning it into a pointer.
When a switch expression does not have an else clause, it must exhaustively list all the possible values. Failure to do so is a compile error:
Switching with Enum Literals
Enum Literals can be useful to use with switch to avoid repetitively specifying enum or union types:
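A sketch of an exhaustive switch written with enum literals. Color and isOn are hypothetical names. Removing any prong would be a compile error because there is no else clause.

```zig
const std = @import("std");

const Color = enum { auto, off, on };

fn isOn(color: Color) bool {
    // .auto, .off and .on are enum literals; the type Color is inferred.
    return switch (color) {
        .auto, .off => false,
        .on => true,
    };
}

test "switch with enum literals" {
    try std.testing.expect(isOn(.on));
    try std.testing.expect(!isOn(.auto));
}
```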
while
A while loop is used to repeatedly execute an expression until some condition is no longer true.
Use break to exit a while loop early.
Use continue to jump back to the beginning of the loop.
While loops support a continue expression which is executed when the loop is continued. The continue keyword respects this expression.
While loops are expressions. The result of the expression is the result of the else clause of a while loop, which is executed when the condition of the while loop is tested as false.
break, like return, accepts a value parameter. This is the result of the while expression. When you break from a while loop, the else branch is not evaluated.
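These pieces fit together in a short search loop: a continue expression, a break with a value, and an else branch that supplies the result when the condition becomes false.

```zig
const std = @import("std");

fn rangeHasNumber(begin: usize, end: usize, number: usize) bool {
    var i = begin;
    // (i += 1) is the continue expression, run after each iteration.
    return while (i < end) : (i += 1) {
        if (i == number) break true; // result of the while expression
    } else false; // evaluated only when the loop ends without break
}

test "while as an expression" {
    try std.testing.expect(rangeHasNumber(0, 10, 5));
    try std.testing.expect(!rangeHasNumber(0, 10, 15));
}
```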
Labeled while
When a while loop is labeled, it can be referenced from a break or continue from within a nested loop:
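In this sketch, continue :outer abandons the inner loop and moves the outer loop to its next iteration, running the outer continue expression on the way. stepsBelowDiagonal is a hypothetical function.

```zig
const std = @import("std");

// Counts inner-loop steps, stopping each row once j reaches i.
fn stepsBelowDiagonal(limit: u32) u32 {
    var steps: u32 = 0;
    var i: u32 = 0;
    outer: while (i < limit) : (i += 1) {
        var j: u32 = 0;
        while (j < limit) : (j += 1) {
            steps += 1;
            if (j == i) continue :outer; // skip the rest of this row
        }
    }
    return steps;
}

test "labeled continue" {
    // Rows take 1, 2 and 3 steps respectively.
    try std.testing.expect(stepsBelowDiagonal(3) == 6);
}
```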
while with Optionals
Just like if expressions, while loops can take an optional as the condition and capture the payload. When null is encountered the loop exits.
When the |x| syntax is present on a while expression, the while condition must have an Optional Type.
The else branch is allowed on optional iteration. In this case, it will be executed on the first null value encountered.
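A sketch of optional iteration with a hypothetical Countdown iterator: the loop body runs for each payload and stops at the first null.

```zig
const std = @import("std");

const Countdown = struct {
    remaining: u32,

    // Yields remaining-1, ..., 1, 0 and then null.
    fn next(self: *Countdown) ?u32 {
        if (self.remaining == 0) return null;
        self.remaining -= 1;
        return self.remaining;
    }
};

fn sumCountdown(start: u32) u32 {
    var it = Countdown{ .remaining = start };
    var sum: u32 = 0;
    // |value| captures the payload of the optional.
    while (it.next()) |value| {
        sum += value;
    }
    return sum;
}

test "while with optionals" {
    try std.testing.expect(sumCountdown(3) == 3); // 2 + 1 + 0
}
```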
while with Error Unions
Just like if expressions, while loops can take an error union as the condition and capture the payload or the error code. When the condition results in an error code the else branch is evaluated and the loop is finished.
When the else |x| syntax is present on a while expression, the while condition must have an Error Union Type.
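The error union form looks similar; the else branch receives the error that ended the loop. Source and drain are hypothetical names.

```zig
const std = @import("std");

const Source = struct {
    remaining: u32,

    // Yields remaining-1, ..., 1, 0 and then fails with error.ReachedZero.
    fn next(self: *Source) error{ReachedZero}!u32 {
        if (self.remaining == 0) return error.ReachedZero;
        self.remaining -= 1;
        return self.remaining;
    }
};

fn drain(start: u32) u32 {
    var src = Source{ .remaining = start };
    var sum: u32 = 0;
    while (src.next()) |value| {
        sum += value;
    } else |err| {
        // The loop finishes here once an error code is produced.
        std.debug.assert(err == error.ReachedZero);
    }
    return sum;
}

test "while with error unions" {
    try std.testing.expect(drain(4) == 6); // 3 + 2 + 1 + 0
}
```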
inline while
While loops can be inlined. This causes the loop to be unrolled, which allows the code to do some things which only work at compile time, such as use types as first class values.
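In the sketch below each unrolled iteration selects a different type, which only works because the loop counter is known at compile time. typeNameLength is a helper defined for this example.

```zig
const std = @import("std");

fn typeNameLength(comptime T: type) usize {
    return @typeName(T).len;
}

test "inline while loop" {
    comptime var i = 0;
    var sum: usize = 0;
    inline while (i < 3) : (i += 1) {
        // T is a type, used here as a first class value.
        const T = switch (i) {
            0 => f32,
            1 => i8,
            2 => bool,
            else => unreachable,
        };
        sum += typeNameLength(T);
    }
    // "f32" + "i8" + "bool" = 3 + 2 + 4
    try std.testing.expect(sum == 9);
}
```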
It is recommended to use inline loops only for one of these reasons:
You need the loop to execute at comptime for the semantics to work.
You have a benchmark to prove that forcibly unrolling the loop in this way is measurably faster.
When a for loop is labeled, it can be referenced from a break or continue from within a nested loop:
inline for
For loops can be inlined. This causes the loop to be unrolled, which allows the code to do some things which only work at compile time, such as use types as first class values. The capture value and iterator value of inlined for loops are compile-time known.
It is recommended to use inline loops only for one of these reasons:
You need the loop to execute at comptime for the semantics to work.
You have a benchmark to prove that forcibly unrolling the loop in this way is measurably faster.
In Debug and ReleaseSafe mode, and when using zig test, unreachable emits a call to panic with the message reached unreachable code.
In ReleaseFast mode, the optimizer uses the assumption that unreachable code will never be hit to perform optimizations. However, zig test even in ReleaseFast mode still emits unreachable as calls to panic.
Basics
In fact, this is how std.debug.assert is implemented:
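A minimal sketch of an assertion built on unreachable. The name assertSketch is hypothetical, chosen to avoid implying that this is the exact standard library source.

```zig
const std = @import("std");

// If ok is false, control reaches `unreachable`: a panic in Debug and
// ReleaseSafe modes, and an optimizer assumption in ReleaseFast mode.
fn assertSketch(ok: bool) void {
    if (!ok) unreachable; // assertion failure
}

test "assertion that holds" {
    assertSketch(1 + 1 == 2);
}
```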
When resolving types together, such as if clauses or switch prongs, the noreturn type is compatible with every other type. Consider:
Another use case for noreturn is the exit function:
Functions
Function values are like pointers:
Pass-by-value Parameters
Primitive types such as Integers and Floats passed as parameters are copied, and then the copy is available in the function body. This is called "passing by value". Copying a primitive type is essentially free and typically involves nothing more than setting a register.
Structs, unions, and arrays can sometimes be more efficiently passed as a reference, since a copy could be arbitrarily expensive depending on the size. When these types are passed as parameters, Zig may choose to copy and pass by value, or pass by reference, whichever way Zig decides will be faster. This is made possible, in part, by the fact that parameters are immutable.
For extern functions, Zig follows the C ABI for passing structs and unions by value.
Function Parameter Type Inference
Function parameters can be declared with anytype in place of the type. In this case the parameter types will be inferred when the function is called. Use @TypeOf and @typeInfo to get information about the inferred type.
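As an illustration, the hypothetical function addFortyTwo below accepts any type that supports addition with an integer literal, and uses @TypeOf to name its return type.

```zig
const std = @import("std");

// The parameter type is inferred at each call site; @TypeOf(x) recovers it.
fn addFortyTwo(x: anytype) @TypeOf(x) {
    return x + 42;
}

test "anytype parameter" {
    std.debug.assert(addFortyTwo(@as(i32, 1)) == 43);
    std.debug.assert(@TypeOf(addFortyTwo(@as(i64, 1))) == i64);
}
```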
Function Reflection
Errors
Error Set Type
An error set is like an enum. However, each error name across the entire compilation gets assigned an unsigned integer greater than 0. You are allowed to declare the same error name more than once, and if you do, it gets assigned the same integer value.
The number of unique error values across the entire compilation should determine the size of the error set type. However right now it is hard coded to be a u16. See #768.
You can coerce an error from a subset to a superset:
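A short sketch of the subset-to-superset coercion. The error set names and the function widen are illustrative assumptions.

```zig
const std = @import("std");

const FileOpenError = error{ AccessDenied, OutOfMemory, FileNotFound };
const AllocationError = error{OutOfMemory};

// AllocationError is a subset of FileOpenError, so the value coerces implicitly.
fn widen(err: AllocationError) FileOpenError {
    return err;
}

test "coerce subset to superset" {
    std.debug.assert(widen(AllocationError.OutOfMemory) == FileOpenError.OutOfMemory);
}
```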
But you cannot coerce an error from a superset to a subset:
There is a shortcut for declaring an error set with only 1 value, and then getting that value:
anyerror refers to the global error set. This is the error set that contains all errors in the entire compilation unit. It is a superset of all other error sets and a subset of none of them.
You can coerce any error set to the global one, and you can explicitly cast an error of the global error set to a non-global one. This inserts a language-level assert to make sure the error value is in fact in the destination error set.
The global error set should generally be avoided because it prevents the compiler from knowing what errors are possible at compile-time. Knowing the error set at compile-time is better for generated documentation and helpful error messages, such as forgetting a possible error value in a switch.
Error Union Type
An error set type and normal type can be combined with the ! binary operator to form an error union type. You are likely to use an error union type more often than an error set type by itself.
Here is a function to parse a string into a 64-bit integer:
Notice the return type is !u64. This means that the function either returns an unsigned 64 bit integer, or an error. We left off the error set to the left of the !, so the error set is inferred.
Within the function definition, you can see some return statements that return an error, and at the bottom a return statement that returns a u64. Both types coerce to anyerror!u64.
What it looks like to use this function varies depending on what you're trying to do. One of the following:
You want to provide a default value if it returned an error.
If it returned an error then you want to return the same error.
You know with complete certainty it will not return an error, so want to unconditionally unwrap it.
You want to take a different action for each possible error.
If you want to provide a default value, you can use the catch binary operator:
In this code, number will be equal to the successfully parsed string, or a default value of 13. The type of the right hand side of the binary catch operator must match the unwrapped error union type, or be of type noreturn.
Let's say you wanted to return the error if you got one, otherwise continue with the function logic:
There is a shortcut for this. The try expression:
try evaluates an error union expression. If it is an error, it returns from the current function with the same error. Otherwise, the expression results in the unwrapped value.
Maybe you know with complete certainty that an expression will never be an error. In this case you can do this:
const number = parseU64("1234", 10) catch unreachable;
Here we know for sure that "1234" will parse successfully. So we put the unreachable value on the right hand side. unreachable generates a panic in Debug and ReleaseSafe modes and undefined behavior in ReleaseFast mode. So, while we're debugging the application, if there was a surprise error here, the application would crash appropriately.
Finally, you may want to take a different action for every situation. For that, we combine the if and switch expression:
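The four approaches listed above can be sketched together. parseDecimal is a hypothetical, simplified stand-in for the parseU64 function discussed in this section (base 10 only, no overflow detection), and the wrapper function names are likewise assumptions made for illustration.

```zig
const std = @import("std");

const ParseError = error{ Empty, InvalidDigit };

// Hypothetical stand-in for parseU64: base 10 only, no overflow detection.
fn parseDecimal(text: []const u8) ParseError!u64 {
    if (text.len == 0) return error.Empty;
    var result: u64 = 0;
    for (text) |c| {
        if (c < '0' or c > '9') return error.InvalidDigit;
        result = result * 10 + (c - '0');
    }
    return result;
}

// 1. Provide a default value with catch.
fn parseOrDefault(text: []const u8) u64 {
    return parseDecimal(text) catch 13;
}

// 2. Return the same error to the caller with try.
fn parseAndDouble(text: []const u8) ParseError!u64 {
    const number = try parseDecimal(text);
    return number * 2;
}

// 3. Unconditionally unwrap when an error is known to be impossible.
fn parseKnownGood() u64 {
    return parseDecimal("1234") catch unreachable;
}

// 4. Take a different action for each error with if and switch.
fn describe(text: []const u8) []const u8 {
    if (parseDecimal(text)) |number| {
        if (number == 0) return "zero";
        return "nonzero";
    } else |err| switch (err) {
        error.Empty => return "empty input",
        error.InvalidDigit => return "not a number",
    }
}

test "four ways to handle an error union" {
    std.debug.assert(parseOrDefault("abc") == 13);
    std.debug.assert((parseAndDouble("21") catch 0) == 42);
    std.debug.assert(parseKnownGood() == 1234);
    std.debug.assert(std.mem.eql(u8, describe(""), "empty input"));
}
```

Because the switch in describe lists every member of ParseError, adding a new error to the set would produce a compile error there until the new case is handled.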
The other component to error handling is defer statements. In addition to an unconditional defer, Zig has errdefer, which evaluates the deferred expression on block exit path if and only if the function returned with an error from the block.
Example:
The neat thing about this is that you get robust error handling without the verbosity and cognitive overhead of trying to make sure every exit path is covered. The deallocation code is always directly following the allocation code.
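A minimal sketch of errdefer that avoids a real allocator: the hypothetical counter open_handles stands in for an acquired resource, and acquire, release, and setUp are illustrative names. The cleanup sits directly after the acquisition and runs only on the error path.

```zig
const std = @import("std");

// Hypothetical resource bookkeeping, standing in for a real allocation.
var open_handles: u32 = 0;

fn acquire() void {
    open_handles += 1;
}

fn release() void {
    open_handles -= 1;
}

fn setUp(fail: bool) error{SetupFailed}!void {
    acquire();
    // Runs only if this function returns an error after this point.
    errdefer release();

    if (fail) return error.SetupFailed;
    // On success the resource stays acquired for the caller.
}

test "errdefer releases only on the error path" {
    open_handles = 0;
    setUp(true) catch {};
    std.debug.assert(open_handles == 0);
    setUp(false) catch unreachable;
    std.debug.assert(open_handles == 1);
}
```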
A couple of other tidbits about error handling:
These primitives give enough expressiveness that it's completely practical to have failing to check for an error be a compile error. If you really want to ignore the error, you can add catch unreachable and get the added benefit of crashing in Debug and ReleaseSafe modes if your assumption was wrong.
Since Zig understands error types, it can pre-weight branches in favor of errors not occurring. Just a small optimization benefit that is not available in other languages.
Use the || operator to merge two error sets together. The resulting error set contains the errors of both error sets. Doc comments from the left-hand side override doc comments from the right-hand side. In this example, the doc comment for C.PathNotFound is A doc comment.
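A sketch of merging error sets, using sets named A, B, and C as in the text above. The function failsWithNotDir is a hypothetical helper added for the test.

```zig
const std = @import("std");

const A = error{
    NotDir,

    /// A doc comment
    PathNotFound,
};
const B = error{
    OutOfMemory,

    /// B doc comment
    PathNotFound,
};

// C contains NotDir, PathNotFound, and OutOfMemory.
const C = A || B;

fn failsWithNotDir() C!void {
    return error.NotDir;
}

test "merged error set" {
    if (failsWithNotDir()) {
        @panic("unexpected success");
    } else |err| switch (err) {
        error.OutOfMemory => @panic("unexpected error"),
        error.PathNotFound => @panic("unexpected error"),
        error.NotDir => {},
    }
}
```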
This is especially useful for functions which return different error sets depending on comptime branches. For example, the Zig standard library uses LinuxFileOpenError || WindowsFileOpenError for the error set of opening files.
Because many functions in Zig return a possible error, Zig supports inferring the error set. To infer the error set for a function, prepend the ! operator to the function’s return type, like !T:
When a function has an inferred error set, that function becomes generic and thus it becomes trickier to do certain things with it, such as obtain a function pointer, or have an error set that is consistent across different build targets. Additionally, inferred error sets are incompatible with recursion.
In these situations, it is recommended to use an explicit error set. You can generally start with an empty error set and let compile errors guide you toward completing the set.
These limitations may be overcome in a future version of Zig.
Error Return Traces
Error Return Traces show all the points in the code that an error was returned to the calling function. This makes it practical to use try everywhere and then still be able to know what happened if an error ends up bubbling all the way out of your application.
Look closely at this example. This is no stack trace.
You can see that the final error bubbled up was PermissionDenied, but the original error that started this whole thing was FileNotFound. In the bar function, the code handles the original error code, and then returns another one, from the switch statement. Error Return Traces make this clear, whereas a stack trace would look like this:
Here, the stack trace does not explain how the control flow in bar got to the hello() call. One would have to open a debugger or further instrument the application in order to find out. The error return trace, on the other hand, shows exactly how the error bubbled up.
This debugging feature makes it easier to iterate quickly on code that robustly handles all error conditions. This means that Zig developers will naturally find themselves writing correct, robust code in order to increase their development pace.
There are a few ways to activate this error return tracing feature:
Return an error from main
An error makes its way to catch unreachable and you have not overridden the default panic handler
Use errorReturnTrace to access the current return trace. You can use std.debug.dumpStackTrace to print it. This function returns comptime-known null when building without error return tracing support.
For the case when no errors are returned, the cost is a single memory write operation, only in the first non-failable function in the call graph that calls a failable function, i.e. when a function returning void calls a function returning error. This is to initialize this struct in the stack memory:
Here, N is the maximum function call depth as determined by call graph analysis. Recursion is ignored and counts for 2.
A pointer to StackTrace is passed as a secret parameter to every function that can return an error, but it's always the first parameter, so it can likely sit in a register and stay there.
That's it for the path when no errors occur. It's practically free in terms of performance.
When generating the code for a function that returns an error, just before the return statement (only for the return statements that return errors), Zig generates a call to this function:
The cost is 2 math operations plus some memory reads and writes. The memory accessed is constrained and should remain cached for the duration of the error return bubbling.
As for code size cost, 1 function call before a return statement is no big deal. Even so, I have a plan to make the call to __zig_return_error a tail call, which brings the code size cost down to actually zero. What is a return statement in code without error return tracing can become a jump instruction in code with error return tracing.
Optionals
One area where Zig provides safety without compromising efficiency or readability is the optional type.
The question mark symbolizes the optional type. You can convert a type to an optional type by putting a question mark in front of it, like this:
Now the variable optional_int could be an i32, or null.
Instead of integers, let's talk about pointers. Null references are the source of many runtime exceptions, and even stand accused of being the worst mistake of computer science.
Zig does not have them.
Instead, you can use an optional pointer. This secretly compiles down to a normal pointer, since we know we can use 0 as the null value for the optional type. But the compiler can check your work and make sure you don't assign null to something that can't be null.
Typically the downside of not having null is that it makes the code more verbose to write. But, let's compare some equivalent C code and Zig code.
Task: call malloc, if the result is null, return null.
C code
Zig code
Here, Zig is at least as convenient, if not more, than C. And, the type of "ptr" is *u8 not ?*u8. The orelse keyword unwrapped the optional type and therefore ptr is guaranteed to be non-null everywhere it is used in the function.
The other form of checking against NULL you might see looks like this:
In Zig you can accomplish the same thing:
Once again, the notable thing here is that inside the if block, foo is no longer an optional pointer, it is a pointer, which cannot be null.
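Both unwrapping forms can be sketched without calling into C. The functions below are hypothetical: firstEven plays the role of a call that may return null, and the two callers unwrap its result with orelse and with an if capture.

```zig
const std = @import("std");

// Hypothetical lookup: a pointer to the first even element, or null if there is none.
fn firstEven(items: []const u32) ?*const u32 {
    for (items) |*item| {
        if (item.* % 2 == 0) return item;
    }
    return null;
}

fn firstEvenOrZero(items: []const u32) u32 {
    // orelse unwraps the optional; ptr has type *const u32, not ?*const u32.
    const ptr = firstEven(items) orelse return 0;
    return ptr.*;
}

fn firstEvenPlusOne(items: []const u32) u32 {
    // Inside the if block, ptr is a plain pointer that cannot be null.
    if (firstEven(items)) |ptr| {
        return ptr.* + 1;
    }
    return 0;
}

test "unwrapping optional pointers" {
    const items = [_]u32{ 3, 4, 6 };
    std.debug.assert(firstEvenOrZero(&items) == 4);
    std.debug.assert(firstEvenPlusOne(&items) == 5);
}
```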
One benefit to this is that functions which take pointers as arguments can be annotated with the "nonnull" attribute - __attribute__((nonnull)) in GCC. The optimizer can sometimes make better decisions knowing that pointer arguments cannot be null.
Optional Type
An optional is created by putting ? in front of a type. You can use compile-time reflection to access the child type of an optional:
null
Just like undefined, null has its own type, and the only way to use it is to cast it to a different type:
Optional Pointers
An optional pointer is guaranteed to be the same size as a pointer. The null of the optional is guaranteed to be address 0.
Casting
A type cast converts a value of one type to another. Zig has Type Coercion for conversions that are known to be completely safe and unambiguous, and Explicit Casts for conversions that one would not want to happen by accident. There is also a third kind of type conversion called Peer Type Resolution for the case when a result type must be decided given multiple operand types.
Type Coercion
Type coercion occurs when one type is expected, but a different type is provided:
Type coercions are only allowed when it is completely unambiguous how to get from one type to another, and the transformation is guaranteed to be safe. There is one exception, which is C Pointers.
Values which have the same representation at runtime can be cast to increase the strictness of the qualifiers, no matter how nested the qualifiers are:
Integers coerce to integer types which can represent every value of the old type, and likewise Floats coerce to float types which can represent every value of the old type.
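A small sketch of integer and float widening. The function names are hypothetical; each body relies on coercion at the return statement, with no explicit cast.

```zig
const std = @import("std");

// Every u8 value is representable in u32, so the coercion is implicit.
fn widenInt(x: u8) u32 {
    return x;
}

// Every u8 value also fits in i16, so unsigned-to-signed widening is allowed.
fn widenToSigned(x: u8) i16 {
    return x;
}

// Every f32 value is representable in f64.
fn widenFloat(x: f32) f64 {
    return x;
}

test "widening coercions" {
    std.debug.assert(widenInt(250) == 250);
    std.debug.assert(widenToSigned(250) == 250);
    std.debug.assert(widenFloat(1.5) == 1.5);
}
```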
Tagged unions can be coerced to enums, and enums can be coerced to tagged unions when they are comptime-known to be a field of the union that has only one possible value, such as void:
Explicit casts are performed via Builtin Functions. Some explicit casts are safe; some are not. Some explicit casts perform language-level assertions; some do not. Some explicit casts are no-ops at runtime; some are not.
@bitCast - change type but maintain bit representation
These types can only ever have one possible value, and thus require 0 bits to represent. Code that makes use of these types is not included in the final generated code:
When this turns into machine code, there is no code generated in the body of entry, even in Debug mode. For example, on x86_64:
These assembly instructions do not have any code associated with the void values - they only perform the function call prologue and epilogue.
void
void can be useful for instantiating generic types. For example, given a Map(Key, Value), one can pass void for the Value type to make it into a Set:
Note that this is different from using a dummy value for the hash map value. By using void as the type of the value, the hash map entry type has no value field, and thus the hash map takes up less space. Further, all the code that deals with storing and loading the value is deleted, as seen above.
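As a sketch of the set idiom, the example below instantiates std.AutoHashMap with void as the value type. The alias IntSet is a hypothetical name; the exact allocator interface has changed between Zig releases, so treat the setup lines as an assumption about the standard library rather than a fixed API.

```zig
const std = @import("std");

// A set of i32 keys: the value type is void, so entries store no value.
const IntSet = std.AutoHashMap(i32, void);

test "void value type turns a map into a set" {
    var set = IntSet.init(std.testing.allocator);
    defer set.deinit();

    try set.put(1, {});
    try set.put(2, {});
    try set.put(2, {}); // inserting an existing key does not grow the set

    std.debug.assert(set.contains(2));
    std.debug.assert(!set.contains(3));
    std.debug.assert(set.count() == 2);
}
```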
void is distinct from c_void. void has a known size of 0 bytes, and c_void has an unknown, but non-zero, size.
Expressions of type void are the only ones whose value can be ignored. For example:
However, if the expression has type void, there will be no error. Function return values can also be explicitly ignored by assigning them to _.
Pointers to Zero Bit Types
Pointers to zero bit types also have zero bits. They always compare equal to each other:
The type being pointed to can only ever be one value; therefore loads and stores are never generated. @ptrToInt and @intToPtr are not allowed:
usingnamespace is a declaration that mixes all the public declarations of the operand, which must be a struct, union, enum, or opaque, into the namespace:
usingnamespace has an important use case when organizing the public API of a file or package. For example, one might have c.zig with all of the C imports:
The above example demonstrates using pub to qualify the usingnamespace additionally makes the imported declarations pub. This can be used to forward declarations, giving precise control over what declarations a given file exposes.
comptime
Zig places importance on the concept of whether an expression is known at compile-time. There are a few different places this concept is used, and these building blocks are used to keep the language small, readable, and powerful.
Compile-time parameters are how Zig implements generics. It is compile-time duck typing.
In Zig, types are first-class citizens. They can be assigned to variables, passed as parameters to functions, and returned from functions. However, they can only be used in expressions which are known at compile-time, which is why the parameter T in the above snippet must be marked with comptime.
A comptime parameter means that:
At the callsite, the value must be known at compile-time, or it is a compile error.
In the function definition, the value is known at compile-time.
For example, if we were to introduce another function to the above snippet:
This is an error because the programmer attempted to pass a value only known at run-time to a function which expects a value known at compile-time.
Another way to get an error is if we pass a type that violates the type checker when the function is analyzed. This is what it means to have compile-time duck typing.
For example:
On the flip side, inside the function definition with the comptime parameter, the value is known at compile-time. This means that we actually could make this work for the bool type if we wanted to:
This works because Zig implicitly inlines if expressions when the condition is known at compile-time, and the compiler guarantees that it will skip analysis of the branch not taken.
This means that the actual function generated for max in this situation looks like this:
All the code that dealt with compile-time known values is eliminated and we are left with only the necessary run-time code to accomplish the task.
This works the same way for switch expressions - they are implicitly inlined when the target expression is compile-time known.
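The discussion above can be condensed into one sketch of a generic max function with a comptime type parameter. The bool branch is analyzed only when T is bool, so the a > b comparison is never checked against bool operands.

```zig
const std = @import("std");

fn max(comptime T: type, a: T, b: T) T {
    if (T == bool) {
        // Only analyzed when T is bool.
        return a or b;
    } else if (a > b) {
        return a;
    } else {
        return b;
    }
}

test "generic max" {
    std.debug.assert(max(f32, 1.5, 2.5) == 2.5);
    std.debug.assert(max(i32, -1, -5) == -1);
    std.debug.assert(max(bool, false, true) == true);
}
```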
In Zig, the programmer can label variables as comptime. This guarantees to the compiler that every load and store of the variable is performed at compile-time. Any violation of this results in a compile error.
This combined with the fact that we can inline loops allows us to write a function which is partially evaluated at compile-time and partially at run-time.
For example:
This example is a bit contrived, because the compile-time evaluation component is unnecessary; this code would work fine if it was all done at run-time. But it does end up generating different code. In this example, the function performFn is generated three different times, for the different values of prefix_char provided:
Note that this happens even in a debug build; in a release build these generated functions still pass through rigorous LLVM optimizations. The important thing to note, however, is not that this is a way to write more optimized code, but that it is a way to make sure that what should happen at compile-time, does happen at compile-time. This catches more errors and as demonstrated later in this article, allows expressiveness that in other languages requires using macros, generated code, or a preprocessor to accomplish.
In Zig, it matters whether a given expression is known at compile-time or run-time. A programmer can use a comptime expression to guarantee that the expression will be evaluated at compile-time. If this cannot be accomplished, the compiler will emit an error. For example:
It doesn't make sense that a program could call exit() (or any other external function) at compile-time, so this is a compile error. However, a comptime expression does much more than sometimes cause a compile error.
Within a comptime expression:
All variables are comptime variables.
All if, while, for, and switch expressions are evaluated at compile-time, or emit a compile error if this is not possible.
All function calls cause the compiler to interpret the function at compile-time, emitting a compile error if the function tries to do something that has global run-time side effects.
This means that a programmer can create a function which is called both at compile-time and run-time, with no modification to the function required.
Let's look at an example:
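A sketch of one function exercised both at run-time and inside a comptime block. The fibonacci function is written here for illustration; the same source is interpreted by the compiler in the comptime block and compiled normally for the run-time call.

```zig
const std = @import("std");

fn fibonacci(index: u32) u32 {
    if (index < 2) return index;
    return fibonacci(index - 1) + fibonacci(index - 2);
}

test "fibonacci" {
    // Evaluated at run-time.
    std.debug.assert(fibonacci(7) == 13);

    // Evaluated at compile-time; a failed assertion here is a compile error.
    comptime {
        std.debug.assert(fibonacci(7) == 13);
    }
}
```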
Imagine if we had forgotten the base case of the recursive function and tried to run the tests:
The compiler produces an error which is a stack trace from trying to evaluate the function at compile-time.
Luckily, we used an unsigned integer, and so when we tried to subtract 1 from 0, it triggered undefined behavior, which is always a compile error if the compiler knows it happened. But what would have happened if we used a signed integer?
The compiler noticed that evaluating this function at compile-time took a long time, and thus emitted a compile error and gave up. If the programmer wants to increase the budget for compile-time computation, they can use a built-in function called @setEvalBranchQuota to change the default number 1000 to something else.
What if we fix the base case, but put the wrong value in the expect line?
What happened is Zig started interpreting the expect function with the parameter ok set to false. When the interpreter hit @panic it emitted a compile error, because a panic detected at compile-time is reported as a compile error.
At container level (outside of any function), all expressions are implicitly comptime expressions. This means that we can use functions to initialize complex static data. For example:
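A minimal sketch of container-level initialization, assuming a hypothetical table named squares. The constant squares_total is computed by calling an ordinary function, and the same function is then called at run-time.

```zig
const std = @import("std");

fn sum(numbers: []const i32) i32 {
    var result: i32 = 0;
    for (numbers) |x| {
        result += x;
    }
    return result;
}

const squares = [_]i32{ 1, 4, 9, 16 };

// Container-level initializer: implicitly a comptime expression.
const squares_total = sum(&squares);

test "container-level initialization" {
    comptime {
        std.debug.assert(squares_total == 30);
    }

    // The same function works on values that are only known at run-time.
    var runtime_values = [_]i32{ 2, 3 };
    runtime_values[0] = 5;
    std.debug.assert(sum(&runtime_values) == 8);
}
```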
When we compile this program, Zig generates the constants with the answer pre-computed. Here are the lines from the generated LLVM IR:
Note that we did not have to do anything special with the syntax of these functions. For example, we could call the sum function as is with a slice of numbers whose length and values were only known at run-time.
Generic Data Structures
Zig uses these capabilities to implement generic data structures without introducing any special-case syntax. If you followed along so far, you may already know how to create a generic data structure.
Here is an example of a generic List data structure, that we will instantiate with the type i32. In Zig we refer to the type as List(i32).
That's it. It's a function that returns an anonymous struct. For the purposes of error messages and debugging, Zig infers the name "List(i32)" from the function name and parameters invoked when creating the anonymous struct.
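A sketch of such a generic List, kept deliberately small: it only holds a slice and a length, and the caller supplies the backing buffer.

```zig
const std = @import("std");

// A function that takes a type and returns a new struct type.
fn List(comptime T: type) type {
    return struct {
        items: []T,
        len: usize,
    };
}

test "instantiate List(i32)" {
    var buffer: [10]i32 = undefined;
    var list = List(i32){
        .items = &buffer,
        .len = 0,
    };

    list.items[0] = 1234;
    list.len += 1;

    std.debug.assert(list.items[0] == 1234);
    std.debug.assert(list.len == 1);
}
```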
To keep the language small and uniform, all aggregate types in Zig are anonymous. To give a type a name, we assign it to a constant:
This works because all top level declarations are order-independent, and as long as there isn't an actual infinite regression, values can refer to themselves, directly or indirectly. In this case, Node refers to itself as a pointer, which is not actually an infinite regression, so it works fine.
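A short sketch of a named struct that refers to itself through a pointer. The field names are illustrative assumptions.

```zig
const std = @import("std");

// Node refers to itself only through an optional pointer, so its size is finite.
const Node = struct {
    next: ?*Node,
    name: []const u8,
};

test "self-referential struct" {
    var tail = Node{ .next = null, .name = "tail" };
    const head = Node{ .next = &tail, .name = "head" };

    std.debug.assert(head.next.? == &tail);
    std.debug.assert(head.next.?.next == null);
}
```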
Case Study: print in Zig
Putting all of this together, let's see how print works in Zig.
Let's crack open the implementation of this and see how it works:
This is a proof of concept implementation; the actual function in the standard library has more formatting capabilities.
Note that this is not hard-coded into the Zig compiler; this is userland code in the standard library.
When this function is analyzed from our example code above, Zig partially evaluates the function and emits a function that actually looks like this:
printValue is a function that takes a parameter of any type, and does different things depending on the type:
And now, what happens if we give too many arguments to print?
Zig gives programmers the tools needed to protect themselves against their own mistakes.
Zig doesn't care whether the format argument is a string literal, only that it is a compile-time known value that can be coerced to a []const u8:
This works fine.
Zig does not special case string formatting in the compiler and instead exposes enough power to accomplish this task in userland. It does so without introducing another language on top of Zig, such as a macro language or a preprocessor language. It's Zig all the way down.
For some use cases, it may be necessary to directly control the machine code generated by Zig programs, rather than relying on Zig's code generation. For these cases, one can use inline assembly. Here is an example of implementing Hello, World on x86_64 Linux using inline assembly:
Dissecting the syntax:
For i386 and x86_64 targets, the syntax is AT&T syntax, rather than the more popular Intel syntax. This is due to technical constraints; assembly parsing is provided by LLVM and its support for Intel syntax is buggy and not well tested.
Some day Zig may have its own assembler. This would allow it to integrate more seamlessly into the language, as well as be compatible with the popular NASM syntax. This documentation section will be updated before 1.0.0 is released, with a conclusive statement about the status of AT&T vs Intel/NASM syntax.
Output Constraints
Output constraints are still considered to be unstable in Zig, and so LLVM documentation and GCC documentation must be used to understand the semantics.
Note that some breaking changes to output constraints are planned with issue #215.
Input Constraints
Input constraints are still considered to be unstable in Zig, and so LLVM documentation and GCC documentation must be used to understand the semantics.
Note that some breaking changes to input constraints are planned with issue #215.
Clobbers
Clobbers are the set of registers whose values will not be preserved by the execution of the assembly code. These do not include output or input registers. The special clobber value of "memory" means that the assembly causes writes to arbitrary undeclared memory locations - not only the memory pointed to by a declared indirect output.
Failure to declare the full set of clobbers for a given inline assembly expression is unchecked Undefined Behavior.
Global Assembly
When an assembly expression occurs in a container level comptime block, this is global assembly.
This kind of assembly has different rules than inline assembly. First, volatile is not valid because all global assembly is unconditionally included. Second, there are no inputs, outputs, or clobbers. All global assembly is concatenated verbatim into one long string and assembled together. There are no template substitution rules regarding % as there are in inline assembly expressions.
Atomics
TODO: @fence()
TODO: @atomic rmw
TODO: builtin atomic memory ordering enum
Async Functions
When a function is called, a frame is pushed to the stack, the function runs until it reaches a return statement, and then the frame is popped from the stack. The code following the callsite does not run until the function returns.
An async function is a function whose execution is split into an async initiation, followed by an await completion. Its frame is provided explicitly by the caller, and it can be suspended and resumed any number of times.
The code following the async callsite runs immediately after the async function first suspends. When the return value of the async function is needed, the calling code can await on the async function frame. This will suspend the calling code until the async function completes, at which point execution resumes just after the await callsite.
Zig infers that a function is async when it observes that the function contains a suspension point. Async functions can be called the same as normal functions. A function call of an async function is a suspend point.
Suspend and Resume
At any point, a function may suspend itself. This causes control flow to return to the callsite (in the case of the first suspension), or resumer (in the case of subsequent suspensions).
In the same way that each allocation should have a corresponding free, each suspend should have a corresponding resume. A suspend block allows a function to put a pointer to its own frame somewhere, for example into an event loop, even if that action will perform a resume operation on a different thread. @frame provides access to the async function frame pointer.
Upon entering a suspend block, the async function is already considered suspended, and can be resumed. For example, if you started another kernel thread, and had that thread call resume on the frame pointer provided by the @frame, the new thread would begin executing after the suspend block, while the old thread continued executing the suspend block.
However, the async function can be directly resumed from the suspend block, in which case it never returns to its resumer and continues executing.
This is guaranteed to tail call, and therefore will not cause a new stack frame.
Async and Await
In the same way that every suspend has a matching resume, every async has a matching await in standard code.
However, it is possible to have an async call without a matching await. Upon completion of the async function, execution would continue at the most recent async callsite or resume callsite, and the return value of the async function would be lost.
The await keyword is used to coordinate with an async function's return statement.
await is a suspend point, and takes as an operand anything that coerces to anyframe->T. Calling await on the frame of an async function will cause execution to continue at the await callsite once the target function completes.
There is a common misconception that await resumes the target function. It is the other way around: it suspends until the target function completes. In the event that the target function has already completed, await does not suspend; instead it copies the return value directly from the target function's frame.
In general, suspend is lower level than await. Most application code will use only async and await, but event loop implementations will make use of suspend internally.
Async Function Example
Putting all of this together, here is an example of typical async/await usage:
Now we remove the suspend and resume code, and observe the same behavior, with one tiny difference:
Previously, the fetchUrl and readFile functions suspended, and were resumed in an order determined by the main function. Now, since there are no suspend points, the order of the printed "... returning" messages is determined by the order of async callsites.
Builtin Functions
Builtin functions are provided by the compiler and are prefixed with @. The comptime keyword on a parameter means that the parameter must be known at compile time.
Performs result.* = a + b. If overflow or underflow occurs, stores the overflowed bits in result and returns true. If no overflow or underflow occurs, returns false.
This function returns the number of bytes that this type should be aligned to for the current target to match the C ABI. When the child type of a pointer has this alignment, the alignment can be omitted from the type.
Performs Type Coercion. This cast is allowed when the conversion is unambiguous and safe, and is the preferred way to convert between types, whenever possible.
@asyncCall performs an async call on a function pointer, which may or may not be an async function.
The provided frame_buffer must be large enough to fit the entire function frame. This size can be determined with @frameSize. Providing a buffer that is too small invokes safety-checked Undefined Behavior.
result_ptr is optional (null may be provided). If provided, the function call will write its result directly to the result pointer, which will be available to read after await completes. Any result location provided to await will copy the result from result_ptr.
@atomicLoad
@atomicLoad(comptime T: type, ptr: *const T, comptime ordering: builtin.AtomicOrder) T
This builtin function atomically dereferences a pointer and returns the value.
T must be a pointer, a bool, a float, an integer or an enum.
Asserts that @sizeOf(@TypeOf(value)) == @sizeOf(DestType).
Asserts that @typeInfo(DestType) != .Pointer. Use @ptrCast or @intToPtr if you need this.
For example, it can be used to:
Convert f32 to u32 bits
Convert i32 to u32 preserving two's complement
Works at compile-time if value is known at compile time. It's a compile error to bitcast a struct to a scalar type of the same size since structs have undefined layout. However if the struct is packed then it works.
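The two uses listed above can be sketched as follows, assuming the form @bitCast(DestType, value) implied by the assertions in this section:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@bitCast reinterprets bits without changing them" {
    // f32 1.0 has the IEEE 754 bit pattern 0x3F800000.
    const one: f32 = 1.0;
    try expect(@bitCast(u32, one) == 0x3F800000);

    // i32 -1 is all ones in two's complement.
    const minus_one: i32 = -1;
    try expect(@bitCast(u32, minus_one) == 0xFFFFFFFF);
}
```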
Returns the bit offset of a field relative to its containing struct.
For non-packed structs, this will always be divisible by 8. For packed structs, non-byte-aligned fields will share a byte offset, but they will have different bit offsets.
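A small sketch of bit offsets in a packed struct, assuming the builtin is @bitOffsetOf(T, field_name); the Header type is hypothetical:

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical packed layout: version and flags share the first byte.
const Header = packed struct {
    version: u3,
    flags: u5,
    length: u8,
};

test "fields sharing a byte have different bit offsets" {
    try expect(@bitOffsetOf(Header, "version") == 0);
    try expect(@bitOffsetOf(Header, "flags") == 3);
    try expect(@bitOffsetOf(Header, "length") == 8);
}
```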
Converts true to @as(u1, 1) and false to @as(u1, 0).
If the value is known at compile-time, the return type is comptime_int instead of u1.
@bitSizeOf
@bitSizeOf(comptime T: type) comptime_int
This function returns the number of bits it takes to store T in memory if the type were a field in a packed struct/union. The result is a target-specific compile time constant.
This function measures the size at runtime. For types that are disallowed at runtime, such as comptime_int and type, the result is 0.
Swaps the byte order of the integer. This converts a big endian integer to a little endian integer, and converts a little endian integer to a big endian integer.
Note that for the purposes of memory layout with respect to endianness, the integer type should be related to the number of bytes reported by @sizeOf. This is demonstrated with u24: @sizeOf(u24) == 4, which means that a u24 stored in memory takes 4 bytes, and those 4 bytes are what are swapped on a little vs big endian system. On the other hand, if T is specified to be u24, then only 3 bytes are reversed.
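A sketch of the byte reversal, including the u24 case mentioned above. It assumes the builtin is @byteSwap and takes the form @byteSwap(T, operand):

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@byteSwap reverses the bytes of the given integer type" {
    try expect(@byteSwap(u16, 0x1234) == 0x3412);
    try expect(@byteSwap(u32, 0x12345678) == 0x78563412);

    // With T = u24 only the three value bytes are reversed.
    try expect(@byteSwap(u24, 0x123456) == 0x563412);
}
```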
@bitReverse
@bitReverse(comptime T: type, integer: T) T
T accepts any integer type.
Reverses the bitpattern of an integer value, including the sign bit if applicable.
For example 0b10110110 (u8 = 182, i8 = -74) becomes 0b01101101 (u8 = 109, i8 = 109).
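The example values above can be checked directly:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@bitReverse on the documented bit pattern" {
    // Unsigned: 0b10110110 (182) becomes 0b01101101 (109).
    try expect(@bitReverse(u8, 0b10110110) == 0b01101101);

    // Signed: the same bits are -74 as an i8, and the reversed bits are 109.
    try expect(@bitReverse(i8, -74) == 109);
}
```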
This function parses C code and imports the functions, types, variables, and compatible macro definitions into a new empty struct type, and then returns that type.
expression is interpreted at compile time. The builtin functions @cInclude, @cDefine, and @cUndef work within this expression, appending to a temporary buffer which is then parsed as C code.
Usually you should only have one @cImport in your entire application, because it saves the compiler from invoking clang multiple times, and prevents inline functions from being duplicated.
Reasons for having multiple @cImport expressions would be:
To avoid a symbol collision, for example if foo.h and bar.h both #define CONNECTION_COUNT
To analyze the C code with different preprocessor defines
This function counts the number of most-significant (leading in a big-endian sense) zeroes in an integer.
If operand is a comptime-known integer, the return type is comptime_int. Otherwise, the return type is an unsigned integer or vector of unsigned integers with the minimum number of bits that can represent the bit count of the integer type.
If operand is zero, @clz returns the bit width of integer type T.
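A brief sketch of both cases, assuming the form @clz(T, operand) with a runtime-known operand:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@clz counts leading zero bits" {
    var x: u8 = 0b0001_0000;
    // Three zero bits precede the highest set bit.
    try expect(@clz(u8, x) == 3);

    // For zero, the result is the bit width of the type.
    x = 0;
    try expect(@clz(u8, x) == 8);
}
```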
This function performs a weak atomic compare exchange operation. It's the equivalent of this code, except atomic:
If you are using cmpxchg in a loop, the sporadic failure will be no problem, and cmpxchgWeak is the better choice, because it can be implemented more efficiently in machine instructions. However, if you need a stronger guarantee, use @cmpxchgStrong.
T must be a pointer, a bool, a float, an integer or an enum.
@typeInfo(@TypeOf(ptr)).Pointer.alignment must be >= @sizeOf(T).
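The retry loop mentioned above can be sketched as follows. The helper atomicIncrement is hypothetical, and the sketch assumes the form @cmpxchgWeak(T, ptr, expected_value, new_value, success_order, fail_order) returning ?T, where null means the exchange succeeded:

```zig
const std = @import("std");

// Hypothetical helper: atomically add one to the value behind ptr.
fn atomicIncrement(ptr: *u32) void {
    var expected = @atomicLoad(u32, ptr, .SeqCst);
    // On failure the builtin returns the value it actually found.
    // A weak exchange may also fail sporadically, so retry until it succeeds.
    while (@cmpxchgWeak(u32, ptr, expected, expected +% 1, .SeqCst, .SeqCst)) |actual| {
        expected = actual;
    }
}

test "compare-exchange loop increments the counter" {
    var counter: u32 = 41;
    atomicIncrement(&counter);
    try std.testing.expect(counter == 42);
}
```

Because the loop simply retries, a spurious failure costs one extra iteration and nothing else, which is why the weak variant is sufficient here.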
This function, when semantically analyzed, causes a compile error with the message msg.
There are several ways that code avoids being semantically checked, such as using if or switch with compile time constants, and comptime functions.
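A sketch of guarding a generic function with a compile error, assuming the builtin is @compileError(msg); requireInteger is a hypothetical helper. Because the if condition is known at compile time, the error is only analyzed for types that fail the check:

```zig
const std = @import("std");

// Hypothetical helper: reject non-integer types at compile time.
fn requireInteger(comptime T: type) void {
    if (@typeInfo(T) != .Int) {
        @compileError("expected an integer type, found " ++ @typeName(T));
    }
}

test "integer types pass the check" {
    comptime requireInteger(u8);
    comptime requireInteger(i64);
    // requireInteger(f32) would stop compilation with the message above.
}
```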
@compileLog
@compileLog(args: ...)
This function prints the arguments passed to it at compile-time.
To prevent accidentally leaving compile log statements in a codebase, a compilation error is added to the build, pointing to the compile log statement. This error prevents code from being generated, but does not otherwise interfere with analysis.
This function can be used to do "printf debugging" on compile-time executing code.
will output:
If all @compileLog calls are removed or not encountered by analysis, the program compiles successfully and the generated executable prints:
This function counts the number of least-significant (trailing in a big-endian sense) zeroes in an integer.
If operand is a comptime-known integer, the return type is comptime_int. Otherwise, the return type is an unsigned integer or vector of unsigned integers with the minimum number of bits that can represent the bit count of the integer type.
If operand is zero, @ctz returns the bit width of integer type T.
Floored division. Rounds toward negative infinity. For unsigned integers it is the same as numerator / denominator. Caller guarantees denominator != 0 and !(@typeInfo(T) == .Int and T.is_signed and numerator == std.math.minInt(T) and denominator == -1).
@divFloor(-5, 3) == -2
(@divFloor(a, b) * b) + @mod(a, b) == a
For a function that returns a possible error code, use @import("std").math.divFloor.
Truncated division. Rounds toward zero. For unsigned integers it is the same as numerator / denominator. Caller guarantees denominator != 0 and !(@typeInfo(T) == .Int and T.is_signed and numerator == std.math.minInt(T) and denominator == -1).
@divTrunc(-5, 3) == -1
(@divTrunc(a, b) * b) + @rem(a, b) == a
For a function that returns a possible error code, use @import("std").math.divTrunc.
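The two rounding modes and their identities can be compared side by side. The last check assumes std.math.divFloor(T, numerator, denominator) reports division by zero as error.DivisionByZero:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "floored versus truncated division" {
    const a: i32 = -5;
    const b: i32 = 3;

    // Floored: rounds toward negative infinity, pairs with @mod.
    try expect(@divFloor(a, b) == -2);
    try expect(@mod(a, b) == 1);
    try expect(@divFloor(a, b) * b + @mod(a, b) == a);

    // Truncated: rounds toward zero, pairs with @rem.
    try expect(@divTrunc(a, b) == -1);
    try expect(@rem(a, b) == -2);
    try expect(@divTrunc(a, b) * b + @rem(a, b) == a);
}

test "std.math.divFloor returns an error instead of requiring a guarantee" {
    try std.testing.expectError(error.DivisionByZero, std.math.divFloor(i32, 1, 0));
}
```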
This function returns a compile time constant pointer to null-terminated, fixed-size array with length equal to the byte count of the file given by path. The contents of the array are the contents of the file. This is equivalent to a string literal with the file contents.
path is absolute or relative to the current file, just like @import.
This function returns the string representation of an error. The string representation of error.OutOfMem is "OutOfMem".
If there are no calls to @errorName in an entire application, or all calls have a compile-time known value for err, then no error name table will be generated.
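A minimal sketch, assuming the builtin is @errorName(err); the describe helper is hypothetical and shows the runtime-known case that requires the error name table:

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical helper: err is only known at runtime here.
fn describe(err: anyerror) []const u8 {
    return @errorName(err);
}

test "@errorName returns the name of the error" {
    try expect(std.mem.eql(u8, @errorName(error.OutOfMem), "OutOfMem"));
    try expect(std.mem.eql(u8, describe(error.FileNotFound), "FileNotFound"));
}
```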
@errorReturnTrace
@errorReturnTrace() ?*builtin.StackTrace
If the binary is built with error return tracing, and this function is invoked in a function that calls a function with an error or error union return type, returns a stack trace object. Otherwise returns null.
Converts an error value from one error set to another error set. Attempting to convert an error which is not in the destination error set results in safety-protected Undefined Behavior.
This builtin can be called from a comptime block to conditionally export symbols. When declaration is a function with the C calling convention and options.linkage is Strong, this is equivalent to the export keyword used on a function:
This is equivalent to:
Note that even when using export, @"foo" syntax can be used to choose any string for the symbol name:
When looking at the resulting object, you can see the symbol is used verbatim:
00000000000001f0 T A function name that is a complete sentence.
This function returns a pointer to the frame for a given function. This type can be coerced to anyframe->T and to anyframe, where T is the return type of the function in scope.
This function does not mark a suspension point, but it does cause the function in scope to become an async function.
@Frame
@Frame(func: anytype) type
This function returns the frame type of a function. This works for Async Functions as well as any function without a specific calling convention.
This type is suitable to be used as the return type of async which allows one to, for example, heap-allocate an async function frame:
@frameAddress
@frameAddress() usize
This function returns the base pointer of the current stack frame.
The implications of this are target specific and not consistent across all platforms. The frame address may not be available in release mode due to aggressive optimizations.
This function is only valid within function scope.
@frameSize
@frameSize() usize
This is the same as @sizeOf(@Frame(func)), where func may be runtime-known.
This function is typically used in conjunction with @asyncCall.
This function finds a zig file corresponding to path and adds it to the build, if it is not already added.
Zig source files are implicitly structs, with a name equal to the file's basename with the extension truncated. @import returns the struct type corresponding to the file.
Declarations which have the pub keyword may be referenced from a different source file than the one they are declared in.
path can be a relative path or it can be the name of a package. If it is a relative path, it is relative to the file that contains the @import function call.
The following packages are always available:
@import("std") - Zig Standard Library
@import("builtin") - Target-specific information. The command zig build-exe --show-builtin outputs the source to stdout for reference.
Converts an integer to another integer while keeping the same numerical value. Attempting to convert a number which is out of range of the destination type results in safety-protected Undefined Behavior.
If T is comptime_int, then this is semantically equivalent to Type Coercion.
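A sketch of a narrowing conversion that keeps the numerical value, assuming the builtin is @intCast and takes the form @intCast(DestType, int):

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@intCast narrows while preserving the value" {
    var wide: u32 = 200;
    // 200 fits in a u8, so the cast is well defined.
    const narrow = @intCast(u8, wide);
    try expect(@TypeOf(narrow) == u8);
    try expect(narrow == 200);
    // Casting a value above 255 here would trip the safety check in safe builds.
}
```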
Converts an integer to a pointer. To convert the other way, use @ptrToInt.
If the destination pointer type does not allow address zero and address is zero, this invokes safety-checked Undefined Behavior.
@maximum
@maximum(a: T, b: T) T
Returns the maximum value of a and b. This builtin accepts integers, floats, and vectors of either. In the latter case, the operation is performed element wise.
NaNs are handled as follows: if one of the operands of a (pairwise) operation is NaN, the other operand is returned. If both operands are NaN, NaN is returned.
This function copies bytes from one region of memory to another. dest and source are both pointers and must not overlap.
This function is a low level intrinsic with no safety mechanisms. Most code should not use this function, instead using something like this:
for (source[0..byte_count]) |b, i| dest[i] = b;
The optimizer is intelligent enough to turn the above snippet into a memcpy.
There is also a standard library function for this:
const mem = @import("std").mem;
mem.copy(u8, dest[0..byte_count], source[0..byte_count]);
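Both alternatives can be exercised in a self-contained test. The buffers are illustrative, and the sketch uses the two-capture for loop syntax shown above:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "copying bytes without calling @memcpy directly" {
    const source = [_]u8{ 1, 2, 3, 4 };

    // Loop form: indexing is bounds checked in safe build modes.
    var looped = [_]u8{0} ** 4;
    for (source) |byte, i| looped[i] = byte;
    try expect(std.mem.eql(u8, looped[0..], source[0..]));

    // Standard library form: operates on slices, so the lengths travel with the data.
    var copied = [_]u8{0} ** 4;
    std.mem.copy(u8, copied[0..], source[0..]);
    try expect(std.mem.eql(u8, copied[0..], source[0..]));
}
```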
@memset
@memset(dest: [*]u8, c: u8, byte_count: usize)
This function sets a region of memory to c. dest is a pointer.
This function is a low level intrinsic with no safety mechanisms. Most code should not use this function, instead using something like this:
for (dest[0..byte_count]) |*b| b.* = c;
The optimizer is intelligent enough to turn the above snippet into a memset.
There is also a standard library function for this:
const mem = @import("std").mem;
mem.set(u8, dest[0..byte_count], c);
@minimum
@minimum(a: T, b: T) T
Returns the minimum value of a and b. This builtin accepts integers, floats, and vectors of either. In the latter case, the operation is performed element wise.
NaNs are handled as follows: if one of the operands of a (pairwise) operation is NaN, the other operand is returned. If both operands are NaN, NaN is returned.
This function returns the size of the Wasm memory identified by index as an unsigned value in units of Wasm pages. Note that each Wasm page is 64KB in size.
This function is a low level intrinsic with no safety mechanisms usually useful for allocator designers targeting Wasm. So unless you are writing a new allocator from scratch, you should use something like @import("std").heap.WasmPageAllocator.
This function increases the size of the Wasm memory identified by index by delta in units of unsigned number of Wasm pages. Note that each Wasm page is 64KB in size. On success, returns the previous memory size; if the allocation fails, returns -1.
This function is a low level intrinsic with no safety mechanisms usually useful for allocator designers targeting Wasm. So unless you are writing a new allocator from scratch, you should use something like @import("std").heap.WasmPageAllocator.
Performs result.* = a * b. If overflow or underflow occurs, stores the overflowed bits in result and returns true. If no overflow or underflow occurs, returns false.
@panic
@panic(message: []const u8) noreturn
Invokes the panic handler function. By default the panic handler function calls the public panic function exposed in the root source file, or if there is not one specified, the std.builtin.default_panic function from std/builtin.zig.
Generally it is better to use @import("std").debug.panic. However, @panic can be useful for 2 scenarios:
From library code, calling the programmer's panic function if they exposed one in the root source file.
When mixing C and Zig code, calling the canonical panic implementation across multiple .o files.
If operand is a comptime-known integer, the return type is comptime_int. Otherwise, the return type is an unsigned integer or vector of unsigned integers with the minimum number of bits that can represent the bit count of the integer type.
This builtin tells the compiler to emit a prefetch instruction if supported by the target CPU. If the target CPU does not support the requested prefetch instruction, this builtin is a noop. This function has no effect on the behavior of the program, only on the performance characteristics.
The ptr argument may be any pointer type and determines the memory address to prefetch. This function does not dereference the pointer, it is perfectly legal to pass a pointer to invalid memory to this function and no illegal behavior will result.
This function returns the address of the next machine code instruction that will be executed when the current function returns.
The implications of this are target specific and not consistent across all platforms.
This function is only valid within function scope. If the function gets inlined into a calling function, the returned address will apply to the calling function.
Sets the floating point mode of the current scope. Possible values are:
Strict (default) - Floating point operations follow strict IEEE compliance.
Optimized - Floating point operations may do all of the following:
Assume the arguments and result are not NaN. Optimizations are required to retain defined behavior over NaNs, but the value of the result is undefined.
Assume the arguments and result are not +/-Inf. Optimizations are required to retain defined behavior over +/-Inf, but the value of the result is undefined.
Treat the sign of a zero argument or result as insignificant.
Use the reciprocal of an argument rather than perform division.
Perform floating-point contraction (e.g. fusing a multiply followed by an addition into a fused multiply-and-add).
Perform algebraically equivalent transformations that may change results in floating point (e.g. reassociate).
This is equivalent to -ffast-math in GCC.
The floating point mode is inherited by child scopes, and can be overridden in any scope. You can set the floating point mode in a struct or module scope by using a comptime block.
Sets whether runtime safety checks are enabled for the scope that contains the function call.
Note: it is planned to replace @setRuntimeSafety with @optimizeFor
@shlExact
@shlExact(value: T, shift_amt: Log2T) T
Performs the left shift operation (<<). For unsigned integers, the result is undefined if any 1 bits are shifted out. For signed integers, the result is undefined if any bits that disagree with the resultant sign bit are shifted out.
The type of shift_amt is an unsigned integer with log2(T.bit_count) bits. This is because shift_amt >= T.bit_count is undefined behavior.
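A small sketch of an exact left shift where no 1 bits are lost. The std.math.Log2Int check illustrates the shift amount type described above and is an addition for illustration:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@shlExact when no set bits are shifted out" {
    const x: u8 = 0b0001_0101;
    try expect(@shlExact(x, 2) == 0b0101_0100);

    // For a u8 the shift amount is a u3, which can only hold 0 through 7.
    try expect(std.math.Log2Int(u8) == u3);
}
```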
Performs result.* = a << b. If overflow or underflow occurs, stores the overflowed bits in result and returns true. If no overflow or underflow occurs, returns false.
The type of shift_amt is an unsigned integer with log2(T.bit_count) bits. This is because shift_amt >= T.bit_count is undefined behavior.
Constructs a new vector by selecting elements from a and b based on mask.
Each element in mask selects an element from either a or b. Positive numbers select from a starting at 0. Negative values select from b, starting at -1 and going down. It is recommended to use the ~ operator for indexes from b so that both indexes can start from 0 (i.e. ~@as(i32, 0) is -1).
For each element of mask, if it or the selected value from a or b is undefined, then the resulting element is undefined.
a_len and b_len may differ in length. Out-of-bounds element indexes in mask result in compile errors.
If a or b is undefined, it is equivalent to a vector of all undefined with the same length as the other vector. If both vectors are undefined, @shuffle returns a vector with all elements undefined.
E must be an integer, float, pointer, or bool. The mask may be any vector length, and its length determines the result length.
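A sketch of the mask convention, assuming the form @shuffle(E, a, b, mask) and that vector types are spelled std.meta.Vector(len, E) in this release; the values are illustrative:

```zig
const std = @import("std");
const expect = std.testing.expect;
const Vector = std.meta.Vector;

test "@shuffle picks elements from two vectors" {
    var a: Vector(4, i32) = [4]i32{ 10, 20, 30, 40 };
    var b: Vector(4, i32) = [4]i32{ 1, 2, 3, 4 };

    // Non-negative entries index into a; ~i (that is, -1 - i) selects b[i].
    const mask: Vector(4, i32) = [4]i32{ 0, ~@as(i32, 2), 3, ~@as(i32, 3) };

    const res = @shuffle(i32, a, b, mask);
    const out: [4]i32 = res;

    // Selected elements: a[0], b[2], a[3], b[3].
    try expect(std.mem.eql(i32, &out, &[4]i32{ 10, 3, 40, 4 }));
}
```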
This function returns the number of bytes it takes to store T in memory. The result is a target-specific compile time constant.
This size may contain padding bytes. If there were two consecutive T in memory, this would be the offset in bytes between element at index 0 and the element at index 1. For integers, consider whether you want to use @sizeOf(T) or @typeInfo(T).Int.bits.
This function measures the size at runtime. For types that are disallowed at runtime, such as comptime_int and type, the result is 0.
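The difference between the byte size and the bit size of a type can be sketched as follows, assuming the builtin described here is @sizeOf(T):

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@sizeOf counts bytes, @bitSizeOf counts bits" {
    try expect(@sizeOf(u32) == 4);
    try expect(@bitSizeOf(u32) == 32);

    // A u7 needs 7 bits inside a packed struct but a whole byte on its own.
    try expect(@bitSizeOf(u7) == 7);
    try expect(@sizeOf(u7) == 1);

    // Types that cannot exist at runtime report a size of 0.
    try expect(@sizeOf(comptime_int) == 0);
}
```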
Note that .Add and .Mul reductions on integral types are wrapping; when applied on floating point types the operation associativity is preserved, unless the float mode is set to Optimized.
Performs result.* = a - b. If overflow or underflow occurs, stores the overflowed bits in result and returns true. If no overflow or underflow occurs, returns false.
@tagName
@tagName(value: anytype) [:0]const u8
Converts an enum value or union value to a string literal representing the name.
If the enum is non-exhaustive and the tag value does not map to a name, it invokes safety-checked Undefined Behavior.
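A minimal sketch with a hypothetical Color enum:

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical enum used only for this example.
const Color = enum { red, green, blue };

test "@tagName returns the field name as a string" {
    try expect(std.mem.eql(u8, @tagName(Color.green), "green"));

    var c = Color.blue;
    try expect(std.mem.eql(u8, @tagName(c), "blue"));
}
```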
@This
@This() type
Returns the innermost struct, enum, or union that this function call is inside. This can be useful for an anonymous struct that needs to refer to itself:
When @This() is used at file scope, it returns a reference to the struct that corresponds to the current file.
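The anonymous struct case can be sketched with a hypothetical generic container. The returned struct has no name of its own, so @This() is how its methods refer to it:

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical generic type, for illustration only.
fn View(comptime T: type) type {
    return struct {
        items: []const T,

        // The struct is anonymous, so @This() supplies a name for it.
        const Self = @This();

        fn count(self: Self) usize {
            return self.items.len;
        }
    };
}

test "a method takes the anonymous struct as its receiver" {
    const view = View(u8){ .items = "abc" };
    try expect(view.count() == 3);
}
```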
@truncate
@truncate(comptime T: type, integer: anytype) T
This function truncates bits from an integer type, resulting in a smaller or same-sized integer type.
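A short sketch using the signature above. Only the low bits that fit in the destination type are kept:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "@truncate keeps the least significant bits" {
    const wide: u16 = 0xABCD;
    // The high byte 0xAB is discarded.
    try expect(@truncate(u8, wide) == 0xCD);

    // Truncating to the same size leaves the value unchanged.
    try expect(@truncate(u16, wide) == 0xABCD);
}
```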
For structs, unions, enums, and error sets, the fields are guaranteed to be in the same order as declared. For declarations, the order is unspecified.
@typeName
@typeName(T: type) *const [N:0]u8
This function returns the string representation of a type, as an array. It is equivalent to a string literal of the type name.
@TypeOf
@TypeOf(...) type
@TypeOf is a special builtin function that takes any (nonzero) number of expressions as parameters and returns the type of the result, using Peer Type Resolution.
The expressions are evaluated, however they are guaranteed to have no runtime side-effects:
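A sketch of both properties; the increment helper is hypothetical. Its body would modify data if it ran, yet data is unchanged after @TypeOf:

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical helper with a visible side effect.
fn increment(comptime T: type, ptr: *T) T {
    ptr.* += 1;
    return ptr.*;
}

test "@TypeOf has no runtime side effects" {
    var data: i32 = 0;
    // Only the type of the call is needed, so increment does not run.
    const T = @TypeOf(increment(i32, &data));
    try expect(T == i32);
    try expect(data == 0);
}

test "several arguments are combined by peer type resolution" {
    try expect(@TypeOf(@as(u8, 1), @as(u16, 2)) == u16);
}
```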
@unionInit
@unionInit(comptime Union: type, comptime active_field_name: []const u8, init_expr) Union
This is the same thing as union initialization syntax, except that the field name is a comptime-known value rather than an identifier token.
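A minimal sketch with a hypothetical tagged union, where the field name is passed as a comptime-known string:

```zig
const std = @import("std");
const expect = std.testing.expect;

// Hypothetical tagged union used only for this example.
const Value = union(enum) {
    int: i32,
    float: f32,
};

test "@unionInit selects the field by name" {
    const v = @unionInit(Value, "int", 42);
    try expect(v == .int);
    try expect(v.int == 42);
}
```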
The overhead of Async Functions becomes equivalent to function call overhead.
The @import("builtin").single_threaded becomes true and therefore various userland APIs which read this variable become more efficient. For example std.Mutex becomes an empty data structure and all of its functions become no-ops.
Undefined Behavior
Zig has many instances of undefined behavior. If undefined behavior is detected at compile-time, Zig emits a compile error and refuses to continue. Most undefined behavior that cannot be detected at compile-time can be detected at runtime. In these cases, Zig has safety checks. Safety checks can be disabled on a per-block basis with @setRuntimeSafety. The ReleaseFast and ReleaseSmall build modes disable all safety checks (except where overridden by @setRuntimeSafety) in order to facilitate optimizations.
When a safety check fails, Zig crashes with a stack trace, like this:
Reaching Unreachable Code
At compile-time:
At runtime:
Index out of Bounds
At compile-time:
At runtime:
Cast Negative Number to Unsigned Integer
At compile-time:
At runtime:
To obtain the maximum value of an unsigned integer, use std.math.maxInt.
This happens when casting a pointer with the address 0 to a pointer which may not have the address 0. For example, C Pointers, Optional Pointers, and allowzero pointers allow address zero, but normal Pointers do not.
At compile-time:
At runtime:
Memory
The Zig language performs no memory management on behalf of the programmer. This is why Zig has no runtime, and why Zig code works seamlessly in so many environments, including real-time software, operating system kernels, embedded devices, and low latency servers. As a consequence, Zig programmers must always be able to answer the question:
Where are the bytes?
Like Zig, the C programming language has manual memory management. However, unlike Zig, C has a default allocator - malloc, realloc, and free. When linking against libc, Zig exposes this allocator with std.heap.c_allocator. However, by convention, there is no default allocator in Zig. Instead, functions which need to allocate accept an Allocator parameter. Likewise, data structures such as std.ArrayList accept an Allocator parameter in their initialization functions:
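A sketch of this convention follows. It assumes the Allocator interface is passed by value and obtained with fba.allocator(), as in the release that introduced @minimum and @maximum (earlier releases passed a pointer instead); the concat helper is illustrative:

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;
const expect = std.testing.expect;

// The caller chooses the allocator and owns the returned memory.
fn concat(allocator: Allocator, a: []const u8, b: []const u8) ![]u8 {
    const result = try allocator.alloc(u8, a.len + b.len);
    std.mem.copy(u8, result, a);
    std.mem.copy(u8, result[a.len..], b);
    return result;
}

test "using an allocator" {
    // 100 bytes on the stack back every allocation made through fba.
    var buffer: [100]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);
    const allocator = fba.allocator();

    const result = try concat(allocator, "foo", "bar");
    try expect(std.mem.eql(u8, "foobar", result));
}
```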
In the above example, 100 bytes of stack memory are used to initialize a FixedBufferAllocator, which is then passed to a function. As a convenience there is a global FixedBufferAllocator available for quick tests at std.testing.allocator, which will also perform basic leak detection.
Zig has a general purpose allocator available to be imported with std.heap.GeneralPurposeAllocator. However, it is still recommended to follow the Choosing an Allocator guide.
Choosing an Allocator
What allocator to use depends on a number of factors. Here is a flow chart to help you decide:
Are you making a library? In this case, best to accept an Allocator as a parameter and allow your library's users to decide what allocator to use.
Are you linking libc? In this case, std.heap.c_allocator is likely the right choice, at least for your main allocator.
Is the maximum number of bytes that you will need bounded by a number known at comptime? In this case, use std.heap.FixedBufferAllocator or std.heap.ThreadSafeFixedBufferAllocator depending on whether you need thread-safety or not.
Is your program a command line application which runs from start to end without any fundamental cyclical pattern (such as a video game main loop, or a web server request handler), such that it would make sense to free everything at once at the end? In this case, it is recommended to wrap a backing allocator in std.heap.ArenaAllocator and allocate everything from the arena. When using this kind of allocator, there is no need to free anything manually. Everything gets freed at once with the call to arena.deinit().
Are the allocations part of a cyclical pattern such as a video game main loop, or a web server request handler? If the allocations can all be freed at once, at the end of the cycle, for example once the video game frame has been fully rendered, or the web server request has been served, then std.heap.ArenaAllocator is a great candidate. As demonstrated in the previous bullet point, this allows you to free entire arenas at once. Note also that if an upper bound of memory can be established, then std.heap.FixedBufferAllocator can be used as a further optimization.
Are you writing a test, and you want to make sure error.OutOfMemory is handled correctly? In this case, use std.testing.FailingAllocator.
Are you writing a test? In this case, use std.testing.allocator.
Finally, if none of the above apply, you need a general purpose allocator. Zig's general purpose allocator is available as a function that takes a comptime struct of configuration options and returns a type. Generally, you will set up one std.heap.GeneralPurposeAllocator in your main function, and then pass it or sub-allocators around to various parts of your application.
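Two of the choices in the list above, the arena and the general purpose allocator, can be sketched as follows. This assumes the allocator() accessor of the by-value Allocator interface, and uses std.heap.page_allocator as an illustrative backing allocator:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "arena: everything is freed by one deinit call" {
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();
    const allocator = arena.allocator();

    // No individual free or destroy calls are needed.
    const bytes = try allocator.alloc(u8, 16);
    const number = try allocator.create(u32);
    number.* = 7;

    try expect(bytes.len == 16);
    try expect(number.* == 7);
}

test "general purpose allocator: configured with a comptime struct" {
    // .{} selects the default configuration options.
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const allocator = gpa.allocator();

    const bytes = try allocator.alloc(u8, 8);
    defer allocator.free(bytes);
    try expect(bytes.len == 8);
}
```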
String literals such as "foo" are in the global constant data section. This is why it is an error to pass a string literal to a mutable slice, like this:
However if you make the slice constant, then it works:
Just like string literals, const declarations, when the value is known at comptime, are stored in the global constant data section. Also Compile Time Variables are stored in the global constant data section.
var declarations inside functions are stored in the function's stack frame. Once a function returns, any Pointers to variables in the function's stack frame become invalid references, and dereferencing them becomes unchecked Undefined Behavior.
var declarations at the top level or in struct declarations are stored in the global data section.
The location of memory allocated with allocator.alloc or allocator.create is determined by the allocator's implementation.
TODO: thread local variables
Implementing an Allocator
Zig programmers can implement their own allocators by fulfilling the Allocator interface. In order to do this one must read carefully the documentation comments in std/mem.zig and then supply an allocFn and a resizeFn.
There are many example allocators to look at for inspiration. Look at std/heap.zig and std.heap.GeneralPurposeAllocator.
Heap Allocation Failure
Many programming languages choose to handle the possibility of heap allocation failure by unconditionally crashing. By convention, Zig programmers do not consider this to be a satisfactory solution. Instead, error.OutOfMemory represents heap allocation failure, and Zig libraries return this error code whenever heap allocation failure prevented an operation from completing successfully.
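A sketch of testing this convention. The makeBuffer helper is hypothetical, and the sketch assumes std.testing.FailingAllocator.init(backing_allocator, fail_index) with an allocator() accessor, where a fail_index of 0 makes the first allocation fail:

```zig
const std = @import("std");

// Hypothetical library function: propagates allocation failure to the caller.
fn makeBuffer(allocator: std.mem.Allocator, len: usize) ![]u8 {
    return allocator.alloc(u8, len);
}

test "allocation failure is returned as error.OutOfMemory" {
    var failing = std.testing.FailingAllocator.init(std.testing.allocator, 0);
    try std.testing.expectError(
        error.OutOfMemory,
        makeBuffer(failing.allocator(), 16),
    );
}
```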
Some have argued that because some operating systems such as Linux have memory overcommit enabled by default, it is pointless to handle heap allocation failure. There are many problems with this reasoning:
Only some operating systems have an overcommit feature.
Linux has it enabled by default, but it is configurable.
Windows does not overcommit.
Embedded systems do not have overcommit.
Hobby operating systems may or may not have overcommit.
For real-time systems, not only is there no overcommit, but typically the maximum amount of memory per application is determined ahead of time.
When writing a library, one of the main goals is code reuse. By making code handle allocation failure correctly, a library becomes eligible to be reused in more contexts.
Although some software has grown to depend on overcommit being enabled, its existence is the source of countless user experience disasters. When a system with overcommit enabled, such as Linux on default settings, comes close to memory exhaustion, the system locks up and becomes unusable. At this point, the OOM Killer selects an application to kill based on heuristics. This non-deterministic decision often results in an important process being killed, and often fails to return the system back to working order.
Recursion
Recursion is a fundamental tool in modeling software. However it has an often-overlooked problem: unbounded memory allocation.
The short summary is that currently recursion works normally as you would expect. Although Zig code is not yet protected from stack overflow, it is planned that a future version of Zig will provide such protection, with some degree of cooperation from Zig code required.
Lifetime and Ownership
It is the Zig programmer's responsibility to ensure that a pointer is not accessed when the memory pointed to is no longer available. Note that a slice is a form of pointer, in that it references other memory.
In order to prevent bugs, there are some helpful conventions to follow when dealing with pointers. In general, when a function returns a pointer, the documentation for the function should explain who "owns" the pointer. This concept helps the programmer decide when it is appropriate, if ever, to free the pointer.
For example, the function's documentation may say "caller owns the returned memory", in which case the code that calls the function must have a plan for when to free that memory. Probably in this situation, the function will accept an Allocator parameter.
Sometimes the lifetime of a pointer may be more complicated. For example, the std.ArrayList(T).items slice has a lifetime that remains valid until the next time the list is resized, such as by appending new elements.
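The std.ArrayList case can be sketched as follows. The point is to read list.items again after any operation that may resize the list rather than keeping an earlier copy of the slice:

```zig
const std = @import("std");
const expect = std.testing.expect;

test "re-read items after the list may have been resized" {
    var list = std.ArrayList(u8).init(std.testing.allocator);
    defer list.deinit();

    try list.append(1);
    const before = list.items; // valid only until the next resize
    try expect(before.len == 1);

    // Appending may reallocate, so `before` must not be used past this point.
    try list.appendSlice(&[_]u8{ 2, 3, 4, 5, 6, 7, 8, 9 });

    const after = list.items;
    try expect(after.len == 9);
    try expect(after[0] == 1 and after[8] == 9);
}
```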
The API documentation for functions and data structures should take great care to explain the ownership and lifetime semantics of pointers. Ownership determines whose responsibility it is to free the memory referenced by the pointer, and lifetime determines the point at which the memory becomes inaccessible (lest Undefined Behavior occur).
Compile Variables
Compile variables are accessible by importing the "builtin" package, which the compiler makes available to every Zig source file. It contains compile-time constants such as the current target, endianness, and release mode.
Example of what is imported with @import("builtin"):
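A few of these constants can be read as in the sketch below. The field names mode, cpu, and single_threaded are assumed to match this release, and safetyChecksByDefault is a hypothetical helper:

```zig
const std = @import("std");
const builtin = @import("builtin");
const expect = std.testing.expect;

// Hypothetical helper: true for the build modes that keep safety checks on.
fn safetyChecksByDefault() bool {
    return switch (builtin.mode) {
        .Debug, .ReleaseSafe => true,
        .ReleaseFast, .ReleaseSmall => false,
    };
}

test "reading compile variables" {
    // Endianness of the target CPU, known at compile time.
    const endian = builtin.cpu.arch.endian();
    try expect(endian == .Little or endian == .Big);

    try expect(@TypeOf(builtin.single_threaded) == bool);

    const expected = builtin.mode == .Debug or builtin.mode == .ReleaseSafe;
    try expect(safetyChecksByDefault() == expected);
}
```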
TODO: explain how root source file finds other files
TODO: pub fn main
TODO: pub fn panic
TODO: if linking with libc you can use export fn main
TODO: order independent top level declarations
TODO: lazy analysis
TODO: using comptime { _ = @import() }
Zig Build System
The Zig Build System provides a cross-platform, dependency-free way to declare the logic required to build a project. With this system, the logic to build a project is written in a build.zig file, using the Zig Build System API to declare and configure build artifacts and other tasks.
Some examples of tasks the build system can help with:
Creating build artifacts by executing the Zig compiler. This includes building Zig source code as well as C and C++ source code.
Capturing user-configured options and using those options to configure the build.
Surfacing build configuration as comptime values by providing a file that can be imported by Zig code.
Caching build artifacts to avoid unnecessarily repeating steps.
Executing build artifacts or system-installed tools.
Running tests and verifying the output of executing a build artifact matches the expected value.
Running zig fmt on a codebase or a subset of it.
Custom tasks.
To use the build system, run zig build --help to see a command-line usage help menu. This will include project-specific options that were declared in the build.zig script.
Building an Executable
This build.zig file is automatically generated by zig init-exe.
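For orientation, a build script of the kind zig init-exe generates looks roughly like the following. This is a sketch assuming the std.build.Builder API of this era, not the exact generated file; the artifact name example and the path src/main.zig are placeholders:

```zig
const std = @import("std");

pub fn build(b: *std.build.Builder) void {
    // Let the person running `zig build` choose the target and build mode.
    const target = b.standardTargetOptions(.{});
    const mode = b.standardReleaseOptions();

    // Declare the executable artifact and install it with the default step.
    const exe = b.addExecutable("example", "src/main.zig");
    exe.setTarget(target);
    exe.setBuildMode(mode);
    exe.install();

    // `zig build run` builds, installs, then runs the executable.
    const run_cmd = exe.run();
    run_cmd.step.dependOn(b.getInstallStep());
    if (b.args) |args| {
        run_cmd.addArgs(args);
    }

    const run_step = b.step("run", "Run the app");
    run_step.dependOn(&run_cmd.step);
}
```

Steps declared with b.step appear in the output of zig build --help alongside the standard options.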
Building a Library
This build.zig file is automatically generated by zig init-lib.
Although Zig is independent of C, and, unlike most other languages, does not depend on libc, Zig acknowledges the importance of interacting with existing C code.
There are a few ways that Zig facilitates C interop.
C Type Primitives
These have guaranteed C ABI compatibility and can be used like any other type.
The @cImport builtin function can be used to directly import symbols from .h files:
The @cImport function takes an expression as a parameter. This expression is evaluated at compile-time and is used to control preprocessor directives and include multiple .h files:
Zig's C translation capability is available as a CLI tool via zig translate-c. It requires a single filename as an argument. It may also take a set of optional flags that are forwarded to clang. It writes the translated file to stdout.
-I: Specify a search directory for include files. May be used multiple times. Equivalent to clang's -I flag. The current directory is not included by default; use -I. to include it.
Important! When translating C code with zig translate-c, you must use the same -target triple that you will use when compiling the translated code. In addition, you must ensure that the -cflags used, if any, match the cflags used by code on the target system. Using the incorrect -target or -cflags could result in clang or Zig parse failures, or subtle ABI incompatibilities when linking with C code.
@cImport and zig translate-c use the same underlying C translation functionality, so on a technical level they are equivalent. In practice, @cImport is useful as a way to quickly and easily access numeric constants, typedefs, and record types without needing any extra setup. If you need to pass cflags to clang, or if you would like to edit the translated code, it is recommended to use zig translate-c and save the results to a file. Common reasons for editing the generated code include: changing anytype parameters in function-like macros to more specific types; changing [*c]T pointers to [*]T or *T pointers for improved type safety; and enabling or disabling runtime safety within specific functions.
The C translation feature (whether used via zig translate-c or @cImport) integrates with the Zig caching system. Subsequent runs with the same source file, target, and cflags will use the cache instead of repeatedly translating the same code.
To see where the cached files are stored when compiling code that uses @cImport, use the --verbose-cimport flag:
cimport.h contains the file to translate (constructed from calls to @cInclude, @cDefine, and @cUndef), cimport.h.d is the list of file dependencies, and cimport.zig contains the translated output.
Some C constructs cannot be translated to Zig - for example, goto, structs with bitfields, and token-pasting macros. Zig employs demotion to allow translation to continue in the face of non-translatable entities.
Demotion comes in three varieties - opaque, extern, and @compileError. C structs and unions that cannot be translated correctly will be translated as opaque{}. Functions that contain opaque types or code constructs that cannot be translated will be demoted to extern declarations. Thus, non-translatable types can still be used as pointers, and non-translatable functions can be called so long as the linker is aware of the compiled function.
@compileError is used when top-level definitions (global variables, function prototypes, macros) cannot be translated or demoted. Since Zig uses lazy analysis for top-level declarations, untranslatable entities will not cause a compile error in your code unless you actually use them.
C Translation makes a best-effort attempt to translate function-like macros into equivalent Zig functions. Since C macros operate at the level of lexical tokens, not all C macros can be translated to Zig. Macros that cannot be translated will be demoted to @compileError. Note that C code which uses macros will be translated without any additional issues (since Zig operates on the pre-processed source with macros expanded). It is merely the macros themselves which may not be translatable to Zig.
Consider the following example:
Note that foo was translated correctly despite using a non-translatable macro. MAKELOCAL was demoted to @compileError since it cannot be expressed as a Zig function; this simply means that you cannot directly use MAKELOCAL from Zig.
The C pointer type, [*c]T, is to be avoided whenever possible. The only valid reason for using a C pointer is in auto-generated code from translating C code.
When importing C header files, it is ambiguous whether pointers should be translated as single-item pointers (*T) or many-item pointers ([*]T). C pointers are a compromise so that Zig code can utilize translated header files directly.
[*c]T - C pointer.
Supports all the syntax of the other two pointer types.
Coerces to other pointer types, as well as Optional Pointers. When a C pointer is coerced to a non-optional pointer, safety-checked Undefined Behavior occurs if the address is 0.
Allows address 0. On non-freestanding targets, dereferencing address 0 is safety-checked Undefined Behavior. Optional C pointers introduce another bit to keep track of null, just like ?usize. Note that creating an optional C pointer is unnecessary as one can use normal Optional Pointers.
Does not support Zig-only pointer attributes such as alignment. Use normal Pointers please!
When a C pointer is pointing to a single struct (not an array), dereference the C pointer to access the struct's fields or member data. That syntax looks like this:
ptr_to_struct.*.struct_member
This is comparable to doing -> in C.
When a C pointer is pointing to an array of structs, the syntax reverts to this:
ptr_to_struct_array[index].struct_member
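The two access forms above can be sketched as a small test. The Point struct is a hypothetical stand-in for a type that would normally come from a translated header; the example also shows a C pointer coercing to an optional pointer.

```zig
const std = @import("std");

// Hypothetical C-compatible struct, standing in for a type from a translated header.
const Point = extern struct {
    x: c_int,
    y: c_int,
};

test "accessing struct fields through C pointers" {
    const single = Point{ .x = 1, .y = 2 };
    const many = [_]Point{
        .{ .x = 10, .y = 20 },
        .{ .x = 30, .y = 40 },
    };

    // A single-item pointer coerces to a C pointer; dereference it to reach a field.
    const ptr_to_struct: [*c]const Point = &single;
    try std.testing.expect(ptr_to_struct.*.x == 1);

    // A pointer to an array also coerces to a C pointer, which can be indexed.
    const ptr_to_struct_array: [*c]const Point = &many;
    try std.testing.expect(ptr_to_struct_array[1].y == 40);

    // A C pointer coerces to an optional pointer; address 0 would become null.
    const maybe: ?*const Point = ptr_to_struct;
    try std.testing.expect(maybe != null);
}
```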
Exporting a C Library
One of the primary use cases for Zig is exporting a library with the C ABI for other programming languages to call into. The export keyword in front of functions, variables, and types causes them to be part of the library API:
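A minimal sketch of an exported function follows. The function name add is illustrative; the commands for building the library and linking it from C are not shown here.

```zig
const std = @import("std");

// `export` gives the function the C ABI and makes its symbol part of the library API.
export fn add(a: i32, b: i32) i32 {
    return a + b;
}

test "exported functions remain callable from Zig" {
    try std.testing.expect(add(2, 3) == 5);
}
```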
Zig supports building for WebAssembly out of the box.
Freestanding
For host environments like the web browser and nodejs, build as a dynamic library using the freestanding OS target. Here's an example of running Zig code compiled to WebAssembly with nodejs.
WASI
Zig's support for WebAssembly System Interface (WASI) is under active development. Example of using the standard library and reading command line arguments:
A more interesting example would be extracting the list of preopens from the runtime. This is now supported in the standard library via std.fs.wasi.PreopenList:
Targets
Zig supports generating code for all targets that LLVM supports. Here is what it looks like to execute zig targets on a Linux x86_64 computer:
The Zig Standard Library (@import("std")) has architecture, environment, and operating system abstractions, and thus takes additional work to support more platforms. Not all standard library code requires operating system abstractions, however, so things such as generic data structures work on all of the above platforms.
The current list of targets supported by the Zig Standard Library is:
Linux x86_64
Windows x86_64
macOS x86_64
Style Guide
These coding conventions are not enforced by the compiler, but they are shipped in this documentation along with the compiler in order to provide a point of reference, should anyone wish to point to an authority on agreed-upon Zig coding style.
Whitespace
4 space indentation
Open braces on same line, unless you need to wrap.
If a list of things is longer than 2, put each item on its own line and exercise the ability to put an extra comma at the end.
Line length: aim for 100; use common sense.
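As an illustration of the whitespace conventions above, here is a short sketch. The names primes and sum are made up for the example.

```zig
const std = @import("std");

// More than two items: one item per line, with a trailing comma.
const primes = [_]u32{
    2,
    3,
    5,
    7,
};

// The opening brace stays on the same line; bodies are indented by 4 spaces.
fn sum(values: []const u32) u32 {
    var total: u32 = 0;
    for (values) |value| {
        total += value;
    }
    return total;
}

test "sum of the listed primes" {
    try std.testing.expect(sum(&primes) == 17);
}
```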
Names
Roughly speaking: camelCaseFunctionName, TitleCaseTypeName, snake_case_variable_name. More precisely:
If x is a type then x should be TitleCase, unless it is a struct with 0 fields and is never meant to be instantiated, in which case it is considered to be a "namespace" and uses snake_case.
If x is callable, and x's return type is type, then x should be TitleCase.
If x is otherwise callable, then x should be camelCase.
Otherwise, x should be snake_case.
Acronyms, initialisms, proper nouns, or any other word that has capitalization rules in written English are subject to naming conventions just like any other word. Even acronyms that are only 2 letters long are subject to these conventions.
File names fall into two categories: types and namespaces. If the file (implicitly a struct) has top level fields, it should be named like any other struct with fields using TitleCase. Otherwise, it should use snake_case. Directory names should be snake_case.
These are general rules of thumb; if it makes sense to do something different, do what makes sense. For example, if there is an established convention such as ENOENT, follow the established convention.
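The naming rules can be sketched in one short file. Every identifier below is hypothetical and exists only to show which case each kind of declaration takes.

```zig
const std = @import("std");

// A type with fields: TitleCase. The acronym HTTP follows the same rule.
const HttpRequest = struct {
    status_code: u16, // field: snake_case
};

// A struct with 0 fields that is never instantiated: a namespace, so snake_case.
const math_helpers = struct {
    // Callable whose return type is `type`: TitleCase.
    fn Pair(comptime T: type) type {
        return struct { first: T, second: T };
    }

    // Any other callable: camelCase.
    fn addOne(value: u32) u32 {
        return value + 1;
    }
};

// Everything else: snake_case.
const default_status_code: u16 = 200;

test "naming examples compile and behave as expected" {
    const request = HttpRequest{ .status_code = default_status_code };
    const PairOfU32 = math_helpers.Pair(u32);
    const pair = PairOfU32{ .first = 1, .second = math_helpers.addOne(1) };
    try std.testing.expect(request.status_code == 200);
    try std.testing.expect(pair.second == 2);
}
```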
Doc Comment Guidance
Omit any information that is redundant based on the name of the thing being documented.
Duplicating information onto multiple similar functions is encouraged because it helps IDEs and other tools provide better help text.
Use the word assume to indicate invariants that cause Undefined Behavior when violated.
Use the word assert to indicate invariants that cause safety-checked Undefined Behavior when violated.
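A brief sketch of a doc comment that uses the word assert in this sense follows. The function byteAt is hypothetical; std.debug.assert is used here to make the invariant explicit.

```zig
const std = @import("std");

/// Returns the byte at `index`.
/// Asserts that `index` is less than `bytes.len`.
fn byteAt(bytes: []const u8, index: usize) u8 {
    std.debug.assert(index < bytes.len);
    return bytes[index];
}

test "byteAt returns the requested byte" {
    try std.testing.expect(byteAt("zig", 1) == 'i');
}
```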
Source Encoding
Zig source code is encoded in UTF-8. An invalid UTF-8 byte sequence results in a compile error.
Throughout all Zig source code (including in comments), some code points are never allowed:
ASCII control characters, except for U+000a (LF), U+000d (CR), and U+0009 (HT): U+0000 - U+0008, U+000b - U+000c, U+000e - U+001f, U+007f.
Non-ASCII Unicode line endings: U+0085 (NEL), U+2028 (LS), U+2029 (PS).
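The list of disallowed code points can be expressed as a single switch. This is only a sketch: isForbiddenCodePoint is a hypothetical helper, not part of the Zig standard library, and it checks individual code points only, so it does not enforce the separate rule about where CR may appear.

```zig
const std = @import("std");

// Hypothetical helper for illustration; not a standard library function.
fn isForbiddenCodePoint(code_point: u21) bool {
    return switch (code_point) {
        // ASCII control characters other than HT (0x09), LF (0x0a), and CR (0x0d).
        0x00...0x08, 0x0b...0x0c, 0x0e...0x1f, 0x7f => true,
        // Non-ASCII Unicode line endings: NEL, LS, PS.
        0x85, 0x2028, 0x2029 => true,
        else => false,
    };
}

test "forbidden and permitted code points" {
    try std.testing.expect(isForbiddenCodePoint(0x00));
    try std.testing.expect(isForbiddenCodePoint(0x2028));
    try std.testing.expect(!isForbiddenCodePoint('\n'));
    try std.testing.expect(!isForbiddenCodePoint('\t'));
    try std.testing.expect(!isForbiddenCodePoint('a'));
}
```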
LF (byte value 0x0a, code point U+000a, '\n') is the line terminator in Zig source code. This byte value terminates every line of Zig source code except the last line of the file. It is recommended that non-empty source files end with an empty line, which means the last byte would be 0x0a (LF).
Each LF may be immediately preceded by a single CR (byte value 0x0d, code point U+000d, '\r') to form a Windows style line ending, but this is discouraged. A CR in any other context is not allowed.
HT hard tabs (byte value 0x09, code point U+0009, '\t') are interchangeable with SP spaces (byte value 0x20, code point U+0020, ' ') as a token separator, but use of hard tabs is discouraged. See Grammar.
Note that running zig fmt on a source file will implement all recommendations mentioned here. Note also that the stage1 compiler does not yet support CR or HT control characters.
Note that a tool reading Zig source code can make assumptions if the source code is assumed to be correct Zig code. For example, when identifying the ends of lines, a tool can use a naive search such as /\n/, or an advanced search such as /\r\n?|[\n\u0085\u2028\u2029]/, and in either case line endings will be correctly identified. For another example, when identifying the whitespace before the first token on a line, a tool can either use a naive search such as /[ \t]/, or an advanced search such as /\s/, and in either case whitespace will be correctly identified.
Keyword Reference
Keywords
Keyword
Description
align
align can be used to specify the alignment of a pointer. It can also be used after a variable or function declaration to specify the alignment of pointers to that variable or function.
Function parameters and struct fields can be declared with anytype in place of the type. The type will be inferred where the function is called or the struct is instantiated.
await can be used to suspend the current function until the frame provided after the await completes. await copies the value returned from the target function's frame to the caller.
catch can be used to evaluate an expression if the expression before it evaluates to an error. The expression after the catch can optionally capture the error value.
comptime before a declaration can be used to label variables or function parameters as known at compile time. It can also be used to guarantee an expression is run at compile time.
extern can be used to declare a function or variable that will be resolved at link time when linking statically, or at runtime when linking dynamically.
An if expression can test boolean expressions, optional values, or error unions. For optional values or error unions, the if expression can capture the unwrapped value.
inline can be used to label a loop expression such that it will be unrolled at compile time. It can also be used to force a function to be inlined at all call sites.
The nosuspend keyword can be used in front of a block, statement or expression, to mark a scope where no suspension points are reached. In particular, inside a nosuspend scope:
Using the suspend keyword results in a compile error.
Using await on a function frame which hasn't completed yet results in safety-checked Undefined Behavior.
Calling an async function may result in safety-checked Undefined Behavior, because it's equivalent to await async some_async_fn(), which contains an await.
Code inside a nosuspend scope does not cause the enclosing function to become an async function.
suspend will cause control flow to return to the call site or resumer of the function. suspend can also be used before a block within a function, to allow the function access to its frame before control flow returns to the call site.
try evaluates an error union expression. If it is an error, it returns from the current function with the same error. Otherwise, the expression results in the unwrapped value.
unreachable can be used to assert that control flow will never happen upon a particular location. Depending on the build mode, unreachable may emit a panic.
Emits a panic in Debug and ReleaseSafe mode, or when using zig test.
Does not emit a panic in ReleaseFast mode, unless zig test is being used.
usingnamespace is a top-level declaration that imports all the public declarations of the operand, which must be a struct, union, or enum, into the current scope.
volatile can be used to denote that loads or stores of a pointer have side effects. It can also modify an inline assembly expression to denote that it has side effects.
A while expression can be used to repeatedly test a boolean, optional, or error union expression, and cease looping when that expression evaluates to false, null, or an error, respectively.
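Several of the keywords described above (try, catch, if, and while) are easiest to follow side by side. The sketch below uses hypothetical helper functions; none of them come from the standard library.

```zig
const std = @import("std");

// Hypothetical helper: fails on an empty slice, otherwise returns the first byte.
fn firstByte(bytes: []const u8) error{Empty}!u8 {
    if (bytes.len == 0) return error.Empty;
    return bytes[0];
}

// try: return the error to the caller, or continue with the unwrapped value.
fn doubledFirstByte(bytes: []const u8) error{Empty}!u16 {
    const byte = try firstByte(bytes);
    return @as(u16, byte) * 2;
}

// catch: supply a fallback value when the expression evaluates to an error.
fn firstByteOr(bytes: []const u8, fallback: u8) u8 {
    return firstByte(bytes) catch fallback;
}

// if with an error union: capture either the unwrapped value or the error.
fn describe(bytes: []const u8) []const u8 {
    if (firstByte(bytes)) |byte| {
        if (byte == 'z') return "starts with z";
        return "starts with another byte";
    } else |err| {
        return @errorName(err);
    }
}

// while with an optional: stop looping when the expression evaluates to null.
fn sumUntilNull(items: []const ?u32) u32 {
    var total: u32 = 0;
    var index: usize = 0;
    while (if (index < items.len) items[index] else null) |item| : (index += 1) {
        total += item;
    }
    return total;
}

test "try, catch, if, and while with errors and optionals" {
    try std.testing.expect((try doubledFirstByte("zig")) == 244);
    try std.testing.expectError(error.Empty, doubledFirstByte(""));
    try std.testing.expect(firstByteOr("", '?') == '?');
    try std.testing.expect(std.mem.eql(u8, describe(""), "Empty"));
    try std.testing.expect(sumUntilNull(&[_]?u32{ 1, 2, null, 4 }) == 3);
}
```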