I figured since I ask people to donate monthly, I will start giving a monthly progress report to provide accountability.
So here's everything that happened in Zig land in December 2017:
You can now specify the tag type for an enum (#305):
const Small2 = enum (u2) {
    One,
    Two,
};
If you specify the tag type for an enum, you can put it in a packed struct:
const A = enum (u3) {
    One,
    Two,
    Three,
    Four,
    One2,
    Two2,
    Three2,
    Four2,
};
const B = enum (u3) {
    One3,
    Two3,
    Three3,
    Four3,
    One23,
    Two23,
    Three23,
    Four23,
};
const C = enum (u2) {
    One4,
    Two4,
    Three4,
    Four4,
};
const BitFieldOfEnums = packed struct {
    a: A,
    b: B,
    c: C,
};
const bit_field_1 = BitFieldOfEnums {
    .a = A.Two,
    .b = B.Three3,
    .c = C.Four4,
};
You can no longer cast from an enum to an arbitrary integer. Instead you must cast to the enum tag type and vice versa:
const Small2 = enum (u2) {
    One,
    Two,
};
test "casting enum to its tag type" {
    testCastEnumToTagType(Small2.Two);
}
fn testCastEnumToTagType(value: Small2) {
    assert(u2(value) == 1);
}
Now you can set the tag values of enums:
const MultipleChoice = enum(u32) {
    A = 20,
    B = 40,
    C = 60,
    D = 1000,
};
Related issue: #618
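Combining this with the cast rule above, the explicit values are what you get back when you cast to the tag type. A minimal sketch, assuming assert comes from std.debug as in the other tests in this post (checkChoice is a name made up for illustration):

```zig
const assert = @import("std").debug.assert;

const MultipleChoice = enum(u32) {
    A = 20,
    B = 40,
    C = 60,
    D = 1000,
};

test "explicit tag values are visible through a cast" {
    checkChoice(MultipleChoice.C);
}

// Hypothetical helper: casts the enum to its tag type (u32) and
// checks that the value assigned in the declaration comes back.
fn checkChoice(x: MultipleChoice) {
    assert(u32(x) == 60);
}
```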
Enums are now a simple mapping between a symbol and a number. They can no longer contain payloads.
Unions have been upgraded and can now accept an enum as an argument:
const TheTag = enum {A, B, C};
const TheUnion = union(TheTag) { A: i32, B: i32, C: i32 };
test "union field access gives the enum values" {
    assert(TheUnion.A == TheTag.A);
    assert(TheUnion.B == TheTag.B);
    assert(TheUnion.C == TheTag.C);
}
If you want to auto-create an enum for a union, you can use the enum keyword like this:
const TheUnion2 = union(enum) {
    Item1,
    Item2: i32,
};
You can switch on a union-enum just like you could previously with an enum:
const SwitchProngWithVarEnum = union(enum) {
    One: i32,
    Two: f32,
    Meh: void,
};
fn switchProngWithVarFn(a: &const SwitchProngWithVarEnum) {
    switch(*a) {
        SwitchProngWithVarEnum.One => |x| {
            assert(x == 13);
        },
        SwitchProngWithVarEnum.Two => |x| {
            assert(x == 13.0);
        },
        SwitchProngWithVarEnum.Meh => |x| {
            const v: void = x;
        },
    }
}
However, if you do not give an enum to a union, the tag value is not visible to the programmer:
const Payload = union {
    A: i32,
    B: f64,
    C: bool,
};
export fn entry() {
    const a = Payload { .A = 1234 };
    foo(a);
}
fn foo(a: &const Payload) {
    switch (*a) {
        Payload.A => {},
        else => unreachable,
    }
}
test.zig:11:13: error: switch on union which has no attached enum
    switch (*a) {
            ^
test.zig:1:17: note: consider 'union(enum)' here
const Payload = union {
                ^
test.zig:12:16: error: container 'Payload' has no member called 'A'
        Payload.A => {},
               ^
There is still debug safety though!
const Foo = union {
    float: f32,
    int: u32,
};

pub fn main() -> %void {
    var f = Foo { .int = 42 };
    bar(&f);
}

fn bar(f: &Foo) {
    f.float = 12.34;
}
access of inactive union field
lib/zig/std/special/panic.zig:12:35: 0x0000000000203674 in ??? (test)
        @import("std").debug.panic("{}", msg);
                                  ^
test.zig:12:6: 0x0000000000217bd7 in ??? (test)
    f.float = 12.34;
     ^
test.zig:8:8: 0x0000000000217b7c in ??? (test)
    bar(&f);
       ^
Aborted
However, if you make an extern union to be compatible with C code, there is no debug safety, just like a C union.
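For contrast, here is a rough sketch of the same program as the Foo example, but with the union declared extern. The names Bits and reinterpret are made up for illustration. Based on the behavior described above, the write to the inactive field should go through without a panic, because no hidden tag is stored:

```zig
// Sketch only: mirrors the Foo example, but with an extern union.
const Bits = extern union {
    float: f32,
    int: u32,
};

pub fn main() -> %void {
    var b = Bits { .int = 42 };
    reinterpret(&b);
}

// Hypothetical helper: writes a different field than the one that was
// initialized. With an extern union there is no safety check here, so
// this behaves like writing to a C union member.
fn reinterpret(b: &Bits) {
    b.float = 12.34;
}
```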
Other tidbits:
@enumTagName is renamed to @tagName.
@EnumTagType is renamed to @TagType, and it works on both enums and union-enums.
EnumTag type
test "cast tag type of union to union" {
    var x: Value2 = Letter2.B;
    assert(Letter2(x) == Letter2.B);
}
const Letter2 = enum { A, B, C };
const Value2 = union(Letter2) { A: i32, B, C, };
test "implicit cast union to its tag type" {
    var x: Value2 = Letter2.B;
    assert(x == Letter2.B);
    giveMeLetterB(x);
}
fn giveMeLetterB(x: Letter2) {
    assert(x == Value2.B);
}
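To illustrate the two renamed builtins from the tidbits above, here is a small sketch. It assumes @tagName returns the field name as a byte slice and that @TagType yields a type that can be compared at compile time. The declarations are repeated so the snippet stands alone:

```zig
const std = @import("std");
const assert = std.debug.assert;
const mem = std.mem;

const Small2 = enum (u2) { One, Two };
const Letter2 = enum { A, B, C };
const Value2 = union(Letter2) { A: i32, B, C, };

test "tag name and tag type" {
    // @tagName (formerly @enumTagName) gives the name of the active tag.
    assert(mem.eql(u8, @tagName(Letter2.B), "B"));
    // @TagType (formerly @EnumTagType) on an enum gives its integer tag type.
    comptime assert(@TagType(Small2) == u2);
    // On a union-enum it gives the attached enum.
    comptime assert(@TagType(Value2) == Letter2);
}
```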
We have a fork of LLD in the zig project because of several upstream issues, all of which I have filed bugs for.
When LLVM 6.0.0 comes out, Zig will have to keep its fork because of the one issue, but we can drop all the other patches since they have been accepted upstream.
The self-hosted compiler effort has begun.
So far we have a tokenizer, and an incomplete parser and formatter. The code uses no recursion and therefore has compile-time known stack space usage. See #157
The self-hosted compiler works on every supported platform, is built using the zig build system, tested with zig test, links against LLVM, and can import 100% of the LLVM symbols from the LLVM C-API .h files - even the inline functions.
There is one C++ file in Zig which uses the more powerful LLVM C++ API (for example to create debug information) and exposes a C API. This file is now shared between the C++ compiler and the self-hosted compiler. In stage1, we create a static library with this one file in it, and then use that library in both the C++ compiler and the self-hosted compiler.
It's really a shame that Windows command line parsing requires you to allocate memory. This means that to have a cross-platform API for command line arguments, even though in POSIX it can never fail, we have to handle the possibility of failure because of Windows. This led to a command line args API like this:
pub fn main() -> %void {
    var arg_it = os.args();
    // skip my own exe name
    _ = arg_it.skip();
    while (arg_it.next(allocator)) |err_or_arg| {
        const arg = %return err_or_arg;
        defer allocator.free(arg);
        // use the arg...
    }
}
Yikes, a bit cumbersome. I added a higher level API. Now you can call std.os.argsAlloc and get a %[]const []u8, and you just have to call std.os.argsFree when you're done with it.
pub fn main() -> %void {
    const allocator = std.heap.c_allocator;
    const args = %return os.argsAlloc(allocator);
    defer os.argsFree(allocator, args);
    var arg_i: usize = 1;
    while (arg_i < args.len) : (arg_i += 1) {
        const arg = args[arg_i];
        // do something with arg...
    }
}
Better! Single point of failure.
For now this uses the other API under the hood, but it could be reimplemented with the same API to do a single allocation.
I added a new kind of test to make sure command line argument parsing works.
#define NRF_GPIO ((NRF_GPIO_Type *) NRF_GPIO_BASE)
Zig now understands this C macro.
Added aligned functions to the Allocator interface.
mem.Allocator initializes bytes to undefined. This does nothing in ReleaseFast mode. In Debug and ReleaseSafe modes, it initializes bytes to 0xaa, which helps catch memory errors.
Added mem.FixedBufferAllocator.
Added std.os.ChildProcess.exec for when you want to spawn a child process, wait for it to complete, and then capture the standard output into a buffer.
pub fn exec(self: &Builder, argv: []const []const u8) -> []u8 {
    const max_output_size = 100 * 1024;
    const result = os.ChildProcess.exec(self.allocator, argv, null, null, max_output_size) %% |err| {
        std.debug.panic("Unable to spawn {}: {}", argv[0], @errorName(err));
    };
    switch (result.term) {
        os.ChildProcess.Term.Exited => |code| {
            if (code != 0) {
                warn("The following command exited with error code {}:\n", code);
                printCmd(null, argv);
                warn("stderr:{}\n", result.stderr);
                std.debug.panic("command failed");
            }
            return result.stdout;
        },
        else => {
            warn("The following command terminated unexpectedly:\n");
            printCmd(null, argv);
            warn("stderr:{}\n", result.stderr);
            std.debug.panic("command failed");
        },
    }
}
Hejsil pointed out that the quicksort implementation in the standard library failed a simple test case.
There was another problem with the implementation of sort in the standard library, which is that it used O(n) stack space via recursion. This is fundamentally insecure, especially if you consider that the length of an array you might want to sort could be user input. It prevents #157 from working as well.
I had a look at Wikipedia's Comparison of Sorting Algorithms and only 1 sorting algorithm checked all the boxes:
O(n) best case complexity (adaptive sort)
O(n * log(n)) average case complexity
O(n * log(n)) worst case complexity
O(1) memory
And that algorithm is Block sort.
I found a high quality implementation of block sort in C, which has been released into the public domain.
I ported the code from C to Zig, integrated it into the standard library, and it passed all tests first try. Amazing.
Surely, I thought, there must be some edge case. So I created a simple fuzz tester:
test "sort fuzz testing" {
    var rng = std.rand.Rand.init(0x12345678);
    const test_case_count = 10;
    var i: usize = 0;
    while (i < test_case_count) : (i += 1) {
        fuzzTest(&rng);
    }
}
var fixed_buffer_mem: [100 * 1024]u8 = undefined;
fn fuzzTest(rng: &std.rand.Rand) {
    const array_size = rng.range(usize, 0, 1000);
    var fixed_allocator = mem.FixedBufferAllocator.init(fixed_buffer_mem[0..]);
    var array = %%fixed_allocator.allocator.alloc(IdAndValue, array_size);
    // populate with random data
    for (array) |*item, index| {
        item.id = index;
        item.value = rng.range(i32, 0, 100);
    }
    sort(IdAndValue, array, cmpByValue);
    var index: usize = 1;
    while (index < array.len) : (index += 1) {
        if (array[index].value == array[index - 1].value) {
            assert(array[index].id > array[index - 1].id);
        } else {
            assert(array[index].value > array[index - 1].value);
        }
    }
}
This test passed as well. And so I think this problem is solved.
There is now an @export builtin function which can be used in a comptime block to conditionally export a function:
const builtin = @import("builtin");
comptime {
    const strong_linkage = builtin.GlobalLinkage.Strong;
    if (builtin.link_libc) {
        @export("main", main, strong_linkage);
    } else if (builtin.os == builtin.Os.windows) {
        @export("WinMainCRTStartup", WinMainCRTStartup, strong_linkage);
    } else {
        @export("_start", _start, strong_linkage);
    }
}
It can also be used to create aliases:
const builtin = @import("builtin");
const is_test = builtin.is_test;
comptime {
    const linkage = if (is_test) builtin.GlobalLinkage.Internal else builtin.GlobalLinkage.Weak;
    const strong_linkage = if (is_test) builtin.GlobalLinkage.Internal else builtin.GlobalLinkage.Strong;
    @export("__letf2", @import("comparetf2.zig").__letf2, linkage);
    @export("__getf2", @import("comparetf2.zig").__getf2, linkage);
    if (!is_test) {
        // only create these aliases when not testing
        @export("__cmptf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__eqtf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__lttf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__netf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__gttf2", @import("comparetf2.zig").__getf2, linkage);
    }
}
Previous export syntax is still allowed. See #462 and #420.
We used to have labels and goto like this:
export fn entry() {
    label:
    goto label;
}
Now this does not work, because goto is gone.
test.zig:2:10: error: expected token ';', found ':'
    label:
         ^
There are a few reasons to use goto, but all of the use cases are better served with other zig control flow features:
For resource cleanup, use defer and %defer instead.
export fn entry() {
    start_over:
    while (some_condition) {
        // do something...
        goto start_over;
    }
}
Instead, use a loop!
export fn entry() {
    outer: while (true) {
        while (some_condition) {
            // do something...
            continue :outer;
        }
        break;
    }
}
pub fn findSection(elf: &Elf, name: []const u8) -> %?&SectionHeader {
    var file_stream = io.FileInStream.init(elf.in_file);
    const in = &file_stream.stream;
    section_loop: for (elf.section_headers) |*elf_section| {
        if (elf_section.sh_type == SHT_NULL) continue;
        const name_offset = elf.string_section.offset + elf_section.name;
        %return elf.in_file.seekTo(name_offset);
        for (name) |expected_c| {
            const target_c = %return in.readByte();
            if (target_c == 0 or expected_c != target_c) goto next_section;
        }
        {
            const null_byte = %return in.readByte();
            if (null_byte == 0) return elf_section;
        }
        next_section:
    }
    return null;
}
Looks like the use case is continuing an outer loop from inside an inner one:
pub fn findSection(elf: &Elf, name: []const u8) -> %?&SectionHeader {
    var file_stream = io.FileInStream.init(elf.in_file);
    const in = &file_stream.stream;
    section_loop: for (elf.section_headers) |*elf_section| {
        if (elf_section.sh_type == SHT_NULL) continue;
        const name_offset = elf.string_section.offset + elf_section.name;
        %return elf.in_file.seekTo(name_offset);
        for (name) |expected_c| {
            const target_c = %return in.readByte();
            if (target_c == 0 or expected_c != target_c) continue :section_loop;
        }
        {
            const null_byte = %return in.readByte();
            if (null_byte == 0) return elf_section;
        }
    }
    return null;
}
You can also break out of arbitrary blocks:
export fn entry() {
    outer: {
        while (some_condition) {
            // do something...
            break :outer;
        }
    }
}
This can be used to return a value from a block in the same way you can return a value from a function:
export fn entry() {
    const value = init: {
        for (slice) |item| {
            if (item > 100)
                break :init item;
        }
        break :init 0;
    };
}
Omitting a semicolon no longer causes the value to be returned by the block.
Instead you must use explicit block labels to return a value from a block.
I'm considering a keyword such as result which defaults to the current block.
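As a small sketch of what this change means in practice: a block that used to yield its last expression by leaving off the semicolon now needs a label and an explicit break. The names compute and blk are made up for illustration:

```zig
const assert = @import("std").debug.assert;

// Hypothetical helper so the block has something to do.
fn compute() -> i32 {
    return 41;
}

test "labeled block yields a value" {
    // Old style, no longer accepted:
    //     const x = { const t = compute(); t + 1 };
    // New style: label the block and break out of it with the value.
    const x = blk: {
        const t = compute();
        break :blk t + 1;
    };
    assert(x == 42);
}
```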
Removal of goto caused a regression in C-to-Zig translation: switch statements can no longer be translated. However, this code will be resurrected soon using labeled loops and labeled break instead of goto.
Before:
while (cond) {
    if (false) { }
    break;
}
Pretty crazy right? Something as simple as this would crash the compiler.
Now:
This improvement deletes a lot of messy code:
5 files changed, 288 insertions(+), 1243 deletions(-)
And it also fixes comptime branches not being respected sometimes:
export fn entry() {
    while (false) {
        @compileError("bad");
    }
}
Before, this would cause a compile error. Now the while loop respects that its condition is implicitly compile-time known.
See #667.
&const a.b, the const (and/or volatile) qualifiers would be incorrectly dropped. See #655.
std.os.path.resolve when the drive is missing.
Changed builtin.is_big_endian to builtin.endian. This is in preparation for having endianness be a pointer property, which is related to packed structs. See #307.
--release-safe and --release-fast modes where the optimizer inlines everything into _start and clobbers the command line argument data. If we were able to verify that the user's code never reads command line args, we could leave off this "no inline" attribute. This might call for a patch to LLVM. It seems like inlining into a naked function should correctly bump the stack pointer.
Added i29 and u29 primitive types. u29 is the type of alignment, so it makes sense for it to be a primitive. Probably in the future we'll make any i or u followed by digits into a primitive.
DW_AT_ranges, so sometimes when you would see "???" you now get a useful stack trace instead.
Added std.sort.min and std.sort.max functions.
std.fmt.bufPrint returns a possible error.BufferTooSmall instead of asserting that the buffer is large enough.
std.math.
zig build now has a --search-prefix option. Any number of search prefixes can be specified.
{.x} where x is the number of decimals. (See #668)
Special thanks to those who donate monthly: