<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
   <channel>
      <title>Andrew Kelley</title>
      <link>https://andrewkelley.me/</link>
      <description>My personal website - thoughts, project demos, research.</description>
      <language>en-us</language>
      <lastBuildDate>Thu, 30 May 2024 03:27:38 GMT</lastBuildDate>
      <docs>https://www.rssboard.org/rss-specification</docs>
      <webMaster>andrew@ziglang.org (Andrew Kelley)</webMaster>
      <atom:link href="https://andrewkelley.me/rss.xml" rel="self" type="application/rss+xml" />

      <image>
        <url>https://andrewkelley.me/img/profile-flowers.jpg</url>
        <title>Andrew Kelley</title>
        <link>https://andrewkelley.me/</link>
      </image>

      <item>
         <title>Zig's New CLI Progress Bar Explained</title>
         <pubDate>Thu, 30 May 2024 03:27:38 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-new-cli-progress-bar-explained.html</link>
         <guid>https://andrewkelley.me/post/zig-new-cli-progress-bar-explained.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(test|fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|continue|asm|defer|if|else|switch|try|catch|while|for|null|undefined|true|false|comptime|setCold|ptrToInt|returnAddress)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong|noreturn)\b/,
  });
</script>
<h1>Zig's New CLI Progress Bar Explained</h1>

<p>
Sometimes, programming projects are too easy and boring. Sometimes, they're too
hard, never ending or producing subpar results.
</p><p>
This past week I had the pleasure of completing a project that felt like
maximum difficulty - only possible because I am at the top of my game, using a
programming language designed for making perfect software. This problem threw
everything it had at me, but I rose to the challenge and emerged victorious.
</p><p>
What a rush.
</p><p>
In this blog post I'll dig into the technical implementation as well as provide the
<a href="#zig-progress-protocol-spec">Zig Progress Protocol Specification</a>.
</p>

<h2 id="demo">Demo</h2>

<p>
Before we take a deep dive, let's look at the final results by building my <a
href="https://codeberg.org/andrewrk/player">music player side project</a> while recording with Asciinema:
</p><p>
<a href="https://asciinema.org/a/MfJdqRHlMaHeNY8KJSnHBUP2I">Old</a> vs
<a href="https://asciinema.org/a/661404">New</a>
</p><p>
The usage code looks basically like this:
</p>
<pre><code class="language-zig">const parent_progress_node = std.Progress.start(.{});

// ...

const progress_node = parent_progress_node.start("sub-task name", 10);
defer progress_node.end();

for (0..10) |_| progress_node.completeOne();</code></pre>
<p>To include a child process's progress under a particular node, it's a single assignment
before calling <code>spawn</code>:</p>
<pre><code class="language-zig">child_process.progress_node = parent_progress_node;</code></pre>
<p>
For me the most exciting thing about this is its ability to visualize what the
Zig Build System is up to after you run <code>zig build</code>. Before this
feature was even merged into master branch, it led to discovery, diagnosis, and
<a href="https://github.com/ziglang/zig/commit/389181f6be8810b5cd432e236a962229257a5b59">resolution</a>
of a subtle bug that has hidden in Zig's standard library child process
spawning code for years. Not included in this blog post: a rant about how much
I hate the fork() API.
</p>

<h2 id="motivation">Motivation</h2>
<p>
The previous implementation was more conservative. It had the design limitation
that it could not assume ownership of the terminal. This meant that it had to
assume another process or thread could print to stderr at any time, and it was
not allowed to register a <code>SIGWINCH</code> signal handler to learn about when the
terminal size changed. It also did not rely on knowledge of the terminal size,
or spawn any threads.
</p><p>
This new implementation represents a more modern philosophy: when a CLI
application is spawned with stderr being a terminal, then that application owns
the terminal and owes its users the best possible user experience, taking advantage of
all the terminal features available, working around their limitations. I'm
excited for projects such as <a
href="https://mitchellh.com/ghostty">Ghostty</a> which are expanding
the user interface capabilities of terminals, and lifting those limitations.
</p><p>
With this new set of constraints in mind, it becomes possible to design a much
more useful progress bar. We gain a new requirement: since only one process
owns the terminal, child processes must therefore report their progress
semantically so it can be aggregated and displayed by the terminal owner.
</p>

<h2 id="implementation">Implementation</h2>
<p>
The whole system is designed around the public API, which must be thread-safe, lock-free, and avoid
contention as much as possible in order to prevent the progress system itself from harming performance,
particularly in a multi-threaded environment.
</p><p>
The key insight I had here is that, since the end result must be displayed on a
terminal screen, there is a reasonably small upper bound on how much memory is
required, beyond which point the extra memory couldn't be utilized because it
wouldn't fit on the terminal screen anyway.
</p><p>
By statically pre-allocating 200 nodes, the API is made infallible and non-heap-allocating. Furthermore,
it makes it possible to implement a thread-safe, lock-free node allocator based on a free list.
</p><p>
The shared data is split into several arrays:
</p><p>
<pre><code class="language-zig">node_parents: [200]Node.Parent,
node_storage: [200]Node.Storage,
node_freelist: [200]Node.OptionalIndex,</code></pre>
<p>
The freelist ensures that two racing <code>Node.start()</code> calls obtain
different indexes to operate on, and gives <code>Node.end()</code> calls a way
to return nodes to the system for reuse. Meanwhile, the parents array
indicates which nodes are actually allocated, and which parent node each
should be attached to. Finally, the remaining storage contains the number of
completed items, estimated total items, and task name for each node.
</p><p>
Each <code>Node.Parent</code> is 1 byte exactly. It has 2 special values, which can be
represented with type safety in Zig:
</p>
<pre><code class="language-zig">const Parent = enum(u8) {
    /// Unallocated storage.
    unused = std.math.maxInt(u8) - 1,
    /// Indicates root node.
    none = std.math.maxInt(u8),
    /// Index into `node_storage`.
    _,

    fn unwrap(i: @This()) ?Index {
        return switch (i) {
            .unused, .none =&gt; return null,
            else =&gt; @enumFromInt(@intFromEnum(i)),
        };
    }
};</code></pre>
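<p>The free-list pop and push backing <code>Node.start()</code> and <code>Node.end()</code> can be sketched with C11 atomics. This is an illustration of the general technique with invented names, not Zig's actual implementation, which must also contend with the ABA problem and with saturating gracefully when the pool runs out:</p>

```c
#include <stdatomic.h>
#include <stdint.h>

#define NODE_COUNT 200
#define FREELIST_NONE 0xff

/* next_free[i] holds the index of the next free node after i. */
static uint8_t next_free[NODE_COUNT];
static _Atomic uint8_t free_head = FREELIST_NONE;

/* Return node i to the pool (what Node.end() does). */
static void freelist_push(uint8_t i) {
    uint8_t head = atomic_load(&free_head);
    do {
        next_free[i] = head;
    } while (!atomic_compare_exchange_weak(&free_head, &head, i));
}

/* Claim a free node index, or FREELIST_NONE when the pool is empty
 * (what Node.start() does before filling in the node's storage). */
static uint8_t freelist_pop(void) {
    uint8_t head = atomic_load(&free_head);
    while (head != FREELIST_NONE &&
           !atomic_compare_exchange_weak(&free_head, &head, next_free[head]))
    {}
    return head;
}
```

<p>Because indexes rather than pointers flow through the list, the whole pool lives in static memory and the public API never allocates or fails.</p>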
<p>
The data mentioned above is mutated in the implementation of the thread-safe public API.
</p><p>
Meanwhile, the update thread is running on a timer. After an initial delay, it
wakes up at regular intervals to either draw progress to the terminal, or send
progress information to another process through a pipe.
</p><p>
In either case, when the update thread wakes up, the first thing it does is "serialize" the
shared data into a separate preallocated location. After carefully copying the
shared data using atomic primitives, the copied, serialized data can then be
operated on in a single-threaded manner, since it is not shared with any other
threads.
</p><p>
The update thread performs this copy by iterating over the full 200 shared
parents array, atomically loading each value and checking if it is not the
special "unused" value (0xfe). In most programming projects, a linear scan to
find allocated objects is undesirable, however in this case 200 bytes for node
parent indexes is practically free to iterate over since it's 4 cache lines
total, and the only contention comes from actual updates that must be observed.
This full scan in the update thread buys us a cheap, lock-free implementation
for <code>Node.start()</code> and <code>Node.end()</code> via popping and
pushing the freelist, respectively.
</p><p>
Next, IPC nodes are expanded. More on this point later.
</p><p>
Once the serialization process is complete, we are left with a subset of the shared data from
earlier, this time with no gaps - only used nodes are present here:
</p><p>
<pre><code class="language-zig">serialized_parents: [200]Node.Parent,
serialized_storage: [200]Node.Storage,
serialized_len: usize,</code></pre>
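<p>As a sketch of that copy (hypothetical C with assumed field names, not the actual Zig code): atomically load each parent byte, skip unused slots, pack live nodes into a gap-free prefix, then remap parent edges to their new positions. Only a <code>completed</code> counter is shown; the real storage also carries the estimated total and the task name:</p>

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NODE_COUNT 200
enum { PARENT_UNUSED = 0xfe, PARENT_NONE = 0xff };

static _Atomic uint8_t node_parents[NODE_COUNT];
static _Atomic uint32_t node_completed[NODE_COUNT];

static uint8_t serialized_parents[NODE_COUNT];
static uint32_t serialized_completed[NODE_COUNT];

/* Copy the shared data into the serialized arrays; returns node count. */
size_t serialize(void) {
    uint8_t remap[NODE_COUNT]; /* old index -> serialized index */
    size_t len = 0;
    /* Pass 1: linear scan, packing allocated nodes into a gap-free prefix. */
    for (size_t i = 0; i < NODE_COUNT; i++) {
        uint8_t parent =
            atomic_load_explicit(&node_parents[i], memory_order_acquire);
        if (parent == PARENT_UNUSED) continue; /* 0xfe: slot not allocated */
        remap[i] = (uint8_t)len;
        serialized_parents[len] = parent; /* still an old index for now */
        serialized_completed[len] =
            atomic_load_explicit(&node_completed[i], memory_order_relaxed);
        len++;
    }
    /* Pass 2: rewrite parent edges to point at serialized positions. */
    for (size_t i = 0; i < len; i++)
        if (serialized_parents[i] != PARENT_NONE)
            serialized_parents[i] = remap[serialized_parents[i]];
    return len;
}
```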
<p>
At this point the behavior diverges depending on whether the current process owns the terminal, or
must send progress updates via a pipe. A process that fits neither of these categories would not
have spawned an update thread to begin with. Any parent process that wants to
track progress from children creates a pipe with <code>O_NONBLOCK</code> enabled, passing
it to the child as if it were a fourth I/O stream after stdin, stdout, and
stderr. To indicate to the child process that the file descriptor is in fact a
progress pipe, the parent sets the <code>ZIG_PROGRESS</code> environment variable. For example, in
the Zig standard library implementation, this ends up being <code>ZIG_PROGRESS=3</code>.
</p><p>
A process that is given a progress pipe sends the serialized data over the
pipe, while a process that owns the terminal draws it directly.
</p>
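<p>Here is a rough Linux-flavored sketch of the parent-side setup described above (a hypothetical helper with invented names - the real logic lives in Zig's child process spawning code, which also arranges for the descriptor to land just after stderr, hence <code>ZIG_PROGRESS=3</code>):</p>

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Create the progress pipe and set ZIG_PROGRESS for a to-be-spawned child.
 * The parent keeps fds[0] for reading; the child inherits fds[1] and learns
 * its number from the environment variable. Returns 0 on success. */
int setup_progress_pipe(int fds[2]) {
    if (pipe2(fds, O_NONBLOCK) != 0) return -1;
    char buf[16];
    snprintf(buf, sizeof buf, "%d", fds[1]); /* e.g. "3" */
    return setenv("ZIG_PROGRESS", buf, 1);
}
```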
<p>For example, this source code:</p>
<pre><code class="language-zig">const std = @import("std");

pub fn main() !void {
    const root_node = std.Progress.start(.{
        .root_name = "preparing assets",
    });
    defer root_node.end();

    const sub_node = root_node.start("reticulating splines", 100);
    defer sub_node.end();

    for (0..50) |_| sub_node.completeOne();
    std.time.sleep(1000 * std.time.ns_per_ms);
    for (0..50) |_| sub_node.completeOne();
}</code></pre>
<p>At the sleep() call, this program would display the following on the terminal:</p>
<pre>preparing assets
└─ [50/100] reticulating splines</pre>
<p>But if it were a child process, it would send this message over the pipe instead:</p>
<pre>0000  02 00 00 00 00 00 00 00  00 70 72 65 70 61 72 69  .........prepari
0010  6E 67 20 61 73 73 65 74  73 00 00 00 00 00 00 00  ng assets.......
0020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0030  00 32 00 00 00 64 00 00  00 72 65 74 69 63 75 6C  .2...d...reticul
0040  61 74 69 6E 67 20 73 70  6C 69 6E 65 73 00 00 00  ating splines...
0050  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0060  00 FF 00                                          ...</pre>

<h3 id="drawing-to-the-terminal">Drawing to the Terminal</h3>
<p>Drawing to the terminal begins with the serialized data copy. This data contains only edges
pointing to parents, and therefore cannot be used to walk the tree starting from the root. The first
thing done here is compute sibling and children edges into preallocated buffers so that we can
then walk the tree top-down.</p>

<pre><code class="language-zig">child: [200]Node.OptionalIndex,
sibling: [200]Node.OptionalIndex,</code></pre>
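<p>A sketch of that inversion (hypothetical C; names assumed): one pass over the serialized parent edges builds first-child and next-sibling links, prepending each node to its parent's child list. Note that prepending reverses sibling order relative to allocation order; a real renderer may compensate to keep display order stable:</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EDGE_NONE 0xff

/* Invert parent edges into first-child / next-sibling edges so the
 * tree can be walked top-down when drawing. */
void compute_edges(const uint8_t *parents, size_t len,
                   uint8_t *child, uint8_t *sibling) {
    memset(child, EDGE_NONE, len);
    memset(sibling, EDGE_NONE, len);
    for (size_t i = 0; i < len; i++) {
        uint8_t p = parents[i];
        if (p == EDGE_NONE) continue; /* root node has no parent */
        /* Prepend i to p's child list. */
        sibling[i] = child[p];
        child[p] = (uint8_t)i;
    }
}
```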

<p>Next, the draw buffer is computed, but not written to the terminal yet. It looks like this:</p>
<ol>
  <li><a href="https://gist.github.com/christianparpart/d8a62cc1ab659194337d73e399004036">Start sync sequence</a>. This prevents rapid blinking on high-framerate terminals.</li>
  <li>Clear the previous update by moving the cursor up once per newline
  output last time, followed by a clear-to-end-of-screen escape sequence.</li>
  <li>Recursively walk the tree of nodes, which is now possible since we
  have computed children and sibling edges, outputting tree-drawing sequences
  and node names, and counting newlines.</li>
  <li>End sync sequence.</li>
</ol>

<p>This draw buffer is only computed - it is not sent to the write() syscall yet. At this point,
we try to obtain the stderr lock. If we get it, then we write the buffer to the terminal. Otherwise,
this update is dropped.</p>

<p>If any attempt to write to the terminal fails, the update thread exits so
that no further attempt is made.</p>

<h3 id="ipc">Inter-Process Communication</h3>
<p>Earlier, I said "IPC nodes are expanded" without further explanation. Let's dig into that a little bit.</p>

<p>As a final step in the serialization process, the update thread iterates the data, looking for
special nodes. Special nodes store the progress pipe file descriptor of a child process rather than
<code>completed_items</code> and <code>estimated_total_items</code>. The update thread reads progress
data from this pipe, and then grafts the child's sub-tree onto the parent's
tree, plugging the root node of the child into the special node of the parent.
The main storage data can be directly memcpy'd, but the parents array must be
relocated based on the offset within the serialized data arrays.
</p><p>
In case of a big-endian system, the 2 integers within each node storage must be
byte-swapped. Not relying on host endianness means that edge cases continue to
work, such as the parent process running on an x86_64 host, with a child
process running a mips program in QEMU user mode. This is a real use case when
testing the Zig compiler, for example.
</p><p>
The parent process ignores all but the last message from the pipe fd, and
references a copy of the data from the last update in case there is no message
in the pipe for a particular update.
</p>
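<p>A hedged C sketch of that drain-and-keep-latest behavior (hypothetical helper; the message framing follows the protocol specification later in this post, where each message self-describes its length):</p>

```c
#include <fcntl.h>   /* callers create the pipe with O_NONBLOCK */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Largest possible message: 1 length byte + 253 * (48 storage + 1 parent). */
#define MAX_MSG (1 + 253 * 49)

/* Drain the non-blocking progress pipe, keeping only the newest complete
 * message. Returns its size, or 0 if no complete message arrived this tick
 * (in which case the caller falls back to its copy of the previous update). */
size_t read_latest(int fd, uint8_t msg[MAX_MSG]) {
    static uint8_t buf[4 * MAX_MSG];
    size_t have = 0;
    for (;;) {
        ssize_t n = read(fd, buf + have, sizeof(buf) - have);
        if (n <= 0) break; /* EAGAIN: the pipe is empty for now */
        have += (size_t)n;
    }
    /* Messages are self-delimiting; walk them, remembering the last one. */
    size_t pos = 0, kept = 0;
    while (pos < have) {
        size_t sz = 1 + (size_t)buf[pos] * 49;
        if (pos + sz > have) break; /* trailing partial message */
        memcpy(msg, buf + pos, sz);
        kept = sz;
        pos += sz;
    }
    return kept;
}
```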
<h2 id="performance">Performance</h2>
<p>
Here are some performance data points I took while working on this.
</p><p>
Building the Zig compiler as a sub-process. This measures the cost of the new
API implementations, primarily <code>Node.start</code>, <code>Node.end</code>,
and the disappearance of <code>Node.activate</code>. In this case, standard
error is not a terminal, and thus an update thread is never spawned:
</p>
<pre>Benchmark 1 (3 runs): old zig build-exe self-hosted compiler
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          51.8s  ±  138ms    51.6s  … 51.9s           0 ( 0%)        0%
  peak_rss           4.57GB ±  286KB    4.57GB … 4.57GB          0 ( 0%)        0%
  cpu_cycles          273G  ±  360M      273G  …  274G           0 ( 0%)        0%
  instructions        487G  ±  105M      487G  …  487G           0 ( 0%)        0%
  cache_references   19.0G  ± 31.8M     19.0G  … 19.1G           0 ( 0%)        0%
  cache_misses       3.87G  ± 12.0M     3.86G  … 3.88G           0 ( 0%)        0%
  branch_misses      2.22G  ± 3.44M     2.21G  … 2.22G           0 ( 0%)        0%
Benchmark 2 (3 runs): new zig build-exe self-hosted compiler
  measurement          mean ± σ            min … max           outliers         delta
  wall_time          51.5s  ±  115ms    51.4s  … 51.7s           0 ( 0%)          -  0.4% ±  0.6%
  peak_rss           4.58GB ±  190KB    4.58GB … 4.58GB          0 ( 0%)          +  0.1% ±  0.0%
  cpu_cycles          272G  ±  494M      272G  …  273G           0 ( 0%)          -  0.4% ±  0.4%
  instructions        487G  ± 63.5M      487G  …  487G           0 ( 0%)          -  0.1% ±  0.0%
  cache_references   19.1G  ± 16.9M     19.1G  … 19.1G           0 ( 0%)          +  0.3% ±  0.3%
  cache_misses       3.86G  ± 17.1M     3.84G  … 3.88G           0 ( 0%)          -  0.2% ±  0.9%
  branch_misses      2.23G  ± 5.82M     2.22G  … 2.23G           0 ( 0%)          +  0.4% ±  0.5%</pre>

<p>
Building the Zig compiler, with <code>time zig build-exe -fno-emit-bin
...</code> so that the progress is updating the terminal on a regular interval:
</p>
<ul>
<li>Old:<ul>
  <li>4.115s</li>
  <li>4.216s</li>
  <li>4.221s</li>
  <li>4.227s</li>
  <li>4.234s</li></ul></li>
<li>New (1% slower):<ul>
  <li>4.231s</li>
  <li>4.240s</li>
  <li>4.271s</li>
  <li>4.339s</li>
  <li>4.340s</li></ul></li>
</ul>
<p>
Building my music player application with <code>zig build</code>, with the
project-local cache cleared. This displays a lot of progress information to the
terminal. This data point accounts for many subprocesses sending progress
information over a pipe to the parent process for aggregation:
</p>
<ul>
<li>Old<ul>
  <li>65.74s</li>
  <li>66.39s</li>
  <li>71.09s</li></ul></li>
<li>New (1% faster)<ul>
  <li>65.51s</li>
  <li>65.88s</li>
  <li>66.09s</li></ul></li>
</ul>
<p>
Conclusion? It appears that I succeeded in making this progress-reporting
system have no significant effect on the performance of the software that uses
it.
</p>
<h2 id="zig-progress-protocol-spec">The Zig Progress Protocol Specification</h2>
<p>Any programming language can join the fun! Here is a specification so that any software project
can participate in a standard way of sharing progress information between parent and child processes.
</p><p>
When the <code>ZIG_PROGRESS=X</code> environment variable is present, where <code>X</code> is an
unsigned decimal integer in range 0...65535, a process-wide progress reporting channel is
available.
</p><p>
The integer is a writable file descriptor opened in non-blocking mode.
Subsequent messages to the stream supersede previous ones.
</p><p>
The stream supports exactly one kind of message that looks like this:
</p>
<ol>
<li><code>len: u8</code> - number of nodes, limited to 253 max, reserving 0xfe and 0xff for
   special meaning.</li>
<li>48 bytes for every <code>len</code>:<ol>
   <li><code>completed: u32le</code> - how many items already done</li>
   <li><code>estimated_total: u32le</code> - guessed number of items to be completed, or 0 (unknown)</li>
   <li><code>name: [40]u8</code> - task description; remaining bytes zeroed out</li>
   </ol></li>
<li>1 byte for every <code>len</code>:<ol>
   <li><code>parent: u8</code> - creates an edge to a parent node in the tree, or 0xff (none)</li>
   </ol></li>
</ol>
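<p>To make the layout concrete, here is a C encoder for one such message (hypothetical helper names; the specification above is authoritative). It emits the integers byte by byte so the output is little-endian regardless of host endianness:</p>

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t completed;       /* items already done */
    uint32_t estimated_total; /* 0 means unknown */
    char name[40];            /* zero-padded task description */
    uint8_t parent;           /* index into this message, or 0xff = none */
} ProgressNode;

/* Serialize one message; returns bytes written: 1 + len*48 + len. */
size_t encode_progress(const ProgressNode *nodes, uint8_t len, uint8_t *out) {
    size_t n = 0;
    out[n++] = len; /* limited to 253; 0xfe and 0xff are reserved */
    for (uint8_t i = 0; i < len; i++) {
        uint32_t c = nodes[i].completed, t = nodes[i].estimated_total;
        for (int b = 0; b < 4; b++) out[n++] = (uint8_t)(c >> (8 * b)); /* u32le */
        for (int b = 0; b < 4; b++) out[n++] = (uint8_t)(t >> (8 * b)); /* u32le */
        memcpy(&out[n], nodes[i].name, sizeof nodes[i].name);
        n += sizeof nodes[i].name;
    }
    for (uint8_t i = 0; i < len; i++) out[n++] = nodes[i].parent;
    return n;
}
```

<p>Encoding the two-node example from earlier in this post produces the same 99-byte message shown in the hex dump.</p>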
<p>
Future versions of this protocol, if necessary, will use different environment
variable names.
</p>

<h2 id="c">Bonus: Using Zig Standard Library in C Code</h2>
<p>Much of the Zig standard library is available to C programs with minimal integration pain. Here's
a full, working example:</p>

<h3 id="example.c">example.c</h3>

<pre><code class="language-c">#include "zp.h"
#include &lt;string.h&gt;
#include &lt;unistd.h&gt;

int main(int argc, char **argv) {
    zp_node root_node = zp_init();

    const char *task_name = "making orange juice";
    zp_node sub_node = zp_start(root_node, task_name, strlen(task_name), 5);
    for (int i = 0; i &lt; 5; i += 1) {
        zp_complete_one(sub_node);
        sleep(1);
    }
    zp_end(sub_node);
    zp_end(root_node);
}</code></pre>

<h3 id="zp.h">zp.h</h3>

<pre><code class="language-c">#include &lt;stdint.h&gt;
#include &lt;stddef.h&gt;

typedef uint8_t zp_node;

zp_node zp_init(void);
zp_node zp_start(zp_node parent, const char *name_ptr, size_t name_len, size_t estimated_total);
zp_node zp_end(zp_node node);
zp_node zp_complete_one(zp_node node);</code></pre>

<h3 id="zp.zig">zp.zig</h3>

<pre><code class="language-zig">const std = @import("std");

export fn zp_init() std.Progress.Node.OptionalIndex {
    return std.Progress.start(.{}).index;
}

export fn zp_start(
    parent: std.Progress.Node.OptionalIndex,
    name_ptr: [*]const u8,
    name_len: usize,
    estimated_total_items: usize,
) std.Progress.Node.OptionalIndex {
    const node: std.Progress.Node = .{ .index = parent };
    return node.start(name_ptr[0..name_len], estimated_total_items).index;
}

export fn zp_end(node_index: std.Progress.Node.OptionalIndex) void {
    const node: std.Progress.Node = .{ .index = node_index };
    node.end();
}

export fn zp_complete_one(node_index: std.Progress.Node.OptionalIndex) void {
    const node: std.Progress.Node = .{ .index = node_index };
    node.completeOne();
}

pub const _start = void;</code></pre>

<p>Compile with <code>zig cc -o example example.c zp.zig</code></p>

<p>
<a href="https://asciinema.org/a/dJ8iLKpje6fsJygM8r1qw5e6M">Asciinema Demo</a>
</p>

<p>This same executable will also work correctly as a child process, reporting
progress over the <code>ZIG_PROGRESS</code> pipe if provided!</p>

<h2 id="follow-up-work">Follow-Up Work</h2>
<p>Big thanks to <a href="https://www.ryanliptak.com/">Ryan Liptak</a> for helping me with the
Windows console printing code, and <a href="https://github.com/jacobly0/">Jacob Young</a> for
<a href="https://github.com/ziglang/zig/pull/20114">working on the Windows IPC logic</a>, which isn't done yet.</p>
]]></description>
      </item>
      <item>
         <title>Redis Renamed to Redict</title>
         <pubDate>Fri, 22 Mar 2024 20:32:03 GMT</pubDate>

         <link>https://andrewkelley.me/post/redis-renamed-to-redict.html</link>
         <guid>https://andrewkelley.me/post/redis-renamed-to-redict.html</guid>
         <description><![CDATA[<h1>Redis Renamed to Redict</h1>
<p><a href="https://redict.io/">Redict</a> was originally
created by Salvatore Sanfilippo under the name "Redis".
Around 2018 he started losing interest in the project to pursue a science fiction career and gave
stewardship of the project to <a
href="https://en.wikipedia.org/wiki/Redis_(company)">Redis Labs</a>.</p>
<p>
I think that was an unfortunate move because their goal is mainly to extract profit from
the software project rather than to uphold the ideals of Free and Open Source Software.
On March 20, 2024, they
<a href="https://github.com/redis/redis/pull/13157">changed the license to be proprietary</a>,
a widely unpopular action. You know someone is up to no good when they write
"Live long and prosper 🖖" directly above a meme of Darth Vader.
</p>
<img class="light" alt="screenshot of the PR merging with many downvotes and darth vader" src="/img/redis-screenshot-light.png">
<img class="dark" alt="screenshot of the PR merging with many downvotes and darth vader" src="/img/redis-screenshot-dark.png">

<h2 id="license-problems">What are the actual problems with these licenses?</h2>
<p>In short, the licenses restrict what one can do with
the software so that Redis Labs alone is enriched, while asking for
volunteer labor, and having <a
href="https://github.com/redis/redis/pull/13157#issuecomment-2014737480">already
benefitted from volunteer labor</a>, that is and was <a
href="https://news.ycombinator.com/item?id=39775468">generally offered only
because it enriches <em>everybody</em></a>.</p>
<p><a href="https://ssplisbad.com/">SSPL is BAD</a></p>

<h2 id="legality">Are they allowed to do this?</h2>
<p>All the code before the license change is available under the previous license (BSD-3),
however it is perfectly legal to make further changes to the project under a different license.</p>
<p>This means that it is also legal to <em>fork</em> the project from before the license change,
and continue maintaining it without the proprietary license change. The only problem
is that the fork would miss out on all those juicy future
contributions from Redis Labs... wait a minute, isn't the project already <em>done</em>?
</p>

<h2 id="already-completed">Redict is a Finished Product</h2>
<p>Redict already works great. Lots of companies already use it in production and have
been doing so for many years.</p>
<p>In <a href="/post/why-we-cant-have-nice-software.html">Why We Can't Have Nice Software</a>,
I point out this pattern of needless software churn in the mindless quest for
profit. This is a perfect example occurring right now. Redict has already reached its
peak; it does not need any more serious software development to occur.
It does not need to <a href="https://redis.com/blog/the-future-of-redis/">pivot
to AI</a>. It can be maintained for decades to come with minimal effort. It can
continue to provide a high amount of value for a low amount of labor. That's
the entire point of software!</p>
<p><strong>Redict does not have any profit left to offer</strong>. It no longer
needs a fund-raising entity behind it. It just needs a good project steward.</p>

<h2 id="drew">Drew DeVault is a Good Steward</h2>
<p><a href="https://drewdevault.com/">Drew</a> is a controversial person, I
think for two reasons.</p>
<p>One is that he has a record of being rude to many people in the past -
including myself. However, in a <a href="https://lore.kernel.org/lkml/CA+55aFy+Hv9O5citAawS+mVZO+ywCKd9NQ2wxUmGsz9ZJzqgJQ@mail.gmail.com/">similar manner as Linus Torvalds</a>,
Drew has expressed what I can only interpret as sincere regret for such interactions, as
well as a pattern of improved behavior. I was poking through his blog to try to find
example posts of what I mean, and it's difficult to pick them out because he's such a
prolific writer, but perhaps <a href="https://drewdevault.com/2022/05/30/bleh.html">this one</a>
or maybe <a href="https://drewdevault.com/2023/05/01/2023-05-01-Burnout.html">this one</a>.
I'm a strong believer in applying
<a href="https://en.wikipedia.org/wiki/Tit_for_tat">the best game theory strategy</a>
to society: people should have consequences for the harm that they do, but then they should
get a chance to start cooperating again. I can certainly think of "cancelable" things
I have done in the past, that I am thankful are not public, and I cringe every
time I remember them.
</p>

<p>Secondly, and I think this is actually the more important point, Drew has been
an uncompromising advocate of Free and Open Source Software his entire life,
walking the walk more than anyone else I can think of. It's crystal clear that
this is the driving force of his core ideology that determines all of his decision
making. He doesn't budge on any of these principles and it creates conflicts
with people who are trying to exploit FOSS for their own gains. For example, when
you <a href="https://drewdevault.com/2023/07/04/Dont-sign-a-CLA-2.html">call out SourceGraph</a>
you basically piss off everyone who has SourceGraph stock. Do enough of these callouts,
and you've pissed off enough people that there's an entire meme subculture around hating you.</p>

<p>Meanwhile, Drew created and maintained <a
href="https://gitlab.freedesktop.org/wlroots/wlroots">wlroots</a> and <a
href="https://swaywm.org/">Sway</a>, successfully appointing a successor
maintainer to carry the torch, and runs <a href="https://sourcehut.org/">a
sustainable business</a> on top of
<a href="https://sr.ht/~sircmpwn/sourcehut/">Free and Open Source Software</a>.
SourceHut has a dependency on Redict, so it naturally follows that Drew wants
to keep his supply chain FOSS.</p>

<h2 id="redis-is-the-fork">Redis is the Fork</h2>
<p>
The only thing Redis has going for it, as a software project, is the brand name.
Salvatore is long gone. The active contributors who are working on it are, like I said,
pivoting to AI. Seriously, here's a quote from <a href="https://redis.com/blog/the-future-of-redis/">The Future of Redis</a>:
</p>
<blockquote>
<p>Making Redis the Go-To for Generative AI</p>
<p>we’re staying at the forefront of the GenAI wave</p>
</blockquote>
<p>
Meanwhile, Redict has an actual Free and Open Source Software movement behind
it, spearheaded by Drew DeVault, who has a track record of effective open
source project management.
</p>
<p>
In other words, Redict is the true spiritual successor to what was once Redis.
The title of this blog post is not spicy or edgy; it reflects reality.
</p>
]]></description>
      </item>
      <item>
         <title>Why We Can't Have Nice Software</title>
         <pubDate>Sun, 04 Feb 2024 23:11:21 GMT</pubDate>

         <link>https://andrewkelley.me/post/why-we-cant-have-nice-software.html</link>
         <guid>https://andrewkelley.me/post/why-we-cant-have-nice-software.html</guid>
         <description><![CDATA[<h1>Why We Can't Have Nice Software</h1>
<p>
The problem with software is that it's too powerful. It creates so much wealth
so fast that it's virtually impossible to not distribute it.
</p>
<p>
Think about it: sure, it takes a while to make useful software. But then you
make it, and then it's done. It keeps working with no maintenance whatsoever,
and just a <em>trickle</em> of electricity to run it.
</p>
<p>
Immediately, this poses a problem: how can a small number of people keep all
that wealth for themselves, and not let it escape in the dirty, dirty fingers
of the general populace?
</p>
<p>
This is a question that the music industry faced head-on, and they came up
with EULAs, enforced via the state's monopoly on violence, and DRM, a way for
software to act antagonistically against its own users. Software can do useful
things like encode media into bits, and then copy those bits. That's
dangerously useful, and it had to be stopped.
</p>

<h2>The True Cause of Bitrot</h2>
<p>
What about bitrot, you say? It takes ongoing maintenance to keep software working, doesn't it?
</p>
<p>
Let's think critically about bitrot for a moment because, as a reminder, bits
don't actually rot - that's kinda the point of bits. In the best case scenario,
bitrot happens due to progress - perhaps a dependency has made improvements but
requires breaking API compatibility, or better hardware comes out and the
software needs to be recompiled for that hardware. In this case, it's kind of a
happy outcome. Some labor is needed to enhance the software in response, but
then, once again, it's <em>done</em>; ripples disappearing from the surface of
a lake hours after a stone is thrown into it.
</p>
<p>
The darker side of bitrot is due to businesses trying to make <strong>more profit
than last year</strong>, and launching marketing initiatives. For example, Microsoft
shipped a Windows Update that puts advertisements into the start menu,
advertisements into the task bar, and changed the control panel's user
interface to unify it with their business incentives - namely a superficial
makeover to justify customers paying additional money for what is effectively
worse software - it has new bugs and is now riddled with advertisements. This
caused a bunch of churn in their own codebase, as well as other software trying
to use native user interfaces on Windows.
</p>
<p>
It's all so incredibly <em>wasteful</em>.
And that's the point, isn't it?
</p>
<p>
The programmers at Microsoft could have done less work, or worked on bug fixes
instead. The UI designers could have done less work, or tweaked their existing
design instead of making a new one. The managers could have done less work.
Customers could have paid little to no additional money for a Windows Update.
This all would have culminated in a more robust version of Windows that
customers preferred, instead of one that is effectively boycotted like Vista
and Windows 11.
</p>
<p>
It's actually a problem that software is too efficient and has this nasty tendency
of being <em>completed</em>.
Software offers us a glimpse into a post-scarcity society, but it is being
actively sabotaged by those who seek to turn a profit.
</p>

<h2>Platform Waste</h2>
<p>
Consumers love standards. Standards allow multiple parties, perhaps even competitors,
perhaps <em>especially</em> competitors, to have interchangeable components with each
other, which gives consumers options, and negotiation power.
</p>
<p>
For-profit companies hate standards. They would rather have their own special cable,
for example, that only works for their devices, and only they are allowed to manufacture
them. To be more specific, underdog companies like standards because it
lets them compete. The established players don't want to have to play fair.
</p>
<p>
You can see this playing out right now with the
<a href="https://www.cnn.com/2022/10/24/tech/eu-law-charging-standard/index.html">EU formally adopting a law requiring Apple to support USB-C chargers</a>. At the time of writing there is no such law in the United States, but it is <a href="https://techcrunch.com/2022/06/17/senators-call-for-us-to-adopt-common-charger">being discussed by politicians</a>.
</p>
<p>
Standards allow software to be more efficient. By sticking with a standard for a period of time
and then coordinating an upgrade to a newer one, software churn is minimized, resulting in
a fixed amount of software development labor needed.
</p>
<p>
On the other hand, without a standard, for-profit companies are incentivized to
fiddle with their product in a wasteful manner. For example, Apple has in the past made
insignificant changes to their charging cable, making it not compatible with
the one from the previous year. This resulted in more profit for Apple since
consumers found their existing cables useless and had to buy new ones. Ultimately
this resulted in more money being spent in the economy, increasing the country's GDP.
Economists rejoice; the Earth weeps.
</p>
<p>
Think about how many messaging apps have come and gone and how many programmer
hours have been wasted on them. We almost had XMPP be mainstream, but then
Google outgrew their "don't be evil" diaper and put on their "make profit" big
boy pants. If your goal is to turn a profit, it's obviously the correct choice
to invest into a platform that you own. So then we got a half-dozen buggy
messaging apps from Google that didn't even work with each other, let alone
Apple's platform or the other contemporary players.
</p>
<p>
The new hotness is Discord, which is already starting, predictably, 
<a href="https://www.eff.org/deeplinks/2023/04/platforms-decay-lets-put-users-first">to decay</a>. I can't believe a human sat down and wasted hours of their life coding "super reactions". It's
not something that really needed to happen.
</p>
<p>
Imagine if all these programmer hours spent on all these products actually centered around
a proper standard, which evolved along with consumers' needs rather than
these companies' ongoing need to fiddle with the knobs and sliders until
profit comes out. The thing is, if this actually happened, then
<em>what would these employees spend their time on?</em> At some point society
would be pretty much done implementing messaging software. Messaging app updates
would be rare, and bugs in messaging apps would be rare. We would reach peak messaging.
</p>

<h2>Peak Dishwasher</h2>
<p>Decades ago, we already did it. We reached peak dishwasher. Dishwashers
achieved perfection, and it was no longer possible to improve them. The mechanics
were optimal, the user interface was ideal, and consumers had no desire for any
changes.
</p>
<p>
One would, naively, think of this as an accomplishment. But how is a company supposed
to make <em>more profit than last year</em>? By any means necessary, of course.
</p>
<p>
They invented these dishwasher detergent pods that are actually a downgrade - slightly
more time consuming to use than powder, more expensive to manufacture and purchase,
worse for the environment, and most offensive of all -
<strong>actively sabotage the dishwasher's prewash feature</strong> making the product
actually function worse than before!
</p>
<p>
And from a business perspective, it is a critical success. They found a way to
make consumers spend more money on dishwashing. The line goes up, for one more
year. But it's not enough. It has to go up every year. What else can we do?
</p>
<p>
I found myself in a position where I needed to buy a new dishwasher last month, and,
already being aware of this problem, did my very best to buy one that worked well.
I picked one that had 5/5 stars on Consumer Reports. Unfortunately, the
dishwasher that I ended up with is my worst nightmare.
</p>
<p>
It takes 30 seconds to boot up, presumably because of the Bluetooth and WiFi driver
in it. Many of the configuration options are hidden behind a proprietary app.
The buttons are hidden and touch based instead of being visible and depressing
with natural tactile feedback. I still haven't done the chore of going into
my router and blocking it from accessing the Internet. I had to give it access in order to
use the app to find out why it was broken. Until I do that chore, there's a chance it
could auto update and have a firmware bug and stop working, or just waste my bandwidth.
Who knows what it's up to?
</p>
<p>
Meanwhile, I had to call a repair technician to fix the door latch already, as well
as the soap dispenser latch. Both things have since failed to work properly again and
I still need to do the chore of calling the company to get a repair done a third time.
</p>
<p>
Before we moved, we had an older dishwasher that worked perfectly. No Internet,
no Bluetooth, and the door latches worked flawlessly through thousands of runs.
</p>
<p>
The problem with the requirement for each year to be more profitable than the
last is that once you reach the peak, once it's not possible to actually
improve your product any more, <em>you still have to change something</em>.
Since you can't change it to make it better, you therefore will change it to
make it worse.
</p>

<h2>What Blockchains and LLMs Have in Common</h2>
<p>
Plenty of people roll their eyes at blockchain being the new buzzword, or about
how tech is overobsessed with AI (LLMs) right now. It's easy to chalk it up to
it being a silly, harmless fad perpetuated by uneducated or misguided people,
but in reality it's a lot more intentional than that.
</p>
<p>
Most tech workers work 40 hours per week at some company. That's a lot of
collective hours spent on something. What factors go into deciding, as a whole,
what that effort is spent on? Employees have some choices in the matter, but in
the end those choices are limited to job offers. Job offers are created by the
owners of companies who decide what they want to invest their money into.
</p>
<p>
In other words, venture capitalists decide what is the current hotness
precisely by directing large amounts of labor towards whatever they want.
Empirically, VCs are primarily motivated by seeking a return on
investment. The goal is to turn a big sum of money into an even bigger sum of
money. In theory, this is because with an even bigger sum of money, you can
then start to spend that money on directing an even larger amount of labor
towards whatever you want, bypassing democracy to influence the future of
humanity, but in practice, most VCs get fixated on that return on investment
until they die. 
</p>
<p>
If you were to criticize blockchain technology from a purely technical perspective,
you might point out the flaw that proof of work requires an exponentially increasing
amount of computational power, and thus electricity, in order to keep the blockchain
database alive over time. You might point out how inefficient of a database it is.
But you would be missing the point.
<strong>Blockchain technology excites investors precisely because of how wasteful it is</strong>.
Even if we had fusion (!!) it would eat up all that energy and more. It's difficult to express
the magnitude of how wasteful this is, and the fact that it's built into the system intentionally
is sinister.
</p>
<p>
Blockchain technology is a kind of software that doesn't get completed or
perfected. Rather it's the opposite; the longer it is in existence the more
work it creates for everyone to do. The waste is a feature; it's how the <a
href="https://www.youtube.com/watch?v=YQ_xWvX1n9g">line goes up</a>.
</p>

<p>Reminds me of this scene in The Fifth Element where Zorg says:</p>
<blockquote>
Life, which you so nobly serve, comes from destruction, disorder, and chaos.
Now take this empty glass. Here it is: peaceful, serene, <em>boring</em>. But if
it is destroyed... look at all these little things! So busy now. Notice how each one
of them is useful. What a lovely ballet ensues, so full of form and color. Now, think
about all those people that created them. Technicians, engineers, hundreds of people
who will be able to feed their children tonight so those children can grow up big and
strong and have little teeny children of their own and so on and so forth. Thus adding
to the great chain of life.
</blockquote>
<p>You can find this scene on YouTube but I won't link it for fear of accidentally
causing someone to view an advertisement.</p>
<p>
LLMs offer an even more ideal kind of software to the investor. First of all
they require an enormous amount of capital to train, and specialized hardware to run,
making them suitable to offer as a service, where the amount of profit can be made to go
up in a controlled manner. What a delicious idea.
</p>
<p>
More to my point, they offer a host of subjective, ill-defined tasks that are
immune to being completed. They've managed to take something well-defined,
well-scoped, and completable, and turn it into an untameable monster that will
be sure to offer software churn for decades to come.
</p>
<p>
Have a peek at this blog post that is going around lately:
<a href="https://austinhenley.com/blog/copilotpainpoints.html">The pain points of building a copilot</a>
</p>
<p>
These people are brimming with excitement about all the new problems that LLMs are bringing
to the table. Some choice quotes:
</p>
<blockquote>
Prompt engineering is time consuming and requires considerable trial and error...As one developer said, "it's more of an art than a science".
</blockquote>
<blockquote>
Testing is fundamental to software development but arduous when LLMs are involved. Every test is a flaky test. 
</blockquote>
<blockquote>
The field is moving fast, and it requires developers to "throw away everything that they've learned and rethink it."
</blockquote>
<blockquote>
Developers are having to learn and compare many new tools rather than focusing on the customer problem. They then have to glue these tools together.
</blockquote>
<blockquote>
It is still the wild, wild west ... It will be interesting to see how software engineering will evolve, either through new processes or tools, over the next several years.
</blockquote>

<p>
LLMs are a way to make software take orders of magnitude more computational
power, electricity, and human labor, while delivering a product whose extremely
volatile quality is impossible to assure. The work will never be completed; it
will only create the need for ever more labor.
</p>
<p>
For investors, all this churn is attractive. It's <em>disruptive</em>.
</p>
<p>
It's why we can't have nice software.
</p>

<h2>Conclusion</h2>
<p>Technology, and in particular, software, offers a glimpse of magic; a
perpetual motion machine; wealth created from nothing. It offers us a chance to
work together on something beautiful; to achieve perfection by ratcheting
improvements over time.</p>
<p>In the end, this opportunity is squandered in a doomed quest for endless growth.</p>
]]></description>
      </item>
      <item>
         <title>The Techno-Optimist Manifesto</title>
         <pubDate>Tue, 17 Oct 2023 20:14:51 GMT</pubDate>

         <link>https://andrewkelley.me/post/the-techno-optimist-manifesto.html</link>
         <guid>https://andrewkelley.me/post/the-techno-optimist-manifesto.html</guid>
         <description><![CDATA[<h1>The Techno-Optimist Manifesto</h1>
<p>
Technology is a tool that enhances the power of the wielder.
</p>
<p>
Our tools are getting more and more powerful. Some tools, such as nuclear fission, are
so powerful that they pose an existential threat to all of humanity, because
human society has a history of wielding tools to maintain power over each other at any cost.
</p>
<p>
The dark desire to subjugate one another is the ultimate enemy of innovation,
because the conquerors are content to have their needs met by human labor,
while the conquered are too busy laboring to become scientists.
</p>
<p>
Human societies are systems. Systems are themselves tools. Free Market
Capitalism is a system. It has tech specs just like any computer - features,
limitations, and tradeoffs. We will not settle for a deprecated system when
there is a new and improved one that can be created, discovered, and iterated
upon.
</p>
<p>
Venture capital is an outdated tool. It leads to monopolies, anti-competitive
practices, generational wealth, and environmental externalities.
Non-donation-based non-profit organizations are an underexplored alternative,
especially for tech companies where the number of users is large and size of
the organization is relatively small.
</p>
<p>
Religion is an outdated tool, used to control, weaken, and exploit the masses.
It causes societies to maintain unscientific practices, especially in healthcare and
justice, and leads to war.
</p>
<p>
War is a means to destroy the wealth created by technology.
</p>
<p>
War turns technology against mankind.
</p>
<p>
War is opportunity cost for every scientist and engineer who works on national defense.
</p>
<p>
War makes it dangerous to research advanced technology.
</p>
<p>
As techno-optimists, we believe that we <em>must</em>, <strong>and we
will</strong>, create more advanced human systems to prevent war forevermore.
Human society <em>will</em> mature and overcome its
barbaric history, learning to cooperate with each other so that we can create technological
works of global magnitude, such as orbital rings.
</p>
<p>
We cannot meaningfully explore space while borders exist on Earth.
</p>
<p>
Every dollar spent on the military is a blight on the country's honor. We
believe that military budget reductions are a cause for celebration because
they symbolize progress towards the space age. We believe that the atomic
bombings of Hiroshima and Nagasaki were the most horrifying, shameful thing
ever done by humanity, to itself. Importantly, <strong>we believe that it will
never, ever happen again.</strong>
</p>
<p>
We believe that there are irreversible destructive actions. While some amount of
species extinction is a natural process, the outlier extinction event currently happening
due to the emergence of homo sapiens is a tragedy. We believe that it is our duty to
not only improve technology to enhance our own lives, but also enhance technology
to preserve natural diversity. We believe that Free Market Capitalism is an outdated
technology that has failed to prevent irreversible destruction to our home planet. However,
we believe that human society will migrate to a better system without these flaws,
and will prevent the ultimate irreversible action of self-destruction.
</p>
<p>
We believe in the simple, physical fact that endless growth is unsustainable,
and we refuse to resolve this problem with war. Instead, we will improve
efficiency and work within the limits of the system, whatever they may be.
We will upgrade our economic system to one that is capable of supporting these
requirements.
</p>
<p>
Therefore, finally, we believe that it is justified to spend one's life or
career in pursuit of science, engineering, and technology, because we believe
in humanity's ability to mature and improve to the point where it can wield
ever more powerful tools responsibly.
</p>
]]></description>
      </item>
      <item>
         <title>So Long, Twitter and Reddit</title>
         <pubDate>Wed, 23 Aug 2023 22:05:07 GMT</pubDate>

         <link>https://andrewkelley.me/post/goodbye-twitter-reddit.html</link>
         <guid>https://andrewkelley.me/post/goodbye-twitter-reddit.html</guid>
         <description><![CDATA[<h1>So Long, Twitter and Reddit</h1>

<p>It's been over three years since my last blog post. I think that
the website code started to feel like it had bitrotted, and so making new blog
posts became onerous. But I've taken the time to redo this website
using <a href="https://ziglang.org/">Zig</a>, so it's quite easy to add blog
posts now, and plus I added
<a href="https://developer.mozilla.org/en-US/docs/Web/CSS/@media/prefers-color-scheme">dark mode</a>.</p>

<h2>Twitter</h2>

<p>Over the past few years I amassed a healthy number of followers on Twitter by posting
screenshots, demos, and small progress updates for the Zig programming language. It
felt good to gain an audience and have my perspective and efforts weighed more heavily in
the court of public opinion.</p>

<p>However, I always knew that it would need to end. It's not really possible to use Twitter
without at least a little bit of doomscrolling, and that activity took a toll on me emotionally.
I think that the core concept of a "tweet" is fundamentally unhealthy, because by design it
promotes angry and extreme content. Nuance and subtlety are impossible to distinguish from
dog whistling. Most users don't understand that by "dunking" on someone, they are actually
promoting the content. In summary, it gave me a darker feeling about the world in general,
and in exchange, did not offer much to enrich my life.</p>

<p>So, when the <a href="https://knowyourmeme.com/memes/enshittification">enshittification</a>
process kicked into high gear with the
<a href="https://en.wikipedia.org/wiki/Acquisition_of_Twitter_by_Elon_Musk">Acquisition of Twitter by Elon Musk</a>,
this was the perfect opportunity for me to make my exit.</p>

<p>It has now been a few months since I downloaded all my data and deleted my
account, and in retrospect it was so, so worth it. I feel more optimistic about
the world and human society in general. I interact with people around me with fewer preconceived
judgements about their belief systems. And finally, I have more nuanced conversations with my wife
and friends about politics and other hot topics, without worrying that I'm going to come off
as dog whistling or naysaying.</p>

<p>One thing I do miss is having the ability to correct misinformation that I stumble upon in
the wild. For example, somebody linked me to <a href="https://twitter.com/ThePrimeagen/status/1681343593208856576">this tweet</a>:</p>

<blockquote>
I cannot stop thinking about this.  Andrew created Zig and ~15 years ago asked on SO how to open a file in C++
</blockquote>

<p>(screenshot of this <a href="https://stackoverflow.com/questions/7880/how-do-you-open-a-file-in-c">StackOverflow question</a>)</p>

<blockquote>
if you take this in any other way than this is incredible and motivating that you can accomplish anything with time + effort then you nuts/sad
</blockquote>

<p>It's a sweet sentiment! Unfortunately it's based on a bit of a misunderstanding.
If you look at the date the question was asked, it was August 11th, 2008.
<a href="https://en.wikipedia.org/wiki/Stack_Overflow">According to Wikipedia</a>, StackOverflow launched on September 15th, 2008. How is this possible?</p>

<p>I'll tell you - if you look at
<a href="https://en.wikipedia.org/wiki/Stack_Overflow#cite_note-launches-1">the citation</a>,
it's a link to
<a href="https://www.joelonsoftware.com/2008/09/15/stack-overflow-launches/">the launch announcement</a> on <a href="https://www.joelonsoftware.com">Joel Spolsky's blog</a> -
a blog that I was a huge fan of when I was 20 years old. The fact is that Stack Overflow was actually announced before that, on a podcast episode with
<a href="https://blog.codinghorror.com/">Jeff Atwood</a>.</p>

<p>Being a bored intern at Lockheed Martin, because they failed to find me anything to do besides "learn XML", I enthusiastically participated in the launch of Stack Overflow, including asking basic questions such as "How do you open a file in C++?" in order to farm karma points. And it definitely worked! I have edit powers on that website now, and you can bet your ass I use them to clean up
misleading suggestions and stinky opinions whenever I see them.</p>

<p>Now, I don't want to poop on ThePrimeagen's parade, so I can find him some replacement lore.
Here you go, enjoy!
<a href="https://programmersheaven.com/discussion/142689/binary-files">Binary files - Programmers Heaven</a></p>

<blockquote>
 I don't have a clue about binary files. I'm a good programmer, have read more programming books than I can count, but still have no knowlege about binary files. Can you read them? Will they look like text? Is it good to use them? How can I print to a binary file, and then retreive from a binary file?
</blockquote>

<p>I was 14 years old when I wrote <em>this</em> cringefest. I think ThePrimeagen's take can survive transplanted from the Stack Overflow question to this forum post.</p>

<h2>Reddit</h2>

<p>Unlike Twitter, where I find the elemental unit of socializing problematic in and of itself,
I think the Reddit formula is solid gold. This one is a case of malevolent platform stewards.
I expect this from every platform run by for-profit entities, where the goal is not to serve
the users, but to milk them for capital gains. The grim reaper of capitalism has come
for Reddit.</p>

<p>Right now, venture capitalists are freaking out about "AI". Publicly-traded
businesses with access to structured user speech that can be used as
training data are rushing to build a moat around that data. That's why Twitter
prevented viewing tweets from logged out users, and Reddit locked down its APIs
despite it causing a massive protest from the moderators who do the actual
labor to keep the website running.</p>

<p>Detecting that Reddit enshittification was reaching terminal levels, I decided that during
these protests was the ideal time to close the /r/zig subreddit, which I was
a moderator of at the time. I closed the subreddit, downloaded all my
user data, and permanently deleted my reddit account, leaving moderation to
<a href="https://kristoff.it/">Loris Cro</a>, VP of Community of
<a href="https://ziglang.org/zsf/">Zig Software Foundation</a>.</p>

<p>Once Loris was the sole moderator, he decided to change the subreddit to read-only mode.
However, this was thwarted today when a troll (check their comment history)
<a href="https://old.reddit.com/r/redditrequest/comments/15rtit7/requesting_rzig/">performed a hostile takeover in broad daylight</a> and
<a href="https://old.reddit.com/r/Zig/comments/15yoxok/rzig_is_reopened/">reopened the sub</a>.
It appears that Reddit admins are perfectly willing to forcibly replace moderators
in an effort to keep the system churning out user data. Those Large Language Models are hungry!</p>

<p>Anyway, I don't consider this to be a problem; it is outside of my control what Reddit does
with its own data. I just want to make it crystal clear that
<strong>Zig, Zig Software Foundation, and myself are not in any way affiliated
with /r/zig</strong>. That subreddit is now run by a third party and all signs
point to a high likelihood of problematic behavior happening there
which I definitely want to avoid being associated with.</p>

<p>Update (2023-08-30): As of today, the moderator of /r/zig is now Jens
Goldberg, a well-regarded member of the Zig community.</p>

<p>If you want an alternative to a zig subreddit, give <a href="https://ziggit.dev/">Ziggit</a>
a try. This is the <a href="https://www.discourse.org/">Discourse forum
software</a> run by dude_the_builder,
a friendly Zig community member, and I find it to be a healthy and rewarding way of socializing
with other community members.</p>

<h2>Moving Forward</h2>

<p>I have a Mastodon account, but I don't love it, for the same reasons I
didn't like Twitter. In addition, I think that way of consuming content
is generally like watching mainstream TV or listening to radio with ads. You're
letting a bunch of people who aren't really that important to you, or qualified
to do the job, be the content curators for you.</p>

<p>Discord has been decent for a while. I suspect enshittification will commence soon, so we
should be on the lookout for an alternative over the next five years.</p>

<p>I'm going to look into setting up an RSS reader for myself and start hunting for
high quality blogs.</p>

<p>And finally, I will be redirecting my micro-blogging energy that was previously wasted on
Twitter into actual-blogging energy here.</p>
]]></description>
      </item>
      <item>
         <title>`zig cc`: a Powerful Drop-In Replacement for GCC/Clang</title>
         <pubDate>Tue, 24 Mar 2020 14:39:47 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html</link>
         <guid>https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html</guid>
         <description><![CDATA[<h1>`zig cc`: a Powerful Drop-In Replacement for GCC/Clang</h1>
<p>
If you have heard of <a href="https://ziglang.org/">Zig</a> before, you may know it as
a promising new programming language which is ambitiously trying to overthrow C as the
de-facto systems language. But did you know that it also can straight up compile C code?
</p>
<p>
This has been possible for a while, and you can see some
<a href="https://ziglang.org/#Zig-is-also-a-C-compiler">examples of this on the home page</a>.
What's new is that the <code>zig cc</code> sub-command is available, and it supports
the same options as <a href="https://clang.llvm.org/">Clang</a>, which, in turn, supports
the same options as <a href="https://gcc.gnu.org/">GCC</a>.
</p>
<p>
Now, I'm sure you're feeling pretty skeptical right about now, so let me hook you real
quick before I get into the juicy details.
</p>
<h2>Clang and GCC cannot do this:</h2>
<pre>
<strong>andy@ark ~/tmp&gt; cat hello.c</strong>
#include &lt;stdio.h&gt;

int main(int argc, char **argv) {
    fprintf(stderr, "Hello, World!\n");
    return 0;
}
<strong>andy@ark ~/tmp&gt; clang -o hello.exe hello.c -target x86_64-windows-gnu</strong>
clang-7: warning: argument unused during compilation: '--gcc-toolchain=/nix/store/ificps9si1nvz85f9xa7gjd9h6r5lzg6-gcc-9.2.0' [-Wunused-command-line-argument]
/nix/store/7bhi29ainf5rjrk7k7wyhndyskzyhsxh-binutils-2.31.1/bin/ld: unrecognised emulation mode: i386pep
Supported emulations: elf_x86_64 elf32_x86_64 elf_i386 elf_iamcu elf_l1om elf_k1om
clang-7: <span style="color:#cc0000;font-weight:bold;">error</span>: linker command failed with exit code 1 (use -v to see invocation)
<strong>andy@ark ~/tmp&gt; clang -o hello hello.c -target mipsel-linux-musl</strong>
In file included from hello.c:1:
In file included from /nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/stdio.h:27:
In file included from /nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/bits/libc-header-start.h:33:
In file included from /nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/features.h:452:
/nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/gnu/stubs.h:7:11: <span style="color:#cc0000;font-weight:bold;">fatal error</span>: 
      'gnu/stubs-32.h' file not found
# include &lt;gnu/stubs-32.h&gt;
          ^~~~~~~~~~~~~~~~
1 error generated.
<strong>andy@ark ~/tmp&gt; clang -o hello hello.c -target aarch64-linux-gnu</strong>
In file included from hello.c:1:
In file included from /nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/stdio.h:27:
In file included from /nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/bits/libc-header-start.h:33:
In file included from /nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/features.h:452:
/nix/store/8pp3i3hcp7bv0f8jllzqq7gcp9dbzvp9-glibc-2.27-dev/include/gnu/stubs.h:7:11: <span style="color:#cc0000;font-weight:bold;">fatal error</span>: 
      'gnu/stubs-32.h' file not found
# include &lt;gnu/stubs-32.h&gt;
          ^~~~~~~~~~~~~~~~
1 error generated.
</pre>

<h2>`zig cc` can:</h2>
<pre>
<strong>andy@ark ~/tmp&gt; zig cc -o hello.exe hello.c -target x86_64-windows-gnu</strong>
<strong>andy@ark ~/tmp&gt; wine64 hello.exe</strong>
Hello, World!
<strong>andy@ark ~/tmp&gt; zig cc -o hello hello.c -target mipsel-linux-musl</strong>
<strong>andy@ark ~/tmp&gt; qemu-mipsel ./hello</strong>
Hello, World!
<strong>andy@ark ~/tmp&gt; zig cc -o hello hello.c -target aarch64-linux-gnu</strong>
<strong>andy@ark ~/tmp&gt; qemu-aarch64 -L ~/Downloads/glibc/multi-2.31/install/glibcs/aarch64-linux-gnu ./hello</strong>
Hello, World!
</pre>
<h2>Features of `zig cc`</h2>
<p>
<code>zig cc</code> is <em>not the main purpose of the Zig project</em>. It merely
exposes the already-existing capabilities of the Zig compiler via a small frontend layer
that parses C compiler options.
</p>
<h3>Install simply by unzipping a tarball</h3>
<p>
Zig is an open source project, and of course can be
<a href="https://github.com/ziglang/zig/#building-from-source">built and installed from source the usual way</a>. However, the Zig project also has tarballs available on
<a href="https://ziglang.org/download/">the download page</a>.
You can download a 45 MiB tarball, unpack it, and you're done.
You can even have multiple versions at the same time, no problem.
</p>
<p>
Here, rather than downloading the x86_64-linux version, which matches the computer I am
currently using, I'll download the Windows version and run it in
<a href="https://www.winehq.org/">Wine</a> to show how simple installation is:
</p>
<pre>
<strong>andy@ark ~/tmp&gt; wget --quiet https://ziglang.org/builds/zig-windows-x86_64-0.5.0+13d04f996.zip</strong>
<strong>andy@ark ~/tmp&gt; unzip -q zig-windows-x86_64-0.5.0+13d04f996.zip </strong>
<strong>andy@ark ~/tmp&gt; wine64 ./zig-windows-x86_64-0.5.0+13d04f996/zig.exe cc -o hello hello.c -target x86_64-linux</strong>
<strong>andy@ark ~/tmp&gt; ./hello</strong>
Hello, World!
</pre>
<p>
Take a moment to appreciate what just happened here - I downloaded a Windows build of Zig,
ran it in Wine, using it to cross compile for Linux, and then ran the binary natively.
Computers are fun!
</p>
<p>
Compare this to
<a href="https://github.com/llvm/llvm-project/releases/tag/llvmorg-9.0.1">downloading Clang</a>,
which has 380 MiB Linux-distribution-specific tarballs. Zig's Linux tarballs are fully statically
linked, and therefore work correctly on all Linux distributions. The size difference
arises because the Clang tarball ships with more utilities than a C compiler, as well as
pre-compiled static libraries for both LLVM and Clang. Zig does not ship with any pre-compiled
libraries; instead it ships with source code, and builds what it needs on-the-fly.
</p>
<h3 id="caching-system">Caching System</h3>
<p>
The Zig compiler uses a sophisticated caching system to avoid needlessly rebuilding
artifacts. I carefully designed this caching system to
make optimal use of the file system while maintaining correct semantics - which is
<a href="https://apenwarr.ca/log/20181113">trickier than you might think</a>!
</p>
<p>
The caching system uses a combination of hashing inputs and checking the fstat values
of file paths, while being mindful of mtime granularity. This makes it avoid
needlessly hashing files, while at the same time detecting when a modified file has
the same contents. It always has correct behavior, whether the file system has nanosecond
mtime granularity, second granularity, always sets mtime to zero, or anything in between.
</p>
<p>
You can find a
<a href="https://ziglang.org/download/0.4.0/release-notes.html#Build-Artifact-Caching">detailed description of the caching system in the 0.4.0 release notes</a>.
</p>
<p>
<code>zig cc</code> makes this caching system available when compiling C code. For simple
enough projects, this obviates the need for a Makefile or other build system.
</p>
<pre>
<strong>andy@ark ~/tmp&gt; cat foo.c</strong>
#include &lt;stdio.h&gt;

#include "another_file.c"

int main(int argc, char **argv) {
#include "printf_many_times.c"
}
<strong>andy@ark ~/tmp&gt; cat another_file.c </strong>
void another(void) {}
<strong>andy@ark ~/tmp&gt; time zig cc -c foo.c</strong>
0.12
<strong>andy@ark ~/tmp&gt; time zig cc -c foo.c</strong>
0.01
<strong>andy@ark ~/tmp&gt; touch another_file.c </strong>
<strong>andy@ark ~/tmp&gt; time zig cc -c foo.c</strong>
0.01
<strong>andy@ark ~/tmp&gt; echo "/* add a comment */" &gt;&gt;another_file.c</strong>
<strong>andy@ark ~/tmp&gt; time zig cc -c foo.c</strong>
0.12
<strong>andy@ark ~/tmp&gt; time zig cc -c foo.c</strong>
0.01
</pre>
<p>
Here you can see the caching system is smart enough to find dependencies that are
included with the preprocessor, and smart enough to avoid a full rebuild when the
mtime of another_file.c was updated.
</p>
<p>
One last thing before I move on. I want to point out that this caching system is not
some fluffy bloated feature - rather it is an absolutely critical component to making
cross-compiling work in a usable manner. As we'll see below, other compilers ship with 
pre-compiled, target-specific binaries, while Zig ships with <em>source code only</em>
and cross-compiles on-the-fly, caching the result.
</p>

<h3>Cross Compiling</h3>
<p>I have carefully designed Zig since the very beginning to treat cross compilation
as a first-class use case. Now that the <code>zig cc</code> frontend is available,
it brings these capabilities to C code.
</p>
<p>
Above, I showed you how to cross-compile some simple "Hello, World!" programs. But now let's
try a real-world C project.
</p>
<p>
Let's try <a href="https://luajit.org/">LuaJIT</a>!
</p>
<pre>
[~/Downloads]$ <strong>git clone https://github.com/LuaJIT/LuaJIT</strong>
[~/Downloads]$ <strong>cd LuaJIT</strong>
[~/Downloads/LuaJIT]$ <strong>ls</strong>
COPYRIGHT  doc  dynasm  etc  Makefile  README  src
</pre>
<p>
OK, so it uses standard Makefiles. Here we go: first let's make sure it works natively
with <code>zig cc</code>.
</p>
<pre>
[~/Downloads/LuaJIT]$ <strong>export CC="zig cc"</strong>
[~/Downloads/LuaJIT]$ <strong>make CC="$CC"</strong>
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory '/home/andy/Downloads/LuaJIT/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
ASM       lj_vm.o
CC        lj_gc.o
BUILDVM   lj_ffdef.h
CC        lj_err.o
CC        lj_char.o
BUILDVM   lj_bcdef.h
CC        lj_bc.o
CC        lj_obj.o
CC        lj_buf.o
CC        lj_str.o
CC        lj_tab.o
CC        lj_func.o
CC        lj_udata.o
CC        lj_meta.o
CC        lj_debug.o
CC        lj_state.o
CC        lj_dispatch.o
CC        lj_vmevent.o
CC        lj_vmmath.o
CC        lj_strscan.o
CC        lj_strfmt.o
CC        lj_strfmt_num.o
CC        lj_api.o
CC        lj_profile.o
CC        lj_lex.o
CC        lj_parse.o
CC        lj_bcread.o
CC        lj_bcwrite.o
CC        lj_load.o
CC        lj_ir.o
CC        lj_opt_mem.o
BUILDVM   lj_folddef.h
CC        lj_opt_fold.o
CC        lj_opt_narrow.o
CC        lj_opt_dce.o
CC        lj_opt_loop.o
CC        lj_opt_split.o
CC        lj_opt_sink.o
CC        lj_mcode.o
CC        lj_snap.o
CC        lj_record.o
CC        lj_crecord.o
BUILDVM   lj_recdef.h
CC        lj_ffrecord.o
CC        lj_asm.o
CC        lj_trace.o
CC        lj_gdbjit.o
CC        lj_ctype.o
CC        lj_cdata.o
CC        lj_cconv.o
CC        lj_ccall.o
CC        lj_ccallback.o
CC        lj_carith.o
CC        lj_clib.o
CC        lj_cparse.o
CC        lj_lib.o
CC        lj_alloc.o
CC        lib_aux.o
BUILDVM   lj_libdef.h
CC        lib_base.o
CC        lib_math.o
CC        lib_bit.o
CC        lib_string.o
CC        lib_table.o
CC        lib_io.o
CC        lib_os.o
CC        lib_package.o
CC        lib_debug.o
CC        lib_jit.o
CC        lib_ffi.o
CC        lib_init.o
AR        libluajit.a
CC        luajit.o
BUILDVM   jit/vmdef.lua
DYNLINK   libluajit.so
LINK      luajit
warning: unsupported linker arg: -E
OK        Successfully built LuaJIT
make[1]: Leaving directory '/home/andy/Downloads/LuaJIT/src'
==== Successfully built LuaJIT 2.1.0-beta3 ====

[~/Downloads/LuaJIT]$ <strong>ls</strong>
COPYRIGHT  doc  dynasm  etc  Makefile  README  src

[~/Downloads/LuaJIT]$ <strong>ls ./src/</strong>
host/         jit/          libluajit.so  luajit        zig-cache/    

[~/Downloads/LuaJIT]$ <strong>./src/luajit </strong>
LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2020 Mike Pall. http://luajit.org/
JIT: ON SSE2 SSE3 SSE4.1 BMI2 fold cse dce fwd dse narrow loop abc sink fuse
&gt; <strong>print(3 + 4)</strong>
7
&gt; 
</pre>
<p>
OK so that worked. Now for the real test - can we make it cross compile?
</p>
<pre>
[~/Downloads/LuaJIT]$ <strong>git clean -xfdq</strong>
[~/Downloads/LuaJIT]$ <strong>export CC="zig cc -target aarch64-linux-gnu"</strong>
[~/Downloads/LuaJIT]$ <strong>export HOST_CC="zig cc"</strong>
[~/Downloads/LuaJIT]$ <strong>make CC="$CC" HOST_CC="$HOST_CC" TARGET_STRIP="echo"</strong>
==== Building LuaJIT 2.1.0-beta3 ====
make -C src
make[1]: Entering directory '/home/andy/Downloads/LuaJIT/src'
HOSTCC    host/minilua.o
HOSTLINK  host/minilua
DYNASM    host/buildvm_arch.h
HOSTCC    host/buildvm.o
HOSTCC    host/buildvm_asm.o
HOSTCC    host/buildvm_peobj.o
HOSTCC    host/buildvm_lib.o
HOSTCC    host/buildvm_fold.o
HOSTLINK  host/buildvm
BUILDVM   lj_vm.S
ASM       lj_vm.o
CC        lj_gc.o
BUILDVM   lj_ffdef.h
CC        lj_err.o
CC        lj_char.o
BUILDVM   lj_bcdef.h
CC        lj_bc.o
CC        lj_obj.o
CC        lj_buf.o
CC        lj_str.o
CC        lj_tab.o
CC        lj_func.o
CC        lj_udata.o
CC        lj_meta.o
CC        lj_debug.o
CC        lj_state.o
CC        lj_dispatch.o
CC        lj_vmevent.o
CC        lj_vmmath.o
CC        lj_strscan.o
CC        lj_strfmt.o
CC        lj_strfmt_num.o
CC        lj_api.o
CC        lj_profile.o
CC        lj_lex.o
CC        lj_parse.o
CC        lj_bcread.o
CC        lj_bcwrite.o
CC        lj_load.o
CC        lj_ir.o
CC        lj_opt_mem.o
BUILDVM   lj_folddef.h
CC        lj_opt_fold.o
CC        lj_opt_narrow.o
CC        lj_opt_dce.o
CC        lj_opt_loop.o
CC        lj_opt_split.o
CC        lj_opt_sink.o
CC        lj_mcode.o
CC        lj_snap.o
CC        lj_record.o
CC        lj_crecord.o
BUILDVM   lj_recdef.h
CC        lj_ffrecord.o
CC        lj_asm.o
CC        lj_trace.o
CC        lj_gdbjit.o
CC        lj_ctype.o
CC        lj_cdata.o
CC        lj_cconv.o
CC        lj_ccall.o
CC        lj_ccallback.o
CC        lj_carith.o
CC        lj_clib.o
CC        lj_cparse.o
CC        lj_lib.o
CC        lj_alloc.o
CC        lib_aux.o
BUILDVM   lj_libdef.h
CC        lib_base.o
CC        lib_math.o
CC        lib_bit.o
CC        lib_string.o
CC        lib_table.o
CC        lib_io.o
CC        lib_os.o
CC        lib_package.o
CC        lib_debug.o
CC        lib_jit.o
CC        lib_ffi.o
CC        lib_init.o
AR        libluajit.a
CC        luajit.o
BUILDVM   jit/vmdef.lua
DYNLINK   libluajit.so
libluajit.so
LINK      luajit
warning: unsupported linker arg: -E
luajit
OK        Successfully built LuaJIT
make[1]: Leaving directory '/home/andy/Downloads/LuaJIT/src'
==== Successfully built LuaJIT 2.1.0-beta3 ====

[~/Downloads/LuaJIT]$ <strong>file ./src/luajit </strong>
./src/luajit: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 2.0.0, with debug_info, not stripped
</pre>
<p>
It worked! Will it run in <a href="https://www.qemu.org/">QEMU</a> though?
</p>
<pre>
[~/Downloads/LuaJIT]$ <strong>qemu-aarch64 -L ~/Downloads/glibc/multi-2.31/install/glibcs/aarch64-linux-gnu ./src/luajit</strong>
LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2020 Mike Pall. http://luajit.org/
JIT: ON fold cse dce fwd dse narrow loop abc sink fuse
&gt; <strong>print(4 + 3)</strong>
7
&gt; 
</pre>
<p>
Amazing. QEMU never fails to impress me.
</p>
<p>
Before we move on, I want to show one more thing. You can see above, in order to run the
foreign-architecture binary, I had to pass
<code>-L ~/Downloads/glibc/multi-2.31/install/glibcs/aarch64-linux-gnu</code>. This is
due to the binary being dynamically linked. You can confirm this with the output from
<code>file</code> above, where it says: <code>dynamically linked, interpreter /lib/ld-linux-aarch64.so.1</code>.
</p>
<p>
Often, when cross-compiling, it is useful to make a <em>static</em> binary.
In the case of Linux, for example, this will make the resulting binary able to run on
any Linux distribution, rather than only ones with a hard-coded glibc dynamic linker path
of <code>/lib/ld-linux-aarch64.so.1</code>.
</p>
<p>
We can accomplish this by targeting musl rather than glibc:
</p>
<pre>
[~/Downloads/LuaJIT]$ git clean -qxfd
[~/Downloads/LuaJIT]$ export CC="zig cc -target aarch64-linux-musl"
[~/Downloads/LuaJIT]$ make CC="$CC" CXX="$CXX" HOST_CC="$HOST_CC" TARGET_STRIP="echo"
==== Building LuaJIT 2.1.0-beta3 ====
(same output)
==== Successfully built LuaJIT 2.1.0-beta3 ====
[~/Downloads/LuaJIT]$ file src/luajit
src/luajit: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, not stripped
[~/Downloads/LuaJIT]$ qemu-aarch64 ./src/luajit
LuaJIT 2.1.0-beta3 -- Copyright (C) 2005-2020 Mike Pall. http://luajit.org/
JIT: ON fold cse dce fwd dse narrow loop abc sink fuse
&gt; print(11 + 22)
33
</pre>
<p>
Here you can see the <code>file</code> command reported <em>statically linked</em>,
and in the qemu command, the <code>-L</code> parameter was not needed.
</p>

<h2>Use Cases of <code>zig cc</code></h2>
<p>
Alright, so I've given you a taste of what <code>zig cc</code> can do, but now I will
list explicitly what I consider to be the use cases:
</p>
<h3>Experimentation</h3>
<p>
Sometimes you just want a tool that you can use to try out different things. It can quickly
answer questions such as "What assembly does this code generate on MIPS vs ARM?". The wildly
popular <a href="https://godbolt.org/">Compiler Explorer</a> serves this purpose.
</p>
<p>
<code>zig cc</code> provides a lightweight tool which can also answer questions such as,
"What happens if I swap out glibc for <a href="https://musl.libc.org/">musl</a>?" and
"How big is this executable when cross-compiled for Windows?".
<a href="https://twitter.com/andy_kelley/status/1242183564512366595">Here's me using Zig to
quickly find out what the maximum UDP packet size is on Linux</a>.
</p>
<p>
Since Zig is so easy to install - and it actually works everywhere without patches,
even on Linux distributions such as <a href="https://nixos.org/">NixOS</a> -
it can often be a more convenient tool for running quick C test programs on your computer.
</p>
<p>
As I write this, LLVM 10 was released just two hours ago.
It will take days or weeks for it to become available in various system package managers.
But you can already
<a href="https://ziglang.org/download/">download a master branch build of Zig</a>
and play with the new features of Clang/LLVM 10. For example, improved RISC-V support!
</p>
<pre>
andy@ark ~/tmp&gt; <strong>zig cc -o hello hello.c -target riscv64-linux-musl</strong>
andy@ark ~/tmp&gt; <strong>qemu-riscv64 ./hello</strong>
Hello, World!
</pre>

<h3>Bundling a C compiler as part of a larger project</h3>
<p>
With Zig tarballs weighing in at under 45 MiB, zero system dependencies, no configuration,
and an MIT license, Zig makes for an ideal candidate when you need to bundle a C compiler
with another project.
</p>
<p>
For example, maybe you have
<a href="https://www.acton-lang.org/">a programming language that compiles to C</a>.
Zig is an obvious choice for what C compiler to ship with your language.
</p>
<p>
Or maybe you want to make a batteries-included
<a href="https://en.wikipedia.org/wiki/Integrated_development_environment">IDE</a>
that ships with a compiler.
</p>

<h3>Lightweight alternative to a cross compilation environment</h3>
<p>
If you're trying to build something with a large dependency tree, you'll probably want to
use a full cross compilation environment, such as <a href="https://mxe.cc/">mxe.cc</a>
or <a href="http://musl.cc/">musl.cc</a>.
</p>
<p>
But if you don't need such a sledgehammer, <code>zig cc</code> can be a useful alternative,
especially if your goal is to compile for N different targets. Consider that musl.cc lists a different
tarball for each architecture, each weighing in at roughly 85 MiB. Meanwhile, Zig weighs in at 45 MiB
and supports all those architectures, plus glibc and Windows.
</p>

<h3>An alternative to installing MSVC on Windows</h3>
<p>
You could spend days - literally! - waiting for Microsoft Visual Studio to install,
or you could install Zig and
<a href="https://code.visualstudio.com/">VS Code</a> in a matter of minutes.
</p>

<h2>Under the Hood</h2>
<p>If <code>zig cc</code> is built on top of Clang, why doesn't Clang just do this?
What exactly is Zig doing on top of Clang to make this work?</p>
<p>
The answer is, <em>a lot</em>, actually. I'll go over how it works here.
</p>
<h3>compiler-rt</h3>
<p>
compiler-rt is a library that provides "polyfill" implementations of language-supported features
when the target does not have machine code instructions for them. For example, compiler-rt provides
the function <code>__muldi3</code> to perform signed 64-bit integer multiplication on architectures
that lack a 64-bit wide integer multiplication instruction.
</p>
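To illustrate how such a polyfill works, here is a sketch (not the actual compiler-rt source) that builds a 64-bit product out of 32&times;32&rarr;64 partial products, which is all a 32-bit target can do natively; the high&times;high term drops out entirely because it lands past bit 63:

```c
#include <stdint.h>

/* Sketch of a __muldi3-style polyfill: a 64-bit multiply composed from
   32x32->64 widening multiplies, the only kind a 32-bit CPU provides.
   (Here the host compiler emits the widening multiplies for us.) */
static int64_t mul64(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint32_t alo = (uint32_t)ua, ahi = (uint32_t)(ua >> 32);
    uint32_t blo = (uint32_t)ub, bhi = (uint32_t)(ub >> 32);
    /* (ahi*2^32 + alo) * (bhi*2^32 + blo) mod 2^64: the ahi*bhi term
       contributes only at bit 64 and above, so it vanishes. */
    uint64_t result = (uint64_t)alo * blo;
    result += ((uint64_t)alo * bhi + (uint64_t)ahi * blo) << 32;
    return (int64_t)result;
}
```

Two's complement makes the same bit pattern correct for signed and unsigned operands, which is why one routine can serve both.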
<p>
In the GNU world, the equivalent of compiler-rt is named <strong>libgcc</strong>.
</p>
<p>
Most C compilers ship with this library pre-built for the target.
For example, on an Ubuntu (Bionic) system, with the <code>build-essential</code> package installed,
you can find this at <code>/lib/x86_64-linux-gnu/libgcc_s.so.1</code>.
</p>
<p>
If you download <a href="https://github.com/llvm/llvm-project/releases/download/llvmorg-9.0.1/clang+llvm-9.0.1-x86_64-linux-gnu-ubuntu-16.04.tar.xz">clang+llvm-9.0.1-x86_64-linux-gnu-ubuntu-16.04.tar.xz</a> and take a look
around, clang actually does not even ship with compiler-rt. Instead, it relies on the system libgcc
noted above. This is one reason that this tarball is Ubuntu-specific and does not work on other
Linux distributions, 
<a href="https://www.leidinger.net/blog/2010/09/28/the-freebsd-linuxulator-explained-for-users/">FreeBSD's Linuxulator</a>,
or <a href="https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux">WSL</a>,
which have system files in different locations.
</p>
<p>
Zig's strategy with compiler-rt is that we have
<a href="https://github.com/ziglang/zig/blob/0.5.0/lib/std/special/compiler_rt.zig">our own implementation of this library</a>,
written in Zig. Most of it is ported from
<a href="https://github.com/llvm/llvm-project/tree/llvmorg-10.0.0-rc6/compiler-rt">LLVM's compiler-rt project</a>,
but we also have some of our own improvements on top of this.
</p>
<p>
Anyway, rather than depending on a system compiler-rt being installed, or shipping a pre-compiled
library, Zig ships its compiler-rt <em>in source form</em>, lazily builds it
for the compilation target, and then caches the result using
<a href="#caching-system">the caching system discussed above</a>.
</p>
<p>
Zig's compiler-rt is <a href="https://github.com/ziglang/zig/issues/1290">not yet complete</a>.
However, completing it is a prerequisite for releasing Zig version 1.0.0.
</p>

<h3>libc</h3>
<p>
When C code calls <code>printf</code>, <code>printf</code> has to be implemented <em>somewhere</em>,
and that somewhere is libc.
</p>
<p>
Some operating systems, such as <a href="https://www.freebsd.org/">FreeBSD</a> and macOS, have a
designated system libc, and it doubles as the stable kernel syscall interface. On others, such as
Windows and Linux, libc is optional, and so there are multiple options for which
libc to use, if any.
</p>
<p>
As of the time of this writing, Zig can provide libcs for the following targets:
</p>
<pre>
andy@ark ~&gt; zig targets | jq .libc
[
  "aarch64_be-linux-gnu",
  "aarch64_be-linux-musl",
  "aarch64_be-windows-gnu",
  "aarch64-linux-gnu",
  "aarch64-linux-musl",
  "aarch64-windows-gnu",
  "armeb-linux-gnueabi",
  "armeb-linux-gnueabihf",
  "armeb-linux-musleabi",
  "armeb-linux-musleabihf",
  "armeb-windows-gnu",
  "arm-linux-gnueabi",
  "arm-linux-gnueabihf",
  "arm-linux-musleabi",
  "arm-linux-musleabihf",
  "arm-windows-gnu",
  "i386-linux-gnu",
  "i386-linux-musl",
  "i386-windows-gnu",
  "mips64el-linux-gnuabi64",
  "mips64el-linux-gnuabin32",
  "mips64el-linux-musl",
  "mips64-linux-gnuabi64",
  "mips64-linux-gnuabin32",
  "mips64-linux-musl",
  "mipsel-linux-gnu",
  "mipsel-linux-musl",
  "mips-linux-gnu",
  "mips-linux-musl",
  "powerpc64le-linux-gnu",
  "powerpc64le-linux-musl",
  "powerpc64-linux-gnu",
  "powerpc64-linux-musl",
  "powerpc-linux-gnu",
  "powerpc-linux-musl",
  "riscv64-linux-gnu",
  "riscv64-linux-musl",
  "s390x-linux-gnu",
  "s390x-linux-musl",
  "sparc-linux-gnu",
  "sparcv9-linux-gnu",
  "wasm32-freestanding-musl",
  "x86_64-linux-gnu",
  "x86_64-linux-gnux32",
  "x86_64-linux-musl",
  "x86_64-windows-gnu"
]
</pre>
<p>
In order to provide libc on these targets, Zig ships with a subset of the source files
for these projects:
</p>
<ul>
  <li>musl v1.2.0</li>
  <li><a href="https://mingw-w64.org/">mingw-w64</a> v7.0.0</li>
  <li>glibc 2.31</li>
</ul>
<p>
For each libc, there is a
<a href="https://github.com/ziglang/zig/wiki/Updating-libc">process for upgrading to a new release</a>.
This process is a sort of pre-processing step. We still end up with source files, but we
de-duplicate non-multi-arch source files into multi-arch source files.
</p>

<h4>glibc</h4>
<p>
glibc is the most involved. The first step is building glibc for every target that it supports,
which takes upwards of 24 hours and 74 GiB of disk space.
</p>
<p>
From here, the
<a href="https://github.com/ziglang/zig/blob/dc44fe053c609f389e375f6857f96b6bb3794897/tools/process_headers.zig">process_headers tool</a>
inspects all the header files from all the targets, and identifies which files are the same across
all targets, and which header files are target-specific. They are then sorted into the
corresponding directories in Zig's source tree, in:
</p>
<ul>
  <li>lib/libc/include/generic-glibc/</li>
  <li>lib/libc/include/$ARCH-linux-$ABI/ (there are multiple of these directories)</li>
</ul>
<p>
Additionally, Linux header files are not included in glibc, and so the same process is applied to
them, producing the directories:
</p>
<ul>
  <li>lib/libc/include/any-linux-any/</li>
  <li>lib/libc/include/$ARCH-linux-any/</li>
</ul>
<p>
That takes care of the header files, but now we have the problem of dynamic linking against
glibc, without touching any system files.
</p>
<p>
For this, we have the
<a href="https://github.com/ziglang/zig/blob/dc44fe053c609f389e375f6857f96b6bb3794897/tools/update_glibc.zig">update_glibc tool</a>.
Given the path to the glibc source directory, it finds all the <code>.abilist</code> text files
and uses them to produce 3 simple but crucial files:
</p>
<ul>
  <li><a href="https://github.com/ziglang/zig/blob/master/lib/libc/glibc/vers.txt">vers.txt</a>
    - the list of all glibc versions.
  </li>
  <li><a href="https://github.com/ziglang/zig/blob/master/lib/libc/glibc/fns.txt">fns.txt</a>
    - the list of all symbols that glibc provides, followed by the library it appears in
    (for example libm, libpthread, libc, librt).
  </li>
  <li><a href="https://github.com/ziglang/zig/blob/master/lib/libc/glibc/abi.txt">abi.txt</a>
    - for each target, for each function, tells which versions of glibc, if any, it appears in.
  </li>
</ul>
<p>
Together, these files amount to only 192 KB (27 KB gzipped), and they allow Zig to target any
version of glibc.
</p>
<p>
Yes, I did not make a typo there. Zig can target any of the 42 versions of glibc for any of the
architectures listed above. I'll show you:
</p>
<pre>
andy@ark ~/tmp&gt; <strong>cat rand.zig </strong>
const std = @import("std");

pub fn main() anyerror!void {
    var buf: [10]u8 = undefined;
    _ = std.c.getrandom(&amp;buf, buf.len, 0);
    std.debug.warn("random bytes: {x}\n", .{buf});
}
andy@ark ~/tmp&gt; <strong>zig build-exe rand.zig -lc -target native-native-gnu.2.25</strong>
andy@ark ~/tmp&gt; <strong>./rand</strong>
random bytes: e2059382afb599ea6d29
andy@ark ~/tmp&gt; <strong>zig build-exe rand.zig -lc -target native-native-gnu.2.24</strong>
lld: error: undefined symbol: getrandom
&gt;&gt;&gt; referenced by rand.zig:5 (/home/andy/tmp/rand.zig:5)
&gt;&gt;&gt;               ./rand.o:(main.0)
</pre>
<p>
Sure enough, if you look at the
<a href="http://man7.org/linux/man-pages/man2/getrandom.2.html">man page for getrandom</a>,
it says:
</p>
<blockquote>
Support was added to glibc in version 2.25.
</blockquote>
<p>
When no explicit glibc version is requested, and the target OS is the native (host) OS,
Zig detects the native glibc version by inspecting the Zig executable's own dynamically
linked libraries, looking for glibc, and checking the version. It turns out you can look for
<code>libc.so.6</code>, call <code>readlink</code> on it, and the result will look something
like <code>libc-2.27.so</code>. When this strategy does not work, Zig applies the same check to
<code>/usr/bin/env</code>. Since this file path is hard-coded
into countless shebang lines, it's a pretty safe bet for finding out the dynamic linker path and
glibc version (if any) of the native system!
</p>
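The filename-parsing step can be sketched like this (a hypothetical helper, not Zig's actual detection code), given the <code>readlink</code> result for <code>libc.so.6</code>:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Parse "libc-2.27.so" style names (the readlink target of libc.so.6
   on many glibc systems) into a major/minor version pair. */
static bool parse_glibc_name(const char *name, int *major, int *minor) {
    int consumed = 0;
    /* %n records how many characters matched "libc-<major>.<minor>" */
    if (sscanf(name, "libc-%d.%d%n", major, minor, &consumed) != 2)
        return false;
    /* the remainder must be exactly ".so" */
    return strcmp(name + consumed, ".so") == 0;
}
```

Note the symlink name itself (<code>libc.so.6</code>) intentionally fails to parse; only the resolved target carries the version.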
<p>
<code>zig cc</code> currently does not provide a way to choose a specific glibc version
(because C compilers do not provide a way), and so Zig chooses the native version for
compiling natively, and the default (2.17) for cross-compiling.
However, I'm sure this problem can be solved, even when using <code>zig cc</code>. For example,
maybe it could support an environment variable, or simply introduce an extra command line
option that does not conflict with any Clang options.
</p>
<p>
When you request a certain version of glibc, Zig uses those text files noted above to
create dummy <code>.so</code> files to link against, which contain exactly the correct
set of symbols (with appropriate name mangling) based on the requested version.
The symbols will be resolved at runtime, by the dynamic linker on the target platform.
</p>
<p>
In this way, most of libc in the glibc case resides on the target file system. But not all of it!
There are still the "C runtime start files":
</p>
<ul>
  <li>Scrt1.o</li>
  <li>crti.o</li>
  <li>crtn.o</li>
</ul>
<p>
These are statically compiled into every binary that dynamically links glibc, and their
<a href="https://en.wikipedia.org/wiki/Application_binary_interface">ABI</a> is therefore
Very Very Stable.
</p>
<p>
And so, Zig bundles a small subset of glibc's source files needed to build these object
files from source for every target. The total size of this comes out to 1.4 MiB (252 KB gzipped).
I do think there is some room for improvement here, but I digress.
</p>
<p>
There are a couple of patches applied to this small subset of glibc source files, which simplify them
to avoid pulling in too many .h files, since the end result we need is a few bare-bones object
files, not all of glibc.
</p>
<p>
And finally, we certainly do not ship the build system of glibc with Zig! I manually inspected,
audited, and analyzed glibc's build system, and then by hand wrote code in the Zig
compiler which hooks into Zig's <a href="#caching-system">caching system</a> and performs a minimal
build of only these start files, as needed.
</p>

<h4>musl</h4>
<p>
The process for preparing musl to ship with Zig is much simpler by comparison.
</p>
<p>
It still involves building musl for every target architecture that it supports,
but in this case only the <code>install-headers</code> target has to be run,
and it takes less than a minute, even to do it for all targets.
</p>
<p>
The same
<a href="https://github.com/ziglang/zig/blob/dc44fe053c609f389e375f6857f96b6bb3794897/tools/process_headers.zig">process_headers tool</a>
used for the glibc headers is applied to the musl headers, producing:
</p>
<ul>
  <li>lib/libc/include/generic-musl/</li>
  <li>lib/libc/include/$ARCH-linux-$ABI/ (there are multiple of these directories)</li>
</ul>
<p>
Unlike glibc, musl supports building statically. Zig currently assumes a static libc
when musl is chosen, and does not support dynamically linking against musl, although
that could potentially be added in the future.
</p>
<p>
And so for musl, Zig actually bundles most - but still not all - of musl's source files.
Everything in <code>arch</code>, <code>crt</code>, <code>compat</code>, <code>src</code>, and <code>include</code> gets copied in.
</p>
<p>
Again, much like glibc, I carefully studied musl's build system, and then hand-coded logic
in the Zig compiler to build these source files. In musl's case it is simpler - just a bit
of logic based on the file extension, and on whether to override a generic file with an
architecture-specific one. The only file that needs to be patched (by hand) is
<code>version.h</code>, which is normally generated during the configure phase of musl's build
system.
</p>
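The override rule reduces to path arithmetic; here is a hypothetical sketch (the real logic also weighs <code>.s</code>/<code>.S</code> assembly variants against <code>.c</code> files):

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* musl layout: a generic src/<subsys>/<name>.c may be shadowed by an
   architecture-specific src/<subsys>/<arch>/<name>.c (or assembly).
   Compose the candidate override path for a generic source file. */
static bool arch_override_path(const char *generic, const char *arch,
                               char *out, size_t out_len) {
    const char *slash = strrchr(generic, '/');
    if (slash == NULL)
        return false; /* no directory component to insert into */
    /* insert "<arch>/" just before the file name */
    snprintf(out, out_len, "%.*s/%s/%s",
             (int)(slash - generic), generic, arch, slash + 1);
    return true;
}
```

If a file exists at the composed path, the build uses it instead of the generic one.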
<p>
I really appreciate Rich Felker's efforts to make musl simple to utilize in this way,
and he has been incredibly helpful in the <code>#musl</code> IRC channel when I ask
questions.
<a href="why-donating-to-musl-libc-project.html">I proudly sponsor Rich Felker for $150/month</a>.
</p>

<h4>mingw-w64</h4>
<p>
mingw-w64 was an absolute joy to support in Zig. The beautiful thing about this project is that
they have already been transitioning into having one set of header files that applies to all
architectures (using <code>#ifdefs</code> only where needed). One set of header files
is sufficient to support all four architectures: arm, aarch64, x86, and x86_64. 
</p>
<p>
So for updating headers, all we have to do is build mingw-w64, then:
</p>
<pre>
mv $INSTALLPREFIX/include $ZIGSRC/lib/libc/include/any-windows-any
</pre>
<p>
After doing this for all 3 libcs, the libc/include directory looks like this:
</p>
<pre>
aarch64_be-linux-any   i386-linux-musl           powerpc-linux-any
aarch64_be-linux-gnu   mips64el-linux-any        powerpc-linux-gnu
aarch64-linux-any      mips64el-linux-gnuabi64   powerpc-linux-musl
aarch64-linux-gnu      mips64el-linux-gnuabin32  riscv32-linux-any
aarch64-linux-musl     mips64-linux-any          riscv64-linux-any
any-linux-any          mips64-linux-gnuabi64     riscv64-linux-gnu
any-windows-any        mips64-linux-gnuabin32    riscv64-linux-musl
armeb-linux-any        mips64-linux-musl         s390x-linux-any
armeb-linux-gnueabi    mipsel-linux-any          s390x-linux-gnu
armeb-linux-gnueabihf  mipsel-linux-gnu          s390x-linux-musl
arm-linux-any          mips-linux-any            sparc-linux-gnu
arm-linux-gnueabi      mips-linux-gnu            sparcv9-linux-gnu
arm-linux-gnueabihf    mips-linux-musl           x86_64-linux-any
arm-linux-musl         powerpc64le-linux-any     x86_64-linux-gnu
generic-glibc          powerpc64le-linux-gnu     x86_64-linux-gnux32
generic-musl           powerpc64-linux-any       x86_64-linux-musl
i386-linux-any         powerpc64-linux-gnu
i386-linux-gnu         powerpc64-linux-musl
</pre>
<p>
When Zig generates a C command line to send to clang, it puts the appropriate
include paths using <code>-I</code> depending on the target. For example, if the
target is <code>aarch64-linux-musl</code>, then the following command line parameters
are appended:
</p>
<ul>
  <li><code>-I$LIB/libc/include/aarch64-linux-musl</code></li>
  <li><code>-I$LIB/libc/include/aarch64-linux-any</code></li>
  <li><code>-I$LIB/libc/include/generic-musl</code></li>
</ul>
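That search order can be sketched as a tiny helper (hypothetical; the real compiler also appends directories such as any-linux-any where applicable):

```c
#include <stdio.h>
#include <string.h>

/* Given a target triple split into arch/os/abi (e.g. "aarch64-linux-musl"),
   write the three -I directories in priority order:
   target-specific, then OS-generic, then libc-generic. */
static int libc_include_dirs(const char *arch, const char *os, const char *abi,
                             char out[3][128]) {
    snprintf(out[0], 128, "libc/include/%s-%s-%s", arch, os, abi);
    snprintf(out[1], 128, "libc/include/%s-%s-any", arch, os);
    snprintf(out[2], 128, "libc/include/generic-%s", abi);
    return 3;
}
```

The ordering matters: an architecture-specific header must shadow the generic one of the same name.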
<p>
Anyway back to mingw-w64.
</p>
<p>
Again, Zig includes a subset of source files from mingw-w64 with a few patches applied
to make things compile successfully.
</p>
<p>
The Zig compiler code that builds mingw-w64 from source files emulates only the parts of
the build system that are needed for this subset. This includes preprocessing <code>.def.in</code>
files to get <code>.def</code> files, and then in turn using LLD to generate <code>.lib</code> files
from the <code>.def</code> files, which allows Zig to provide <code>.lib</code> files for
any Windows DLL, such as kernel32.dll or even opengl32.dll.
</p>

<h3>Invoking Clang Without a System Dependency</h3>
<p>
Since Zig already links against Clang libraries for the
<a href="https://ziglang.org/#Integration-with-C-libraries-without-FFIbindings">translate-c feature</a>,
it did not cost much more to expose Clang's <code>main()</code> entry point from Zig.
So that's exactly what we do:
</p>
<ul>
  <li><code>llvm-project/clang/tools/driver/driver.cpp</code> is copied to <code>$ZIGGIT/src/zig_clang_driver.cpp</code></li>
  <li><code>llvm-project/clang/tools/driver/cc1_main.cpp</code> is copied to <code>$ZIGGIT/src/zig_clang_cc1_main.cpp</code></li>
  <li><code>llvm-project/clang/tools/driver/cc1as_main.cpp</code> is copied to <code>$ZIGGIT/src/zig_clang_cc1as_main.cpp</code></li>
</ul>
<p>
The following patch is applied:
</p>
<pre>
--- a/src/zig_clang_driver.cpp
+++ b/src/zig_clang_driver.cpp
@@ -206,8 +205,6 @@
                     void *MainAddr);
 extern int cc1as_main(ArrayRef&lt;const char *&gt; Argv, const char *Argv0,
                       void *MainAddr);
<span class="diff0">-extern int cc1gen_reproducer_main(ArrayRef&lt;const char *&gt; Argv,</span>
<span class="diff0">-                                  const char *Argv0, void *MainAddr);</span>
 
 static void insertTargetAndModeArgs(const ParsedClangName &amp;NameParts,
                                     SmallVectorImpl&lt;const char *&gt; &amp;ArgVector,
@@ -330,19 +327,18 @@
   if (Tool == "-cc1as")
     return cc1as_main(makeArrayRef(ArgV).slice(2), ArgV[0],
                       GetExecutablePathVP);
<span class="diff0">-  if (Tool == "-cc1gen-reproducer")</span>
<span class="diff0">-    return cc1gen_reproducer_main(makeArrayRef(ArgV).slice(2), ArgV[0],</span>
<span class="diff0">-                                  GetExecutablePathVP);</span>
   // Reject unknown tools.
   llvm::errs() &lt;&lt; "error: unknown integrated tool '" &lt;&lt; Tool &lt;&lt; "'. "
                &lt;&lt; "Valid tools include '-cc1' and '-cc1as'.\n";
   return 1;
 }
 
<span class="diff0">-int main(int argc_, const char **argv_) {</span>
<span class="diff1">+extern "C" int ZigClang_main(int argc_, const char **argv_);</span>
<span class="diff1">+int ZigClang_main(int argc_, const char **argv_) {</span>
   noteBottomOfStack();
   llvm::InitLLVM X(argc_, argv_);
<span class="diff0">-  SmallVector&lt;const char *, 256&gt; argv(argv_, argv_ + argc_);</span>
<span class="diff1">+  size_t argv_offset = (strcmp(argv_[1], "-cc1") == 0 || strcmp(argv_[1], "-cc1as") == 0) ? 0 : 1;</span>
<span class="diff1">+  SmallVector&lt;const char *, 256&gt; argv(argv_ + argv_offset, argv_ + argc_);</span>
 
   if (llvm::sys::Process::FixupStandardFileDescriptors())
     return 1;
</pre>
<p>
This disables some cruft, and then renames <code>main</code> to <code>ZigClang_main</code> so that
it can be called like any other function. Next, in Zig's actual <code>main</code>, Zig looks
for <code>clang</code> as the first parameter, and if present, calls into Clang.
</p>
<p>
So, <code>zig clang</code> is a low-level, undocumented API that Zig exposes for directly invoking Clang.
But <code>zig cc</code> is much higher level than that. When Zig needs to compile C code,
it invokes itself as a child process, taking advantage of <code>zig clang</code>. <code>zig cc</code>,
on the other hand, has a more difficult job: it must parse Clang's command line options and
map them to the Zig compiler's settings, so that ultimately <code>zig clang</code> can be invoked
as a child process.
</p>

<h3>Parsing Clang Command Line Options</h3>
<p>
When using <code>zig cc</code>, Zig acts as a proxy between the user and Clang. It does not need
to understand all the parameters, but it does need to recognize some of them, such as
the target. This means that Zig must know when a C command line parameter
"consumes" the next parameter on the command line.
</p>
<p>
For example, <code>-z -target</code> would mean to pass <code>-target</code> to the linker,
whereas <code>-E -target</code> would mean that the next parameter specifies the target.
</p>
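A toy sketch of that distinction, with a tiny hard-coded table standing in for the generated one:

```c
#include <stdbool.h>
#include <string.h>

/* Flags that swallow the argument that follows them. In Zig this table is
   generated from Clang's options.td; here it is a tiny hard-coded sample. */
static bool consumes_next(const char *flag) {
    static const char *eaters[] = { "-z", "-target", "-o", "-I", "-D" };
    for (size_t i = 0; i < sizeof(eaters) / sizeof(eaters[0]); i++)
        if (strcmp(flag, eaters[i]) == 0)
            return true;
    return false;
}

/* Find the value of -target on a command line, skipping consumed args. */
static const char *find_target(int argc, const char **argv) {
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-target") == 0 && i + 1 < argc)
            return argv[i + 1];
        if (consumes_next(argv[i]))
            i++; /* skip this flag's value so it isn't parsed as a flag */
    }
    return NULL;
}
```

With this logic, <code>-z -target</code> yields no target (the word was eaten by <code>-z</code>), while <code>-E -target aarch64-linux-gnu</code> does, matching the example above.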
<p>Clang has a
<a href="https://clang.llvm.org/docs/ClangCommandLineReference.html">long list of command line options</a> and so it would be foolish to try to hard-code all of them.
</p>
<p>
Fortunately, LLVM has a file, "options.td", which describes all of its command line options
in an obscure format. And fortunately again, LLVM comes with the <code>llvm-tblgen</code> tool,
which can dump that file as JSON.
</p>
<p>
Zig has an
<a href="https://github.com/ziglang/zig/blob/dc44fe053c609f389e375f6857f96b6bb3794897/tools/update_clang_options.zig">update_clang_options tool</a>
which processes this JSON dump and produces a
<a href="https://github.com/ziglang/zig/blob/dc44fe053c609f389e375f6857f96b6bb3794897/src-self-hosted/clang_options_data.zig">big sorted list of Clang's command line options</a>.
</p>
<p>
Combined with a list of "known options" which correspond to Zig compiler options,
this is used to make an iterator API that <code>zig cc</code> uses to parse command line
parameters and instantiate a Zig compiler instance. Any Clang options that Zig is not
aware of are forwarded to Clang directly. Some parameters are handled specially.
</p>
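<p>
The "consuming" behavior can be sketched with a tiny table-driven parser. The following is an illustrative sketch in Python, not Zig's actual implementation; the option table and names are hypothetical:
</p>

```python
# Illustrative sketch, not Zig's actual implementation: a tiny option table
# records whether each flag "consumes" the argument that follows it, which is
# how "-z -target" (pass -target to the linker) can be told apart from
# "-E -target x86_64-linux-musl" (the next parameter is the target).
OPTION_TABLE = {
    "-z": {"takes_value": True},       # linker option: consumes the next token
    "-E": {"takes_value": False},      # preprocess only: stands alone
    "-target": {"takes_value": True},  # a target triple follows
    "-o": {"takes_value": True},       # an output path follows
}

def parse_args(argv):
    """Return (flag, value) pairs; value is None for flags that stand alone.
    Unknown flags pass through untouched, as zig cc forwards them to Clang."""
    parsed = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        info = OPTION_TABLE.get(arg)
        if info and info["takes_value"]:
            if i + 1 >= len(argv):
                raise ValueError("expected a value after " + arg)
            parsed.append((arg, argv[i + 1]))
            i += 2
        else:
            parsed.append((arg, None))
            i += 1
    return parsed
```

<p>
With such a table, <code>-z -target</code> parses as a single linker option, while <code>-E -target x86_64-linux-musl</code> parses as a standalone flag followed by a target specification.
</p>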

<h3>Linking</h3>
<p>
This part is pretty straightforward. Zig depends on LLD for linking rather than
shelling out to the system linker the way GCC and Clang do.
</p>
<p>
When you use <code>-o</code> with <code>zig cc</code>, Clang is not actually acting as
a linker driver here. Zig is still the linker driver.
</p>

<h2>Everybody Wins</h2>
<p>
Now that I've spent this entire blog post comparing Zig and Clang as if they were
competitors, let me make it absolutely clear that these are harmonious,
mutually beneficial open-source projects. It's pretty obvious how Clang and the entire
LLVM project are massively beneficial to the Zig project, since Zig builds on top of them.
</p>
<p>
But it works the other way, too.
</p>
<p>
With Zig's focus on cross-compiling, its test suite has been expanding rapidly to cover
a large number of architectures and operating systems, leading to
<a href="https://github.com/ziglang/zig/issues?q=is%3Aissue+label%3Aupstream+is%3Aclosed">dozens of bugs reported upstream and patches sent</a>, including, for example:
</p>
<ul>
  <li><a href="https://bugs.llvm.org/show_bug.cgi?id=43268">Regression discovered in LLVM 9 release candidate</a></li>
  <li><a href="https://bugs.winehq.org/show_bug.cgi?id=47979">Bug fixed in Wine's NtDll</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/3338#issuecomment-536771508">Directly working with RISC-V target developers</a></li>
  <li><a href="https://bugs.llvm.org/show_bug.cgi?id=43768#c3">Bug fixes in LLVM's MIPS code generation</a></li>
</ul>
<p>Everybody wins.</p>

<h2>This is still experimental!</h2>
<p>
I only landed <code>zig cc</code> support last week, and it is still experimental.
Please do not expect it to be production quality yet.
</p>
<p>
Zig's 0.6.0 release is right around the corner, scheduled for April 13th. I will be sure to provide
an update in the release notes on how stable and robust you can expect <code>zig cc</code> to be
in the 0.6.0 release.
</p>
<p>
There are some follow-up issues related to <code>zig cc</code> which are still open:
</p>
<ul>
  <li><a href="https://github.com/ziglang/zig/issues/4784">improve zig cc flag integration</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/4785">using zig as a drop in replacement for msvc</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/4786">support compiling and linking c++ code</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/4787">use case: directly symlink zig binary to /usr/bin/cc</a></li>
</ul>
As always, <a href="https://github.com/ziglang/zig/blob/master/CONTRIBUTING.md">contributions are most welcome</a>.

<h2>💖 Sponsor Zig 💖</h2>
<p><a href="https://github.com/sponsors/andrewrk">Sponsor Andrew Kelley on GitHub</a></p>
<p>
If you're reading this and you already sponsor me, thank you so much! I wake up every day
absolutely thrilled that I get to do this for my full time job.
</p>
<p>
As Zig has been gaining popularity, demands for my time have been growing faster than
funds to hire another full-time programmer. Every recurring donation helps, and if the funds keep
growing then soon enough the Zig project will have two full-time programmers.
</p>
<p>
That's all folks. I hope you and your loved ones are well.
</p>
]]></description>
      </item>
      <item>
         <title>Why I'm donating $150/month (10% of my income) to the musl libc project</title>
         <pubDate>Mon, 24 Jun 2019 20:15:06 GMT</pubDate>

         <link>https://andrewkelley.me/post/why-donating-to-musl-libc-project.html</link>
         <guid>https://andrewkelley.me/post/why-donating-to-musl-libc-project.html</guid>
         <description><![CDATA[<style>
#prog {
  background-color: #171717;
  width: 350px;
  height: 30px;
  border: 1px solid black;
}
#done {
  background-color: #309e07;
  width: 100%;
  height: 30px;
  line-height: 30px;
  text-align: center;
  color: black;
  font-weight: bold;
}
</style>
<h1>Why I'm donating $150/month (10% of my income) to the musl libc project</h1>
<p>
One year ago, <a href="/post/full-time-zig.html">I quit my day job to work on Zig full time</a>.
Since then, the project has seen a steady growth in funding. Thanks to the people donating,
Zig is on track to become fully sustainable before my savings run out.
</p>
<p>
This support has allowed me to focus on steady improvements to the language and tooling. In the last
year, Zig has released two versions:
</p>
<ul>
  <li>0.3.0 (<a href="https://ziglang.org/download/0.3.0/release-notes.html">release notes</a>)</li>
  <li>0.4.0 (<a href="https://ziglang.org/download/0.4.0/release-notes.html">release notes</a>)</li>
</ul>
<p>
These release notes serve as my accountability to the people who donate; if you have a look at them, I hope you'll agree that they speak for themselves.
</p>
<h2>The V language drama</h2>
<p>
I know that open-source project funding has been on people's minds lately, as we've watched the Internet's reaction to the
bizarre open-sourcing of the V language unfold. As to how this relates to musl libc, stick with me - let me give you the talking
points of this situation:
</p>
<ol>
  <li>
    The V language website was published several months ago, claiming that it could already do incredible things, such as automatically
    translate C++ code to V code, and compile 1.2 million lines of code per second per CPU core by directly outputting x64 machine code.
    These features already work today, it said, and they will be released soon. Please donate. Meanwhile, the project was closed-source.
    Binaries and release notes were behind a paywall.
  </li>
  <li>
    Over the next few months, people saw the amazing things V was claimed to be capable of, and started donating money.
    It even looked like the language was already available, with download links including file sizes and file names, but
    clicking a download link produced an alert() pop-up box saying "June 20". The project climbed up to $927/month in donations.
  </li>
  <li>
    On 2019-06-20, the V author released a macOS binary of the V language, and it immediately became apparent that the many fantastic
    claims on the V website were unsubstantiated. People were quick to point this out,
    <a href="https://news.ycombinator.com/item?id=20230351">including myself</a>.
  </li>
  <li>
    In response to the backlash, the V author redacted the macOS binary, and said that he would do a full source release in 2 days.
  </li>
  <li>
    On 2019-06-22, he updated the V language website with "WIP" labels to point out which features of the project are not available, and
    within an hour after that, published the V project source code. At this point, I made the <a href="https://github.com/vlang/v/issues/292#issuecomment-504689104">following statement</a>:

    <blockquote>
      <p>Now that the website has the "WIP" label to communicate which features are not available, and the source is released, I no longer consider this project to be fraudulent. The information is available to everyone, and people who donate on Patreon are making an informed choice.</p>
      <p>I'm genuinely glad it turned out this way. Good luck on your endeavor, and welcome to the programming languages club.</p>
    </blockquote>
  </li>
  <li>
    At this point, the Internet discovered the source release. <a href="http://web.archive.org/web/20190623042334/https://github.com/vlang/v/issues/319">Shitposters, trolls, and genuine people all started paying
      attention to the project at once</a>. Most people were unaware that the V website had been modified to retract the
    false claims, which is quite understandable given that they had been there just yesterday. Now that it
    was crystal clear that the V project had been intentionally misleading, the backlash intensified. In 2 days, V lost $100/month in
    donations.
  </li>
  <li>
    Finally, there was a backlash on the backlash. To some people, $927/month was "table scraps". From this perspective,
    people were piling on a hate-train over a paltry sum of cash. Maybe other open-source projects should take a hint
    from V and do a little bit more marketing. Donations to open-source projects are not a
    <a href="https://en.wikipedia.org/wiki/Zero-sum_game">zero sum game</a>.
  </li>
</ol>
<p>
One thing that is crystal clear is that the V author succeeded in creating hype. He got people excited about ambitious features - so
excited that they were willing to donate some cash.
This <a href="https://lobste.rs/s/rh1pbo/v_source_code_released#c_jyakpy">made me think</a> of another open source project in the opposite category.
</p>
<h2>musl libc: a project with no hype but huge impact</h2>
<p>
<a href="https://www.musl-libc.org/">musl libc</a> is an alternative to GNU libc for Linux, created by
<a href="http://ewontfix.com/">Rich Felker</a>, and with a healthy community of high-quality contributors.
It's been around for years, yet it was receiving less in donations than V.
</p>
<p>
The Zig project owes a lot to musl, for many reasons:
</p>

<h3>Mentorship</h3>
<p>Especially in the early stages of the Zig project, but still to this day, I asked question after question in the <code>#musl</code>
IRC channel, and the musl community patiently and <em>expertly</em> indulged them. Sometimes my questions were not even musl-related,
but just more about how the Linux kernel works.
</p>
<p>
I was not an expert systems programmer when I started Zig, but
the musl community has mentored me over the years.</p>

<h3>Zig bundles musl</h3>
<p>
Zig ships with musl source code. This allows Zig projects to cross-compile for Linux. For example, on Windows, you can:
</p>
<pre><code>&gt; zig.exe build-exe --c-source hello.c -target x86_64-linux-musl</code></pre>
<p>
Copy the resulting ELF binary to a Linux machine and it will run. (Or run it in the Windows Subsystem for Linux.)
</p>
<p>
This works because Zig lazily builds musl from source on demand, for the selected target. It is in no small part thanks to musl's
simplicity and well-designed nature that this is possible.
</p>
<p>
It's also useful for creating static builds on Linux, where the native libc is glibc.
</p>

<h3>Much of the Zig standard library is ported from musl</h3>
<p>
musl is such a high-quality codebase that most of the Zig standard library's interface to Linux is a direct port of musl code.
</p>
<p>
This has prevented countless bugs and made things "just work" in general. Without this head start, the Zig project would have
had to spend more time on the Linux system interface and less on everything else.
</p>
<p>
For example, thanks to <a href="https://tiehu.is/">Marc Tiehuis</a>'s contributions, the Zig standard library has
all the math functions you would expect to find in libm, and they are available <a href="https://ziglang.org/documentation/master/#Introducing-the-Compile-Time-Concept">at compile-time</a> as well as at runtime.
</p>

<h3>Float literal parsing adapted and ported from musl</h3>
<p>
Zig has 128-bit floating point literals, so that compile-time computations can be done at a higher level of precision
before being cast to the floating point type of choice.
</p>
<p>
This is tricky to implement in the C++ non-self-hosted compiler, however, because libc does not have a 128-bit <code>strtod</code>
function. So I took musl's <code>strtold</code> function, which works with the type <code>long double</code>, and then
<a href="https://github.com/ziglang/zig/commit/4615ed5ea003516c8235728ac3f5f0ee2ccea8a7">ported the code into Zig</a>,
making all the #ifdefs assume that <code>long double</code> is 128 bits (which is only true on some architectures), and replacing all the
math with <a href="http://www.jhauser.us/arithmetic/SoftFloat.html">SoftFloat</a> so that it would work on any architecture, no matter
what <code>long double</code> maps to.
</p>
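<p>
To illustrate why parsing at a higher precision first matters, here is a sketch in Python using the standard <code>decimal</code> module as a stand-in for 128-bit floats (this is not the compiler's code):
</p>

```python
from decimal import Decimal, getcontext

# Roughly the decimal precision of an IEEE binary128 float.
getcontext().prec = 34

literal = "0.1"
high_precision = Decimal(literal)  # exact decimal value of the source text
as_double = float(literal)         # nearest 64-bit binary float

# Converting the double back to Decimal is exact, so the difference below is
# precisely the rounding error introduced by parsing straight into 64 bits.
error = high_precision - Decimal(as_double)
assert error != 0
```

<p>
Doing compile-time arithmetic on the high-precision value and rounding only once, at the final cast, avoids accumulating such errors.
</p>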

<h3>Alpine Linux is based on musl</h3>
<p>
Many people are aware of <a href="https://www.alpinelinux.org/">Alpine Linux</a> because it has become a popular Linux distribution
to use with <a href="https://www.docker.com/">Docker</a>, mainly due to its simplicity and small binary size. This is in large part
thanks to the fact that the system uses musl as its libc rather than glibc.
</p>
<p>
The Zig project uses Alpine Linux to create the static Linux builds on <a href="https://ziglang.org/download/">ziglang.org/download</a>.
</p>
<hr>
<p>
And so I have decided to <a href="https://www.patreon.com/musl">donate $150/month to the musl project</a>, even though that represents 10% of my income.
I'm putting my money where my mouth is. But there's more - please read on...
</p>
<h2>This is a marketing stunt</h2>
<p>Now if I'm being honest about my motivations for this blog post, it's that I want to prove that
<strong>open source funding is not a zero-sum game</strong>. If there's anything we've learned from the V language Internet drama that has unfolded over the past few days, it's that open source projects have to do marketing if they want to get financial support.</p>
<p>
But I want to set a better example of what that might look like. I don't think you have to trick people.
This blog post is a marketing stunt intended to show that you can do things to get attention and funding, without being dishonest.
</p>
<p>
<strong>Will you help me prove my point?</strong>
</p>
<p>If enough people start <a href="https://github.com/users/andrewrk/sponsorship">pledging donations to Zig</a> when reading this,
and the $150/month is recovered, then this little stunt will have been a proof-of-concept of one open source project
running a fund-raiser for another one. (To be clear my donations to musl are not conditional; regardless of the outcome
I will continue to donate.)
</p>
<p>
Worth a shot, right?
</p>
<p>
Since this post made news headlines, several generous people have taken me up on this experiment:
</p>
<ul>
  <li>Marko Mikulicic - $4</li>
  <li>Tristan Hume - $5</li>
  <li>Christine Dodrill - $15</li>
  <li>komu - $1</li>
  <li>Dylan Baker - $3</li>
  <li>Dave Voutila - $6.25</li>
  <li>Tyler Philbrick - $3</li>
  <li>Ben - $1</li>
  <li>Tanner Schultz - $3</li>
  <li>Martin DeMello - $5</li>
  <li>Thomas Ballinger - $2</li>
  <li>Steven Branson - $5</li>
  <li>Zachary Feldman - $10</li>
  <li>Dan Gallagher - $10</li>
  <li>Quetzal Bradley - $10</li>
  <li>Brian Orr - $5</li>
  <li>YVT - $5</li>
  <li>Dexter Haslem - $5</li>
  <li>Wes - $2.40</li>
  <li>Aeron Avery - $10</li>
  <li>Carsten Dreesbach - $5</li>
  <li>Ville Tuulos - $5</li>
  <li>Curtis Fenner - $3</li>
  <li>Nathan Sculli - $5</li>
  <li>Alec Nunn - $5</li>
  <li>Gurpreet Singh - $5</li>
  <li>Haze Booth - $20</li>
  <li>Don Harris - $20</li>
</ul>
<p>
That's a total of $178.65, well beyond the $150/month that I started donating to musl.
</p>
<div id="prog">
  <div id="done">100%</div>
</div>
<p>
Thank you so much! I'm really grateful for all your support. This experiment was a success!
Today, the Zig project ran a fundraiser for the musl libc project, and it worked!
</p>
]]></description>
      </item>
      <item>
         <title>Using Zig to Provide Stack Traces on Kernel Panic for a Bare Bones Operating System</title>
         <pubDate>Tue, 04 Dec 2018 18:02:47 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-stack-traces-kernel-panic-bare-bones-os.html</link>
         <guid>https://andrewkelley.me/post/zig-stack-traces-kernel-panic-bare-bones-os.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(test|fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|continue|asm|defer|if|else|switch|try|catch|while|for|null|undefined|true|false|comptime|setCold|ptrToInt|returnAddress)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong|noreturn)\b/,
  });
</script>
<h1>Using Zig to Provide Stack Traces on Kernel Panic for a Bare Bones Operating System</h1>
<p>Last week, I reached an exciting milestone in my career as a programmer.
<a href="https://www.youtube.com/watch?v=gihdpLtHi9Q">For the first time in my life, I ran code directly on hardware</a>, with no Operating System sitting between the bare metal and my code.
</p>
<p>
For some context, the goal of this side project is to create a 2-4 player arcade game
running directly on a Raspberry Pi 3+.
</p>
<img src="https://andrewkelley.me/img/clashos-button-small.jpg">
<p>
Having just gotten Hello World working, this software can do
little more than send a message over the serial UART on bootup:
</p>
<pre>
Hello World! ClashOS 0.0
</pre>
<p>
So what's the next step? Bootloader? File system? <em>Graphics?!</em>
</p>
<p>
Well, when coding all of these things, I'm inevitably going to run into bugs
and crashes. And when this happens, I want to quickly understand what's gone
wrong. Normally, in Zig, when something goes wrong, you get a nice
<a href="https://ziglang.org/download/0.3.0/release-notes.html#stack-traces">stack trace</a>,
like this:
</p>
<img src="https://ziglang.org/download/0.3.0/stack-traces-linux.png">
<p>This example, however, is targeting Linux, whereas this arcade game project is
<em>freestanding</em>. The equivalent code in freestanding actually just hangs
the CPU:
</p>
<pre><code class="language-diff">--- a/src/main.zig
+++ b/src/main.zig
     serial.log("Hello World! ClashOS 0.0\n");
<span class="diff1">+    var x: u8 = 255;
+    x += 1;
+    serial.log("got here\n");</span></code></pre>
<p>When run, we never reach the "got here" message, but no error message
of any kind is printed either.</p>
<pre>
Hello World! ClashOS 0.0
</pre>
<p>To understand this, we can look at the <strong>default panic handler</strong>.
In Zig, you can provide a <code class="language-zig">pub fn panic</code> in your
<a href="https://ziglang.org/documentation/master/#Root-Source-File">root source file</a>
alongside <code class="language-zig">pub fn main</code>. But if you do not provide this
function, <a href="https://github.com/ziglang/zig/blob/0.3.0/std/special/panic.zig">the default</a>
is used:
</p>
<pre><code class="language-zig">pub fn panic(msg: []const u8, error_return_trace: ?*builtin.StackTrace) noreturn {
    @setCold(true);
    switch (builtin.os) {
        builtin.Os.freestanding =&gt; {
            while (true) {}
        },
        else =&gt; {
            const first_trace_addr = @ptrToInt(@returnAddress());
            std.debug.panicExtra(error_return_trace, first_trace_addr, "{}", msg);
        },
    }
}</code></pre>
<p>
Here we can see why the code earlier is hanging - the default panic handler for the
freestanding target is simply <code class="language-zig">while(true) {}</code>.
</p>
<p>
So we can make an immediate improvement by creating our own panic handler:
</p>
<pre><code class="language-zig">pub fn panic(message: []const u8, stack_trace: ?*builtin.StackTrace) noreturn {
    serial.write("\n!KERNEL PANIC!\n");
    serial.write(message);
    serial.write("\n");
    while(true) {}
}</code></pre>
<p>And now, the output of booting the Raspberry Pi:</p>
<pre>
Hello World! ClashOS 0.0

!KERNEL PANIC!
integer overflow
</pre>
<p>
Already this is much better. We can see that an integer overflow caused a kernel panic.
But an integer overflow could occur anywhere. Wouldn't it be nice to have a full stack
trace printed?
</p>
<p>
Yes, yes it would. The first thing I needed to make this work was access to the DWARF
debugging info from inside the kernel. But I don't even have a file system. How can that work?
</p>
<p>
Easy! Just put the DWARF info directly into the kernel's memory. I modified my linker script
to do just that:
</p>
<pre><code class="language-ld">    .rodata : ALIGN(4K) {
        *(.rodata)
        __debug_info_start = .;
        KEEP(*(.debug_info))
        __debug_info_end = .;
        __debug_abbrev_start = .;
        KEEP(*(.debug_abbrev))
        __debug_abbrev_end = .;
        __debug_str_start = .;
        KEEP(*(.debug_str))
        __debug_str_end = .;
        __debug_line_start = .;
        KEEP(*(.debug_line))
        __debug_line_end = .;
        __debug_ranges_start = .;
        KEEP(*(.debug_ranges))
        __debug_ranges_end = .;
    }</code></pre>
<p>
And to my dismay, these error messages stared back at me:
</p>
<pre>
lld: error: incompatible section flags for .rodata
&gt;&gt;&gt; /home/andy/dev/clashos/zig-cache/clashos.o:(.debug_info): 0x0
&gt;&gt;&gt; output section .rodata: 0x12

lld: error: incompatible section flags for .rodata
&gt;&gt;&gt; &lt;internal&gt;:(.debug_str): 0x30
&gt;&gt;&gt; output section .rodata: 0x12

lld: error: incompatible section flags for .rodata
&gt;&gt;&gt; /home/andy/dev/clashos/zig-cache/clashos.o:(.debug_line): 0x0
&gt;&gt;&gt; output section .rodata: 0x32
</pre>
<p>
After a <a href="https://bugs.llvm.org/show_bug.cgi?id=39862">back and forth on a bug report</a>, 
George Rimar suggested simply deleting that particular check, as it might have been an
overly strict enforcement. I tried this in my LLD fork, and it worked! Debug information
was now linked into my kernel images. After completing the rest of the steps in this blog post,
I <a href="https://reviews.llvm.org/D55276">submitted a patch upstream</a>,
which Rui has already merged into LLD. This will be released with
LLVM 8, and in the meantime Zig's LLD fork has the patch.
</p>
<p>
At this point it was a simple matter of writing the glue code between my kernel and the
Zig Standard Library's stack trace facilities. In Zig, you don't have to intentionally support
freestanding mode. Code which has no dependencies on a particular operating system will work
in freestanding mode thanks to Zig's <em>lazy top level declaration analysis</em>.
Because the standard library stack trace code does not call any OS APIs, it
supports freestanding mode.
</p>
<p>
The Zig std lib API for opening debug information from an ELF file looks like this:
</p>
<pre><code class="language-zig">pub fn openElfDebugInfo(
    allocator: *mem.Allocator,
    elf_seekable_stream: *DwarfSeekableStream,
    elf_in_stream: *DwarfInStream,
) !DwarfInfo</code></pre>
<p>
But this kernel is so bare bones, it's not even in an ELF file. It's booting directly from a binary blob.
We just have the debug info sections mapped directly into memory. For that we can use
a lower level API:
</p>
<pre><code class="language-zig">/// Initialize DWARF info. The caller has the responsibility to initialize most
/// of the DwarfInfo fields before calling. These fields can be left undefined:
/// * abbrev_table_list
/// * compile_unit_list
pub fn openDwarfDebugInfo(di: *DwarfInfo, allocator: *mem.Allocator) !void</code></pre>
<p>And <code class="language-zig">DwarfInfo</code> is defined like this:</p>
<pre><code class="language-zig">pub const DwarfInfo = struct {
    dwarf_seekable_stream: *DwarfSeekableStream,
    dwarf_in_stream: *DwarfInStream,
    endian: builtin.Endian,
    debug_info: Section,
    debug_abbrev: Section,
    debug_str: Section,
    debug_line: Section,
    debug_ranges: ?Section,
    abbrev_table_list: ArrayList(AbbrevTableHeader),
    compile_unit_list: ArrayList(CompileUnit),
};</code></pre>
<p>
To hook these up, the glue code needs to initialize the fields of
<code class="language-zig">DwarfInfo</code> with the offsets into a
<code class="language-zig">std.io.SeekableStream</code>, which can
be implemented as a simple pointer to memory. By declaring <strong>external variables</strong>
and then looking at their <em>addresses</em>, we can find out where in memory the symbols
defined in the linker script are.
</p>
<p>
For <code>.debug_abbrev</code>, <code>.debug_str</code>, and <code>.debug_ranges</code>, I had to
set the offset to 0 to
<a href="https://bugs.llvm.org/show_bug.cgi?id=39862#c6">work around</a>
LLD thinking that the sections start at 0 for some reason.
</p>
<pre><code class="language-zig">var kernel_panic_allocator_bytes: [100 * 1024]u8 = undefined;
var kernel_panic_allocator_state = std.heap.FixedBufferAllocator.init(kernel_panic_allocator_bytes[0..]);
const kernel_panic_allocator = &amp;kernel_panic_allocator_state.allocator;

extern var __debug_info_start: u8;
extern var __debug_info_end: u8;
extern var __debug_abbrev_start: u8;
extern var __debug_abbrev_end: u8;
extern var __debug_str_start: u8;
extern var __debug_str_end: u8;
extern var __debug_line_start: u8;
extern var __debug_line_end: u8;
extern var __debug_ranges_start: u8;
extern var __debug_ranges_end: u8;

fn dwarfSectionFromSymbolAbs(start: *u8, end: *u8) std.debug.DwarfInfo.Section {
    return std.debug.DwarfInfo.Section{
        .offset = 0,
        .size = @ptrToInt(end) - @ptrToInt(start),
    };
}

fn dwarfSectionFromSymbol(start: *u8, end: *u8) std.debug.DwarfInfo.Section {
    return std.debug.DwarfInfo.Section{
        .offset = @ptrToInt(start),
        .size = @ptrToInt(end) - @ptrToInt(start),
    };
}

fn getSelfDebugInfo() !*std.debug.DwarfInfo {
    const S = struct {
        var have_self_debug_info = false;
        var self_debug_info: std.debug.DwarfInfo = undefined;

        var in_stream_state = std.io.InStream(anyerror){ .readFn = readFn };
        var in_stream_pos: usize = 0;
        const in_stream = &amp;in_stream_state;

        fn readFn(self: *std.io.InStream(anyerror), buffer: []u8) anyerror!usize {
            const ptr = @intToPtr([*]const u8, in_stream_pos);
            @memcpy(buffer.ptr, ptr, buffer.len);
            in_stream_pos += buffer.len;
            return buffer.len;
        }

        const SeekableStream = std.io.SeekableStream(anyerror, anyerror);
        var seekable_stream_state = SeekableStream{
            .seekToFn = seekToFn,
            .seekForwardFn = seekForwardFn,

            .getPosFn = getPosFn,
            .getEndPosFn = getEndPosFn,
        };
        const seekable_stream = &amp;seekable_stream_state;

        fn seekToFn(self: *SeekableStream, pos: usize) anyerror!void {
            in_stream_pos = pos;
        }
        fn seekForwardFn(self: *SeekableStream, pos: isize) anyerror!void {
            in_stream_pos = @bitCast(usize, @bitCast(isize, in_stream_pos) +% pos);
        }
        fn getPosFn(self: *SeekableStream) anyerror!usize {
            return in_stream_pos;
        }
        fn getEndPosFn(self: *SeekableStream) anyerror!usize {
            return @ptrToInt(&amp;__debug_ranges_end);
        }
    };
    if (S.have_self_debug_info) return &amp;S.self_debug_info;

    S.self_debug_info = std.debug.DwarfInfo{
        .dwarf_seekable_stream = S.seekable_stream,
        .dwarf_in_stream = S.in_stream,
        .endian = builtin.Endian.Little,
        .debug_info = dwarfSectionFromSymbol(&amp;__debug_info_start, &amp;__debug_info_end),
        .debug_abbrev = dwarfSectionFromSymbolAbs(&amp;__debug_abbrev_start, &amp;__debug_abbrev_end),
        .debug_str = dwarfSectionFromSymbolAbs(&amp;__debug_str_start, &amp;__debug_str_end),
        .debug_line = dwarfSectionFromSymbol(&amp;__debug_line_start, &amp;__debug_line_end),
        .debug_ranges = dwarfSectionFromSymbolAbs(&amp;__debug_ranges_start, &amp;__debug_ranges_end),
        .abbrev_table_list = undefined,
        .compile_unit_list = undefined,
    };
    try std.debug.openDwarfDebugInfo(&amp;S.self_debug_info, kernel_panic_allocator);
    return &amp;S.self_debug_info;
}</code></pre>
<p>
Here, the common Zig practice of accepting an allocator as an argument whenever
allocation is needed comes in extremely handy for kernel development. We simply statically
allocate a 100 KiB buffer and pass that along as the debug info allocator. If this ever
becomes too small, it can be adjusted.
</p>
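<p>
The pattern can be sketched as a bump allocator over a preallocated buffer. The following is a hypothetical Python rendition of the idea behind <code>std.heap.FixedBufferAllocator</code>, not Zig's implementation:
</p>

```python
class FixedBufferAllocator:
    """Hands out slices of a fixed, preallocated buffer and never asks the
    OS for memory, which is exactly what a kernel panic handler needs."""

    def __init__(self, size):
        self.buffer = bytearray(size)
        self.end_index = 0  # bump pointer

    def alloc(self, n):
        if self.end_index + n > len(self.buffer):
            raise MemoryError("OutOfMemory")  # caller decides how to react
        start = self.end_index
        self.end_index += n
        return memoryview(self.buffer)[start:start + n]

# 100 KiB, sized up front, like the kernel panic allocator in the post.
panic_allocator = FixedBufferAllocator(100 * 1024)
chunk = panic_allocator.alloc(4096)
```

<p>
Allocation is just a pointer bump and a bounds check; when the buffer runs out, the caller gets an out-of-memory error instead of a crash.
</p>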
<p>
And now it's just a matter of wiring up the <code>panic</code> function:
</p>
<pre><code class="language-zig">pub fn panic(message: []const u8, stack_trace: ?*builtin.StackTrace) noreturn {
    serial.log("\n!KERNEL PANIC! {}\n", message);
    const dwarf_info = getSelfDebugInfo() catch |err| {
        serial.log("unable to get debug info: {}\n", @errorName(err));
        hang();
    };
    const first_trace_addr = @ptrToInt(@returnAddress());
    var it = std.debug.StackIterator.init(first_trace_addr);
    while (it.next()) |return_address| {
        std.debug.printSourceAtAddressDwarf(
            dwarf_info,
            serial_out_stream,
            return_address,
            true, // tty color on
            printLineFromFile,
        ) catch |err| {
            serial.log("missed a stack frame: {}\n", @errorName(err));
            continue;
        };
    }
    hang();
}

fn hang() noreturn {
    while (true) {}
}

fn printLineFromFile(out_stream: var, line_info: std.debug.LineInfo) anyerror!void {
    serial.log("TODO print line from the file\n");
}</code></pre>
<p>
Here it is in action:
</p>
<pre>
Hello World! ClashOS 0.0

!KERNEL PANIC! integer overflow
/home/andy/dev/clashos/src/main.zig:166:7: 0x15e0 in ??? (clashos)
TODO print line from the file
      ^
???:?:?: 0x1c in ??? (???)
</pre>
<p>
Great progress.
</p>
<p>That last line is coming from the startup assembly code, which has no
source mapping. But we can see that it's correct, by looking at
the disassembly of the kernel:</p>
<pre>
0000000000000000 &lt;_start&gt;:
       0:	d53800a0 	mrs	x0, mpidr_el1
       4:	d2b82001 	mov	x1, #0xc1000000
       8:	8a210000 	bic	x0, x0, x1
       c:	b4000040 	cbz	x0, 14 &lt;master&gt;
      10:	14000003 	b	1c &lt;__hang&gt;

0000000000000014 &lt;master&gt;:
      14:	b26503ff 	mov	sp, #0x8000000
      18:	9400054b 	bl	1544 &lt;kernel_main&gt;

000000000000001c &lt;__hang&gt;:
      1c:	d503205f 	wfe
      20:	17ffffff 	b	1c &lt;__hang&gt;
</pre>
<p>You can see that <code>0x1c</code> is indeed the return address
(the next instruction that will be executed when the function returns)
of the function call to <code>kernel_main</code>.
</p>
<p>
Let's shuffle the code around into more files and make that trace longer.
</p>
<pre>
Hello World! ClashOS 0.0

!KERNEL PANIC!
integer overflow
/home/andy/dev/clashos/src/serial.zig:42:7: 0x1b10 in ??? (clashos)
TODO print line from the file
      ^
/home/andy/dev/clashos/src/main.zig:58:16: 0x1110 in ??? (clashos)
TODO print line from the file
               ^
/home/andy/dev/clashos/src/main.zig:67:18: 0xecc in ??? (clashos)
TODO print line from the file
                 ^
???:?:?: 0x1c in ??? (???)
</pre>
<p>
It's looking good. But can we add a cherry on top, and make it print the source
lines?
</p>
<p>
Again we don't have a file system. How can the <code>printLineFromFile</code> function
have access to source files?
</p>
<p>
Sometimes the simplest solution is the best one. What if the kernel simply keeps its own
source code in memory?
</p>
<p>
That's really easy to hook up in Zig:
</p>
<pre><code class="language-zig">const source_files = [][]const u8{
    "src/debug.zig",
    "src/main.zig",
    "src/mmio.zig",
    "src/serial.zig",
};
fn printLineFromFile(out_stream: var, line_info: std.debug.LineInfo) anyerror!void {
    inline for (source_files) |src_path| {
        if (std.mem.endsWith(u8, line_info.file_name, src_path)) {
            const contents = @embedFile("../" ++ src_path);
            try printLineFromBuffer(out_stream, contents[0..], line_info);
            return;
        }
    }
    try out_stream.print("(source file {} not added in std/debug.zig)\n", line_info.file_name);
}
</code></pre>
<p>
Here we take advantage of <a href="https://ziglang.org/documentation/master/#inline-for">inline for</a>
as well as <a href="https://ziglang.org/documentation/master/#embedFile">@embedFile</a>, and all of a sudden
we can print lines of code from our own source files. <code>printLineFromBuffer</code> is
<a href="https://github.com/andrewrk/clashos/blob/bf8e57ac220715d0698ab910d337ea590c4b4e33/src/debug.zig#L133-L152">left as an exercise for the reader</a>.
</p>
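<p>
For readers who want to fill in that exercise, here is a hedged sketch of the idea in Python rather than Zig (the function name, signature, and 1-based line/column convention are illustrative, not the actual clashos implementation):
</p>

```python
def line_from_buffer(contents: str, line: int, column: int) -> str:
    """Return the requested source line plus a caret marking the column,
    the way the panic handler renders it.  line/column are 1-based."""
    lines = contents.splitlines()
    src = lines[line - 1]
    return src + "\n" + " " * (column - 1) + "^"

# Point at the '+' in "    x += 1;", as in the serial.zig:42:7 frame above
print(line_from_buffer("const x = 1;\n    x += 1;\n", 2, 7))
```

The real helper works on an embedded byte buffer rather than a Python string, but the line/column bookkeeping is the same.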
<p>
So let's see that output again:
</p>
<pre>
Hello World! ClashOS 0.0

!KERNEL PANIC!
integer overflow
/home/andy/dev/clashos/src/serial.zig:42:7: 0x1b10 in ??? (clashos)
    x += 1;
      ^
/home/andy/dev/clashos/src/main.zig:58:16: 0x1110 in ??? (clashos)
    serial.boom();
               ^
/home/andy/dev/clashos/src/main.zig:67:18: 0xecc in ??? (clashos)
    some_function();
                 ^
???:?:?: 0x1c in ??? (???)
</pre>
<p>Beautiful. Here's a picture so you can see it with color:</p>
<img src="https://andrewkelley.me/img/clashos-stack-trace.png">
<p>
And now all of the
<a href="https://ziglang.org/documentation/master/#Undefined-Behavior">protections that Zig offers against undefined behavior</a>
will result in output like this.
</p>

<h3>Conclusion</h3>
<p>
One of my big goals with Zig is to improve the embedded and OS development
process. I hope you're as excited as I am about the potential here.
</p>
<p>
If this blog post captured your interest, maybe you would like to check out this
<a href="https://github.com/andrewrk/HellOS">Hello World x86 Kernel</a>, which comes with a
<a href="https://www.youtube.com/watch?v=yUge-ujPxzQ">video of me live coding it</a>.
Thanks to an insightful Twitch comment, it works with only Zig code, no assembly required.
</p>
]]></description>
      </item>
      <item>
         <title>String Matching based on Compile Time Perfect Hashing in Zig</title>
         <pubDate>Sat, 15 Sep 2018 16:06:33 GMT</pubDate>

         <link>https://andrewkelley.me/post/string-matching-comptime-perfect-hashing-zig.html</link>
         <guid>https://andrewkelley.me/post/string-matching-comptime-perfect-hashing-zig.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(test|fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
  Prism.languages['llvm'] = Prism.languages.extend('clike', {
    'keyword': /\b(private|constant|declare|define|c|noreturn|nounwind|alloca|br|store|load|getelementptr|and|icmp|eq|zext|call|unreachable|add|target|datalayout|triple|unnamed_addr|align|inbounds|uwtable|sext|ret|ne|phi|global|to|zeroinitializer|shl|or|ult|switch)\b/,
    'number': null,
    'comment': /;.*/,
    'operator': /\w[\w\d_]*:/,
    'regex': /[@%]\.?\w+/,
    'property': /\b(i8|i1|i32|i64|i16|void|label)\*?\b/,
    'punctuation': null,
    'function': null,
  });
</script>
<h1>String Matching based on Compile Time Perfect Hashing in Zig</h1>
<p>Inspired by <a href="https://smilingthax.github.io/slides/cttrie/">cttrie - Compile time TRIE based string matching</a>,
I decided to see what this solution would look like in Zig.
</p>
<p>
Here's the API I came up with:
</p>
<pre><code class="language-zig">const ph = perfectHash([][]const u8{
    "a",
    "ab",
    "abc",
});
switch (ph.hash(target)) {
    ph.case("a") =&gt; std.debug.warn("handle the a case"),
    ph.case("ab") =&gt; std.debug.warn("handle the ab case"),
    ph.case("abc") =&gt; std.debug.warn("handle the abc case"),
    else =&gt; unreachable,
}</code></pre>
<p>
It notices at compile-time if you forget to declare one of the cases. For example, if I
comment out the last item:
</p>
<pre><code class="language-zig">const ph = perfectHash([][]const u8{
    "a",
    "ab",
    //"abc",
});</code></pre>
<p>When compiled this gives:</p>
<pre>perfect-hashing.zig:147:13: error: case value 'abc' not declared
            @compileError("case value '" ++ s ++ "' not declared");
            ^
perfect-hashing.zig:18:16: note: called from here
        ph.case("abc") =&gt; std.debug.warn("handle the abc case\n"),
               ^</pre>
<p>
It also has runtime safety if you pass in a string that it was not prepared for:
</p>
<pre><code class="language-zig">const std = @import("std");
const assert = std.debug.assert;

test "perfect hashing" {
    basedOnLength("zzz");
}

fn basedOnLength(target: []const u8) void {
    const ph = perfectHash([][]const u8{
        "a",
        "ab",
        "abc",
    });
    switch (ph.hash(target)) {
        ph.case("a") =&gt; @panic("wrong one a"),
        ph.case("ab") =&gt; {}, // test pass
        ph.case("abc") =&gt; @panic("wrong one abc"),
        else =&gt; unreachable,
    }
}</code></pre>
<p>When run this gives:</p>
<pre>Test 1/1 perfect hashing...attempt to perfect hash zzz which was not declared
perfect-hashing.zig:156:36: 0x205a21 in ??? (test)
                    std.debug.panic("attempt to perfect hash {} which was not declared", s);
                                   ^
perfect-hashing.zig:15:20: 0x2051bd in ??? (test)
    switch (ph.hash(target)) {
                   ^
perfect-hashing.zig:5:18: 0x205050 in ??? (test)
    basedOnLength("zzz");
                 ^
/home/andy/dev/zig/build/lib/zig/std/special/test_runner.zig:13:25: 0x2238ea in ??? (test)
        if (test_fn.func()) |_| {
                        ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:96:22: 0x22369b in ??? (test)
            root.main() catch |err| {
                     ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:70:20: 0x223615 in ??? (test)
    return callMain();
                   ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:64:39: 0x223478 in ??? (test)
    std.os.posix.exit(callMainWithArgs(argc, argv, envp));
                                      ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:37:5: 0x223330 in ??? (test)
    @noInlineCall(posixCallMainAndExit);
    ^

Tests failed. Use the following command to reproduce the failure:
/home/andy/dev/zig/build/zig-cache/test</pre>
<p>
So there's the API. How does it work?
</p>
<p>
Here's the implementation of the <code>perfectHash</code> function:
</p>
<pre><code class="language-zig">fn perfectHash(comptime strs: []const []const u8) type {
    const Op = union(enum) {
        /// add the length of the string
        Length,

        /// add the byte at index % len
        Index: usize,

        /// right shift then xor with constant
        XorShiftMultiply: u32,
    };
    const S = struct {
        fn hash(comptime plan: []Op, s: []const u8) u32 {
            var h: u32 = 0;
            inline for (plan) |op| {
                switch (op) {
                    Op.Length =&gt; {
                        h +%= @truncate(u32, s.len);
                    },
                    Op.Index =&gt; |index| {
                        h +%= s[index % s.len];
                    },
                    Op.XorShiftMultiply =&gt; |x| {
                        h ^= x &gt;&gt; 16;
                    },
                }
            }
            return h;
        }

        fn testPlan(comptime plan: []Op) bool {
            var hit = [1]bool{false} ** strs.len;
            for (strs) |s| {
                const h = hash(plan, s);
                const i = h % hit.len;
                if (hit[i]) {
                    // hit this index twice
                    return false;
                }
                hit[i] = true;
            }
            return true;
        }
    };

    var ops_buf: [10]Op = undefined;

    const plan = have_a_plan: {
        var seed: u32 = 0x45d9f3b;
        var index_i: usize = 0;
        const try_seed_count = 50;
        const try_index_count = 50;

        while (index_i &lt; try_index_count) : (index_i += 1) {
            const bool_values = if (index_i == 0) []bool{true} else []bool{ false, true };
            for (bool_values) |try_length| {
                var seed_i: usize = 0;
                while (seed_i &lt; try_seed_count) : (seed_i += 1) {
                    comptime var rand_state = std.rand.Xoroshiro128.init(seed + seed_i);
                    const rng = &amp;rand_state.random;

                    var ops_index = 0;

                    if (try_length) {
                        ops_buf[ops_index] = Op.Length;
                        ops_index += 1;

                        if (S.testPlan(ops_buf[0..ops_index]))
                            break :have_a_plan ops_buf[0..ops_index];

                        ops_buf[ops_index] = Op{ .XorShiftMultiply = rng.scalar(u32) };
                        ops_index += 1;

                        if (S.testPlan(ops_buf[0..ops_index]))
                            break :have_a_plan ops_buf[0..ops_index];
                    }

                    ops_buf[ops_index] = Op{ .XorShiftMultiply = rng.scalar(u32) };
                    ops_index += 1;

                    if (S.testPlan(ops_buf[0..ops_index]))
                        break :have_a_plan ops_buf[0..ops_index];

                    const before_bytes_it_index = ops_index;

                    var byte_index = 0;
                    while (byte_index &lt; index_i) : (byte_index += 1) {
                        ops_index = before_bytes_it_index;

                        ops_buf[ops_index] = Op{ .Index = rng.scalar(u32) % try_index_count };
                        ops_index += 1;

                        if (S.testPlan(ops_buf[0..ops_index]))
                            break :have_a_plan ops_buf[0..ops_index];

                        ops_buf[ops_index] = Op{ .XorShiftMultiply = rng.scalar(u32) };
                        ops_index += 1;

                        if (S.testPlan(ops_buf[0..ops_index]))
                            break :have_a_plan ops_buf[0..ops_index];
                    }
                }
            }
        }

        @compileError("unable to come up with perfect hash");
    };

    return struct {
        fn case(comptime s: []const u8) usize {
            inline for (strs) |str| {
                if (std.mem.eql(u8, str, s))
                    return hash(s);
            }
            @compileError("case value '" ++ s ++ "' not declared");
        }
        fn hash(s: []const u8) usize {
            if (std.debug.runtime_safety) {
                const ok = for (strs) |str| {
                    if (std.mem.eql(u8, str, s))
                        break true;
                } else false;
                if (!ok) {
                    std.debug.panic("attempt to perfect hash {} which was not declared", s);
                }
            }
            return S.hash(plan, s) % strs.len;
        }
    };
}</code></pre>
<p>
Here's what this is doing:
</p>
<ul>
  <li>Create the concept of a hashing operation. The operations that can happen are:
    <ul>
      <li><code>Length</code> - use wraparound addition to add the length of the string to the hash value.</li>
      <li><code>Index</code> - use wraparound addition to add the byte from the string at the specified index.
      Use modulus arithmetic in case the index is out of bounds.</li>
      <li><code>XorShiftMultiply</code> - xor the hash value with the specified constant shifted right by 16 bits
        (the constant comes from a random number generator executed at compile time).</li>
    </ul>
  </li>
  <li>Next, we're going to try a series of "plans", each a sequence of operations strung together.
    We have this function <code>testPlan</code> which performs the hash operations on all the strings
    and sees if there are any collisions. If we ever find a plan that results in no collisions, we have
    found a perfect hashing strategy.
  </li>
  <li>
    First we test a plan that is simply the <code>Length</code> operation. If this works, then all the hash
    function has to do is take the length of the string mod the number of strings. Easy. You may notice
    this is true for the above example. Don't worry, I have a more complicated example below.
  </li>
  <li>
    Next we iterate over how many different bytes we are willing to look at in the hash function,
    starting with 0. If xoring the length with a random constant fixes the collisions, then we're done.
    We try 50 seeds before giving up and adding the inspection of a random byte from the string to the hash.
    If inspecting one random byte doesn't solve the problem,
    we try 50 seeds in order to choose different random byte indexes. If that still doesn't work,
    we look at 2 random bytes, again trying 50 different seeds with these 2 bytes.
    And so on, until every combination of 50 seeds x 50 byte indexes has been tried, at which point we give up and
    emit the compile error "unable to come up with perfect hash".
  </li>
</ul>
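<p>
The search described above is easier to see in miniature. Here is a toy Python version of the same idea (runtime rather than comptime, and with a much simpler escalation than the real search; all names are illustrative):
</p>

```python
import random

def run_plan(plan, s):
    """Apply a sequence of hash ops to bytes `s`, mirroring the Op union."""
    h = 0
    for op, arg in plan:
        if op == "length":
            h = (h + len(s)) & 0xFFFFFFFF            # wraparound add of the length
        elif op == "index":
            h = (h + s[arg % len(s)]) & 0xFFFFFFFF   # wraparound add of one byte
        elif op == "xorshift":
            h ^= arg >> 16                           # xor with shifted random constant
    return h

def is_perfect(plan, strs):
    """True if every string lands in a distinct bucket (no collisions)."""
    buckets = {run_plan(plan, s) % len(strs) for s in strs}
    return len(buckets) == len(strs)

def find_plan(strs, tries=2000):
    """Escalating search: length alone first, then random constant + byte index."""
    rng = random.Random(0x45D9F3B)  # fixed seed, like the comptime search
    if is_perfect([("length", None)], strs):
        return [("length", None)]
    for _ in range(tries):
        plan = [("xorshift", rng.getrandbits(32)),
                ("index", rng.randrange(50))]
        if is_perfect(plan, strs):
            return plan
    raise RuntimeError("unable to come up with perfect hash")

words = [b"one", b"two", b"three", b"four", b"five"]
plan = find_plan(words)
print(sorted(run_plan(plan, w) % len(words) for w in words))  # [0, 1, 2, 3, 4]
```

For the earlier <code>"a"/"ab"/"abc"</code> example, <code>find_plan</code> stops at the length-only plan, just as the comptime search does.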
<p>We can use <code>@compileLog</code> to see the plan that the function came up with:</p>
<pre><code class="language-zig">for (plan) |op| {
    @compileLog(@TagType(Op)(op));
}</code></pre>
<p>This outputs:</p>
<pre>
| @TagType(Op).Length
</pre>
<p>So this means that the hash function only has to look at the length for this example.</p>
<p>The nice thing about this is that the plan is all known at compile time. Indeed, you can see
that the <code>hash</code> function uses <a href="https://ziglang.org/documentation/master/#inline-for">inline for</a>
to iterate over the operations in the plan.
This means that LLVM is able to fully optimize the hash function.
</p>
<p>Here's what it gets compiled to in release mode for x86_64:</p>
<pre>
0000000000000000 &lt;basedOnLength&gt;:
   0:	89 f0                	mov    %esi,%eax
   2:	b9 ab aa aa aa       	mov    $0xaaaaaaab,%ecx
   7:	48 0f af c8          	imul   %rax,%rcx
   b:	48 c1 e9 21          	shr    $0x21,%rcx
   f:	8d 04 49             	lea    (%rcx,%rcx,2),%eax
  12:	29 c6                	sub    %eax,%esi
  14:	48 89 f0             	mov    %rsi,%rax
  17:	c3                   	retq   
  18:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
  1f:	00 
</pre>
<p>You can see there is not even a jump instruction in there. And it will output numbers in sequential order from 0,
so that the switch statement can be a jump table. Here is the LLVM IR of the switch:</p>
<pre><code class="language-llvm"> %2 = call fastcc i64 @"perfectHash((struct []const []const u8 constant))_hash"(%"[]u8"* %0), !dbg !736
  store i64 %2, i64* %1, align 8, !dbg !736
  %3 = load i64, i64* %1, align 8, !dbg !740
  switch i64 %3, label %SwitchElse [
    i64 1, label %SwitchProng
    i64 2, label %SwitchProng1
    i64 0, label %SwitchProng2
  ], !dbg !740</code></pre>
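<p>
As an aside, the <code>0xaaaaaaab</code> constant in the compiled hash function above is the standard reciprocal-multiplication trick for dividing by 3 without a <code>div</code> instruction. A quick Python check of the same arithmetic the assembly performs:
</p>

```python
def mod3_via_reciprocal(n):
    """n % 3 the way the compiled hash computes it: multiply by the magic
    constant 0xaaaaaaab, shift right by 33 (the shr $0x21), multiply the
    quotient by 3 (the lea), subtract from n.  Valid for 32-bit unsigned n."""
    q = (n * 0xAAAAAAAB) >> 33   # q == n // 3 for all n below 2**32
    return n - 3 * q

assert all(mod3_via_reciprocal(n) == n % 3 for n in range(100000))
assert mod3_via_reciprocal(2**32 - 1) == (2**32 - 1) % 3
print(mod3_via_reciprocal(5))  # 2
```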
<p>
How about a harder example?
</p>
<pre><code class="language-zig">@setEvalBranchQuota(100000);
const ph = perfectHash([][]const u8{
    "one",
    "two",
    "three",
    "four",
    "five",
});
switch (ph.hash(target)) {
    ph.case("one") =&gt; std.debug.warn("handle the one case"),
    ph.case("two") =&gt; std.debug.warn("handle the two case"),
    ph.case("three") =&gt; std.debug.warn("handle the three case"),
    ph.case("four") =&gt; std.debug.warn("handle the four case"),
    ph.case("five") =&gt; std.debug.warn("handle the five case"),
    else =&gt; unreachable,
}</code></pre>
<p>
This example is interesting because there are 2 pairs of length collisions (one/two, four/five) and 2 pairs of first-byte collisions (two/three, four/five).
</p>
<p>
Here we have to use <a href="https://ziglang.org/documentation/master/#setEvalBranchQuota">@setEvalBranchQuota</a>
because it takes a bit of computation to come up with the answer.
</p>
<p>
Again, the hash function comes up with a mapping to 0, 1, 2, 3, 4 (but not necessarily in the same order as specified):
</p>
<pre><code class="language-llvm"> %2 = call fastcc i64 @"perfectHash((struct []const []const u8 constant))_hash.11"(%"[]u8"* %0), !dbg !749
  store i64 %2, i64* %1, align 8, !dbg !749
  %3 = load i64, i64* %1, align 8, !dbg !753
  switch i64 %3, label %SwitchElse [
    i64 4, label %SwitchProng
    i64 2, label %SwitchProng1
    i64 3, label %SwitchProng2
    i64 0, label %SwitchProng3
    i64 1, label %SwitchProng4
  ], !dbg !753</code></pre>
<p>
And the optimized assembly:
</p>
<pre>0000000000000020 &lt;basedOnOtherStuff&gt;:
  20:	48 83 fe 22          	cmp    $0x22,%rsi
  24:	77 0b                	ja     31 &lt;basedOnOtherStuff+0x11&gt;
  26:	b8 22 00 00 00       	mov    $0x22,%eax
  2b:	31 d2                	xor    %edx,%edx
  2d:	f7 f6                	div    %esi
  2f:	eb 05                	jmp    36 &lt;basedOnOtherStuff+0x16&gt;
  31:	ba 22 00 00 00       	mov    $0x22,%edx
  36:	0f b6 04 17          	movzbl (%rdi,%rdx,1),%eax
  3a:	05 ef 3d 00 00       	add    $0x3def,%eax
  3f:	b9 cd cc cc cc       	mov    $0xcccccccd,%ecx
  44:	48 0f af c8          	imul   %rax,%rcx
  48:	48 c1 e9 22          	shr    $0x22,%rcx
  4c:	8d 0c 89             	lea    (%rcx,%rcx,4),%ecx
  4f:	29 c8                	sub    %ecx,%eax
  51:	c3                   	retq   </pre>
<p>
We have a couple of jumps here, but no loop. This code looks at only 1 byte from
the target string to determine the corresponding case index.
We can see that more clearly with the <code>@compileLog</code> snippet from earlier:
</p>
<pre>| @TagType(Op).XorShiftMultiply
| @TagType(Op).Index</pre>
<p>
So it initializes the hash with a constant value of predetermined random bits, then adds
the byte from a randomly chosen index of the string. Using <code>@compileLog(plan[1].Index)</code>
I determined that it is choosing the value 34, which means that:
</p>
<ul>
  <li>For <code>one</code>, 34 % 3 == 1, it looks at the 'n'</li>
  <li>For <code>two</code>, 34 % 3 == 1, it looks at the 'w'</li>
  <li>For <code>three</code>, 34 % 5 == 4, it looks at the 'e'</li>
  <li>For <code>four</code>, 34 % 4 == 2, it looks at the 'u'</li>
  <li>For <code>five</code>, 34 % 4 == 2, it looks at the 'v'</li>
</ul>
<p>
So the perfect hash exploited the fact that these bytes are all different; combined with
the random constant it found, the final modulus spreads the values across the full
range of 0, 1, 2, 3, 4.
</p>
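<p>
That reasoning is easy to check mechanically. A short Python snippet confirming that the byte at index <code>34 % len</code> is distinct across all five strings:
</p>

```python
words = ["one", "two", "three", "four", "five"]
# 34 % 3 == 1, 34 % 5 == 4, 34 % 4 == 2: one selected byte per word
picked = {w: w[34 % len(w)] for w in words}
print(picked)
# All five selected bytes differ, which is what lets a single byte
# (plus the constant and final modulus) produce a collision-free mapping.
assert len(set(picked.values())) == len(words)
```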
<h3>Can we do better?</h3>
<p>
There are lots of ways this can be improved - this is just a proof of concept.
For example, we could restrict the search to byte indexes smaller than the
shortest string length. Had we used index 1 in the above example,
we could have avoided the jumps and the remainder division instructions in the hash function.
</p>
<p>
Another way this can be improved, to reduce compile times, is to have
the <code>perfectHash</code> function accept the RNG seed as a parameter.
If the seed worked first try, great. Otherwise the function would find the seed
that does work, and emit a compile error instructing the programmer to switch
the seed argument to the good one, saving time in future compilations.
</p>
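<p>
A hedged Python sketch of that caching idea, using a byte index in place of an RNG seed to keep the toy deterministic (the names and the error message are illustrative, not a real Zig API):
</p>

```python
def perfect_with_hint(strings, index_hint, max_search=64):
    """Toy stand-in for caching the comptime search result: try the
    caller-supplied byte index first; only search (and complain) if the
    cached hint no longer yields a collision-free mapping."""
    def works(idx):
        buckets = {s[idx % len(s)] % len(strings) for s in strings}
        return len(buckets) == len(strings)

    if works(index_hint):
        return index_hint  # fast path: no search, so compilation stays fast
    for candidate in range(max_search):
        if works(candidate):
            # the real version would @compileError with a message like this
            raise ValueError("hint failed; switch the argument to %d" % candidate)
    raise ValueError("no perfect index found")

words = [b"one", b"two", b"three", b"four", b"five"]
print(perfect_with_hint(words, 34))  # 34: the cached hint still works
```

If the string set changes and the hint stops working, the error message hands the programmer the replacement value to hard-code.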
<h3>Conclusion</h3>
<p>
If you like this, you should check out <a href="https://ziglang.org/">Zig</a>.
Consider <a href="https://github.com/users/andrewrk/sponsorship">becoming a sponsor</a>.
</p>
]]></description>
      </item>
      <item>
         <title>I Quit My Cushy Job at OkCupid to Live on Donations to Zig</title>
         <pubDate>Thu, 07 Jun 2018 14:20:30 GMT</pubDate>

         <link>https://andrewkelley.me/post/full-time-zig.html</link>
         <guid>https://andrewkelley.me/post/full-time-zig.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
</script>
<h1>I Quit My Cushy Job at OkCupid to Live on Donations to Zig</h1>
<p>
I am fortunate to be one of those people who started tinkering with code
in their teens, and then found themselves in the position where their hobby
could not only pay rent, but afford a comfortable standard of living, even
in New York City.
</p>
<p>
For a little over a year, I worked on the backend team at
<a href="https://www.okcupid.com/">OkCupid</a>, maintaining a large C++
codebase that has not been reworked since 2005 when
<a href="https://twitter.com/maxtaco">Maxwell Krohn</a>
published the original
<a href="https://pdos.csail.mit.edu/papers/okws:krohn-ms/okws_krohn-ms.pdf">OKWS paper</a>
on which OkCupid is still based.
</p>
<p>
As satisfying as it is to delete dead code, rework abstractions to remove footguns,
and migrate a monolithic codebase to
<a href="https://en.wikipedia.org/wiki/Service-oriented_architecture">Service Oriented Architecture</a>,
my coworkers were well aware that my real passion was spent on nights and weekends,
where I poured all the energy I could muster into <a href="https://ziglang.org/">Zig</a> -
if not from my incessant comparisons between snippets of C++ code and how it could be
better expressed in Zig, then from the tea mug that my lovely girlfriend Alee got me
as a celebratory gift for reaching
<a href="https://ziglang.org/download/#release-0.1.1">release 0.1.0</a>:
</p>
<img src="https://andrewkelley.me/img/zig-mug.jpg" style="width: 300px">
<h2>Zig is Picking Up Steam</h2>
<p>
Ever since I gave this
<a href="https://www.recurse.com/events/localhost-andrew-kelley">Software Should be Perfect</a> talk, it seems that the Zig community has seen a steady influx of new members. 
</p>
<img src="https://andrewkelley.me/img/david-zig.jpg" style="width: 200px">
<p>David keeps sending me selfies with this hat that he made.</p>
<p>
The Zig community grew so much that some weekends I found myself
with only enough time to merge pull requests and respond to issues, and no
time left over to make progress on big roadmap changes.
</p>
<p>
I found this both exciting, and frustrating. Here I was working on legacy code
40 hours per week, while multiple Zig issues needed my attention.
</p>
<p>
And so I took the plunge. I gave up my senior backend software engineer salary,
and will attempt to live on <a href="https://github.com/users/andrewrk/sponsorship">monthly donations</a>.
</p>
<h2>Roadmap</h2>
<p>Here are some of the bigger items that are coming up now that I have more time:</p>
<ul>
  <li><a href="https://github.com/ziglang/zig/issues/1023">remove more sigils from the language</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/208">add tuples and remove var args</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/89">self-hosted compiler</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/910">http server and http client based on async/await</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/943">decentralized package manager</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/21">generate html documentation</a></li>
  <li><a href="https://github.com/ziglang/zig/issues/68">hot code swapping</a></li>
</ul>
<p>I can't wait to make significant progress on these items. That said, I will also
dedicate time each day to fixing bugs and writing documentation.
Every weekday except Friday, this will be my wakeup process:
</p>
<ol>
  <li>Make tea</li>
  <li>Fix a bug</li>
  <li>Write some documentation</li>
  <li>Deal with all open pull requests, whether that is by merging, rejecting, requesting
    changes, or solving the use case a different way.</li>
  <li>Proceed with working on a big feature item.</li>
</ol>
<p>
On Fridays, I will spend the day working on a project other than Zig, but one that
is implemented in Zig. For example:
</p>
<ul>
  <li>Rewrite <a href="https://github.com/andrewrk/groovebasin">Groove Basin</a> in Zig.</li>
  <li>Work on my <a href="https://github.com/andrewrk/clashos">game that the raspberry pi boots directly into</a> (this explores Zig's ability to be used for embedded and Operating System development) </li>
  <li>Start reworking my <a href="http://genesisdaw.org/">digital audio workstation</a> in Zig.</li>
</ul>
<p>
I'm beyond excited to get Zig to a place where it can reasonably be used for
high quality and practical software projects. My goal is to make Zig so useful
and practical that people will find themselves using it without intending to.
</p>
<p>And so it comes down to this - will you fund my efforts?
</p>

<h2>Why Donate to Zig?</h2>
<p>
Consider the basic premise of a for-profit business:
</p>
<ul>
  <li>Customers pay currency in exchange for goods or services.</li>
</ul>
<p>
If all goes well, both parties benefit from the exchange. But for-profit
businesses are motivated, at the end of the day, not by customer satisfaction,
but by profit.
</p>
<p>
It's fine, this is how the world works. But it's not how Zig works.
</p>
<p>
Zig is created by the open source community, for the open source community.
One of our main <a href="https://ziglang.org/documentation/master/#Zen">tenets</a>
is</p>
<blockquote>Together we serve end users.</blockquote>
<p>
The Zig project is a public service, entirely motivated by improving the technical
landscape of open source tools. There is no board to please, no stock shareholders
demanding higher quarterly earnings, no office politics or career progressions on
the line. 100% of donations I receive go towards paying rent, buying food, and
generally attempting to live a modest, but healthy life.
</p>
<p>
<a href="https://github.com/users/andrewrk/sponsorship">Will you pledge $5/month?</a>
</p>
]]></description>
      </item>
      <item>
         <title>Zig: January 2018 in Review</title>
         <pubDate>Sun, 11 Feb 2018 06:54:50 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-january-2018-in-review.html</link>
         <guid>https://andrewkelley.me/post/zig-january-2018-in-review.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|catch|try|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime|noinline)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
</script>
<h1>Zig: January 2018 in Review</h1>

<p>
One month (and a few days, sorry I'm late!) has passed since I did the
<a href="zig-december-2017-in-review.html">December 2017 writeup</a>, and so it's time for
another month-in-review for <a href="https://github.com/users/andrewrk/sponsorship">my esteemed sponsors</a>.
</p>

<h2>LLVM 6 Readiness</h2>
<p><a href="http://prereleases.llvm.org/6.0.0/">LLVM 6.0.0rc2</a> was just announced on the mailing list.
It's scheduled to be released on February 21, and Zig is ready.
I plan to have Zig release 0.2.0 one week after LLVM 6 comes out.
We already have all tests passing with debug builds of LLVM 6 in the
<a href="https://github.com/zig-lang/zig/tree/llvm6">llvm6 branch</a>,
but that extra week is for a bug-stomping rampage.
</p>
<p>
After that, all the 0.3.0 milestone issues get postponed to 0.4.0, and all the
0.2.0 milestone issues get moved to 0.3.0.
</p>
<p>
0.2.0 will be an exciting release because, among many other things, it enables
source-level debugging with MSVC on Windows.
</p>
<p>
Zig is once again in the <a href="http://prereleases.llvm.org/6.0.0/rc2/docs/ReleaseNotes.html#zig-programming-language">release notes of LLVM</a>, so we should see a slight increase in community size when LLVM release notes hit the tech news headlines.
</p>
<h2>Error Syntax Cleanup</h2>
<p>
One of the biggest complaints from newcomers to Zig was about its error handling sigils.
Given this, I made an effort to choose friendlier syntax.
</p>
<ul>
  <li><code>%return</code> is replaced with <code>try</code></li>
  <li><code>%defer</code> is replaced with <code>errdefer</code></li>
  <li><code>a %% b</code> is replaced with <code>a catch b</code></li>
  <li><code>%%x</code> is removed entirely to discourage its use.
    You can get an equivalent effect with <code>x catch unreachable</code>,
    which has been updated to understand that it was attempting to unwrap an error union.
    See <a href="https://github.com/zig-lang/zig/issues/545">#545</a>  and <a href="https://github.com/zig-lang/zig/issues/510">#510</a></li>
</ul>
<p>
After these changes, there is a strong pattern that only keywords can modify control flow.
For example we have <code>and</code> and <code>or</code> instead of <code>&amp;&amp;</code> and <code>||</code>. There is one last exception, which is <code>a ?? b</code>. Maybe it's okay, since <a href="https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/null-conditional-operator">C# set a precedent</a>.
</p>
<p>
An even bigger change is coming soon which I'm calling
<a href="https://github.com/zig-lang/zig/issues/632">Error Sets</a>.
</p>
<h2>Error Return Traces</h2>
<p>
I'm really excited about this one. I invented a new kind of debugging tool and integrated it into Debug and ReleaseSafe builds.
</p>
<p>
One of the concerns with removing the <code>%%</code> prefix operator was that it was
just so gosh darn convenient to get a stack trace right at the moment where you
asserted that a value did not have an error. I wanted to make it so that programmers
could use <code>try</code> everywhere and still get the debuggability benefit when
an error occurred.
</p>
<p>
Watch this:
</p>
<pre><code class="language-zig">const std = @import("std");

pub fn main() !void {
    const allocator = std.debug.global_allocator;

    const args = try std.os.argsAlloc(allocator);
    defer std.os.argsFree(allocator, args);

    const count = try parseFile(allocator, args[1]);

    if (count &lt; 10) return error.NotEnoughItems;
}

fn parseFile(allocator: &amp;std.mem.Allocator, file_path: []const u8) !usize {
    const contents = std.io.readFileAlloc(allocator, file_path) catch return error.UnableToReadFile;
    defer allocator.free(contents);

    return contents.len;
}</code></pre>
<p>
Here's a simple program with a bunch of different ways that errors could get returned
from <code>main</code>. In our test example, we're going to open a bogus file that
does not exist.
</p>
<pre>
$ zig build-exe test2.zig
$ ./test2 bogus-does-not-exist.txt
error: UnableToReadFile
/home/andy/dev/zig/build/lib/zig/std/os/index.zig:301:33: 0x000000000021acd0 in ??? (test2)
                posix.ENOENT =&gt; return PosixOpenError.PathNotFound,
                                ^
/home/andy/dev/zig/build/lib/zig/std/os/file.zig:25:24: 0x00000000002096f6 in ??? (test2)
            const fd = try os.posixOpen(allocator, path, flags, 0);
                       ^
/home/andy/dev/zig/build/lib/zig/std/io.zig:267:16: 0x000000000021ebec in ??? (test2)
    var file = try File.openRead(allocator, path);
               ^
/home/andy/dev/zig/build/test2.zig:15:71: 0x000000000021ce72 in ??? (test2)
    const contents = std.io.readFileAlloc(allocator, file_path) catch return error.UnableToReadFile;
                                                                      ^
/home/andy/dev/zig/build/test2.zig:9:19: 0x000000000021c1f9 in ??? (test2)
    const count = try parseFile(allocator, args[1]);
                  ^
</pre>
<p>
I'm going to include a picture of the above here, because it looks a lot better with
terminal colors:
</p>
<img src="https://superjoe.s3.amazonaws.com/blog-files/zig-january-2018-in-review/error-return-traces.png">
<p>
This is not a stack trace snapshot from when an error was "created". This is a <strong>return trace</strong> of all the points in the code where an error was returned from a function.
</p>
<p>
Note that if the trace only showed where the error we ultimately received - <code>UnableToReadFile</code> - was created, we would see only the bottom 2 items in the trace.
Instead, we get the full history of the error as it bubbled up,
right down to the fact that we received <code>ENOENT</code> from <code>open</code>.
</p>
<p>
With this in place, programmers can comfortably use <code>try</code> everywhere, safe
in the knowledge that it will be straightforward to troubleshoot the origin of any error
bubbling up through the system.
</p>
<p>
I hope you're skeptically wondering, OK, what's the tradeoff in terms of binary size, performance, and memory?
</p>
<p>
First of all, this feature is disabled in ReleaseFast mode, so in that case the
answer is: literally no cost. But what about Debug and ReleaseSafe builds?
</p>
<p>
To analyze performance cost, there are two cases:
</p>
<ul>
  <li>when no errors are returned</li>
  <li>when returning errors</li>
</ul>
<p>
For the case when no errors are returned, the cost is a single memory write operation, and only in the first non-failable function in the call graph that calls a failable function - that is, when a function that cannot return an error calls a function that can.
This write initializes this struct in the stack memory:
</p>
<pre><code class="language-zig">pub const StackTrace = struct {
    index: usize,
    instruction_addresses: [N]usize,
};</code></pre>
<p>
Here, N is the maximum function call depth as determined by call graph analysis. Recursion is ignored and counts as 2.
</p>
<p>
A pointer to <code>StackTrace</code> is passed as a secret parameter to every function that can return an error, but it's always the first parameter, so it can likely sit in a register and stay there.
</p>
<p>
That's it for the path when no errors occur. It's practically free in terms of performance.
</p>
<p>
When generating the code for a function that returns an error, just before the <code>return</code> statement (only for the <code>return</code> statements that return errors), Zig generates a call to this function:
</p>
<pre><code class="language-zig">noinline fn __zig_return_error(stack_trace: &amp;StackTrace) void {
    stack_trace.instruction_addresses[stack_trace.index] = @returnAddress();
    stack_trace.index = (stack_trace.index + 1) % N;
}</code></pre>
<p>
The cost is 2 math operations plus some memory reads and writes. The memory accessed is constrained and should remain cached for the duration of the error return bubbling.
</p>
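<p>
The <code>% N</code> wrap above makes the trace a ring buffer: once more than N errors have been returned, the oldest return addresses are overwritten. For illustration only, the same recording logic can be modeled in C (the names and the value of N here are stand-ins for what the compiler actually generates):
</p>
<pre><code class="language-c">#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

#define N 4 /* stand-in for the computed maximum call depth */

typedef struct {
    size_t index;
    uintptr_t instruction_addresses[N];
} StackTrace;

/* mirrors __zig_return_error: record the return address, wrapping at N */
static void record_error_return(StackTrace *st, uintptr_t ret_addr) {
    st-&gt;instruction_addresses[st-&gt;index] = ret_addr;
    st-&gt;index = (st-&gt;index + 1) % N;
}

int main(void) {
    StackTrace st = {0, {0}};
    for (uintptr_t addr = 100; addr &lt; 106; addr++) /* 6 returns, N == 4 */
        record_error_return(&amp;st, addr);
    /* the two oldest entries (100, 101) have been overwritten by 104, 105 */
    for (size_t i = 0; i &lt; N; i++)
        printf("%lu\n", (unsigned long)st.instruction_addresses[i]);
    return 0;
}</code></pre>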
<p>
As for code size cost, one function call before a return statement is no big deal. Even so,
I have <a href="https://github.com/zig-lang/zig/issues/690">a plan</a> to make the call to
<code>__zig_return_error</code> a tail call, which brings the code size cost down to zero: what would be a return statement without error return tracing becomes a jump instruction with error return tracing.
</p>
<p>
There are a few ways to activate this error return tracing feature:
</p>
<ul>
  <li>Return an error from main</li>
  <li>An error makes its way to <code>catch unreachable</code> and you have not overridden the default panic handler</li>
  <li>Use <a href="http://ziglang.org/documentation/master/#errorReturnTrace">@errorReturnTrace</a> to access the current return trace. You can use <code>std.debug.dumpStackTrace</code> to print it. This function returns comptime-known <code>null</code> when building without error return tracing support.</li>
</ul>

<p>Related issues: <a href="https://github.com/zig-lang/zig/issues/651">#651</a>  <a href="https://github.com/zig-lang/zig/issues/684">#684</a>
</p>

<h2>Documentation</h2>
<p>
Big news on the documentation front.
</p>
<p>
All the outdated docs are fixed, and we now have an automatic
<a href="https://github.com/zig-lang/zig/blob/46aa416c48c283849059292267ac25a6d0db76d6/doc/docgen.zig">docgen tool</a>
which:
</p>
<ul>
  <li>Automatically generates the table of contents</li>
  <li>Validates all internal links</li>
  <li>Validates all code examples</li>
  <li>Turns terminal coloring of stack traces and compile errors into HTML</li>
</ul>
<p>The tool is, of course, written in Zig. <a href="https://github.com/zig-lang/zig/issues/465">#465</a></p>
<p>In addition to the above, the following improvements were made to the documentation:</p>
<ul>
  <li>Added documentation for <a href="http://ziglang.org/documentation/master/#noInlineCall">@noInlineCall</a></li>
  <li>Added documentation for <a href="http://ziglang.org/documentation/master/#extern-enum">extern enum</a></li>
  <li>Improved the documentation styling</li>
  <li>Made the documentation a single file that has no external dependencies</li>
  <li>Added the documentation to AppVeyor build artifacts as <code>langref.html</code>. In other words, we ship with the docs now.</li>
</ul>
<p><strong>Marc Tiehuis</strong> improved documentation styling for mobile devices. <a href="https://github.com/zig-lang/zig/issues/729">#729</a></p>
<ul>
  <li> No overscrolling on small screens</li>
  <li> Font-size is reduced for more content per screen</li>
  <li> Tables + Code blocks scroll within a block to avoid page-widening</li>
</ul>
<p>
There is still much more to document, before we have achieved <a href="https://github.com/zig-lang/zig/issues/367">basic documentation for everything</a>.
</p>

<h2>Self-Hosted Compiler</h2>
<p>The self-hosted compiler now builds successfully on Windows and macOS.</p>
<p>The main test suite builds the self-hosted compiler.</p>
<p>The self-hosted build inherits the std lib file list and C header file list from the stage1 cmake build, as well as the <code>llvm-config</code> output. So if you get stage1 to build,
stage2 will reliably build as well.</p>

<h2>Windows 32-bit Support Status</h2>
<p>
Windows 32-bit mostly works, but there are some failing tests. The number of failing tests
grew and it didn't seem fair to claim that we supported it officially.
</p>
<p>
So I removed the claims that we support Windows 32-bit from the README, and removed
32-bit Windows from the testing matrix.
</p>
<p>
<a href="https://github.com/zig-lang/zig/issues/537">We still want to support Windows 32-bit</a>.
</p>

<h2>Syntax: Mandatory Function Return Type </h2>
<p><code>-&gt;</code> is removed, and all functions require an explicit return type.</p>
<p>The purpose of this is:</p>
<ul>
  <li> Only one way to do things</li>
  <li> Changing a function with void return type to return a possible
       error becomes a 1 character change, subtly encouraging
       people to use errors.</li>
</ul>
  <p>
    Here are some imperfect sed commands for performing this update:
  </p>
  <p>
    remove arrow:
  </p>
  <pre>sed -i 's/\(\bfn\b.*\)-&gt; /\1/g' $(find . -name "*.zig")</pre>
  <p>
    add void:
  </p>
  <pre>sed -i 's/\(\bfn\b.*\))\s*{/\1) void {/g' $(find . -name "*.zig")</pre>
  <p>
    Some cleanup may be necessary, but this should do the bulk of the work.
  </p>

    <p>This has been a controversial change, and <a href="https://github.com/zig-lang/zig/issues/760">may be reverted</a>.</p>

<h2>Generating .h Files</h2>
<ul>
  <li>Zig now emits compile errors for non-extern, non-packed structs, enums, and unions in <code>extern</code> fn signatures.</li>
  <li>Zig generates .h file content for <code>extern</code> structs, enums, and unions.</li>
  <li>.h file generation is now tested in the main test suite.</li>
</ul>
<p>Marc Tiehuis added array type handling:</p>
<pre><code class="language-zig">const Foo = extern struct {
    A: [2]i32,
    B: [4]&amp;u32,
};
export fn entry(foo: Foo, bar: [3]u8) void { }</code></pre>
<p>This generates:</p>
<pre><code class="language-c">struct Foo {
    int32_t A[2];
    uint32_t * B[4];
};

TEST_EXPORT void entry(struct Foo foo, uint8_t bar[]);</code></pre>

<h2>Translating C to Zig</h2>
<p><strong>Jimmi Holst Christensen</strong> <a href="https://github.com/zig-lang/zig/pull/695">improved translate-c</a>:</p>
<ul>
  <li> output "undefined" on uninitialized variables</li>
  <li> correct translation of if statements on integers and floats</li>
</ul>

<h2>Crypto Additions to Zig std lib</h2>
<p><strong>Marc Tiehuis</strong> added a bunch of crypto functions. Along the way, he also:</p>
<ul>
<li>added hardware sqrt for x86_64.
  (See <a href="https://github.com/zig-lang/zig/issues/681">#681</a>)</li>
<li>fixed bitrotted endian swapping std lib code
  (See <a href="https://github.com/zig-lang/zig/issues/682">#682</a>)</li>
</ul>
<h3>Integer Rotation Functions</h3>
<pre><code class="language-zig">/// Rotates right. Only unsigned values can be rotated.
/// Negative shift values result in shift modulo the bit count.
pub fn rotr(comptime T: type, x: T, r: var) -&gt; T { ... }

test "math.rotr" {
    assert(rotr(u8, 0b00000001, usize(0))  == 0b00000001);
    assert(rotr(u8, 0b00000001, usize(9))  == 0b10000000);
    assert(rotr(u8, 0b00000001, usize(8))  == 0b00000001);
    assert(rotr(u8, 0b00000001, usize(4))  == 0b00010000);
    assert(rotr(u8, 0b00000001, isize(-1)) == 0b00000010);
}

/// Rotates left. Only unsigned values can be rotated.
/// Negative shift values result in shift modulo the bit count.
pub fn rotl(comptime T: type, x: T, r: var) -&gt; T { ... }

test "math.rotl" {
    assert(rotl(u8, 0b00000001, usize(0))  == 0b00000001);
    assert(rotl(u8, 0b00000001, usize(9))  == 0b00000010);
    assert(rotl(u8, 0b00000001, usize(8))  == 0b00000001);
    assert(rotl(u8, 0b00000001, usize(4))  == 0b00010000);
    assert(rotl(u8, 0b00000001, isize(-1)) == 0b10000000);
}</code></pre>
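<p>
For reference, the rotation semantics the tests above exercise - the shift amount is reduced modulo the bit count, and a negative amount rotates the other way - can be sketched in C. <code>rotr8</code> and <code>rotl8</code> are hypothetical helper names, not part of any library:
</p>
<pre><code class="language-c">#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

/* hypothetical C sketch of rotr/rotl for u8: the shift amount is
   reduced modulo the bit count, and a negative amount rotates the
   opposite direction */
static uint8_t rotr8(uint8_t x, int r) {
    unsigned s = (unsigned)(((r % 8) + 8) % 8);
    return (uint8_t)((x &gt;&gt; s) | (x &lt;&lt; ((8u - s) % 8u)));
}

static uint8_t rotl8(uint8_t x, int r) {
    unsigned s = (unsigned)(((r % 8) + 8) % 8);
    return (uint8_t)((x &lt;&lt; s) | (x &gt;&gt; ((8u - s) % 8u)));
}

int main(void) {
    printf("%d\n", rotr8(0x01, 9) == 0x80);  /* 9 wraps to 1 */
    printf("%d\n", rotr8(0x01, -1) == 0x02); /* negative rotates left */
    printf("%d\n", rotl8(0x01, 9) == 0x02);
    printf("%d\n", rotl8(0x01, -1) == 0x80);
    return 0;
}</code></pre>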
<h3>MD5 and SHA1 Hash Functions</h3>
<p><a href="https://github.com/zig-lang/zig/pull/686">Marc writes</a>:</p>
<p>Some performance comparisons to C.</p>

<p>We take the fastest time measurement across multiple runs.</p>

<p>The block hashing functions use the same md5/sha1 methods.</p>

<pre><code>Cpu: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
Gcc: 7.2.1 20171224
Clang: 5.0.1
Zig: 0.1.1.304f6f1d
</code></pre>

<p>See <a href="https://www.nayuki.io/page/fast-md5-hash-implementation-in-x86-assembly">https://www.nayuki.io/page/fast-md5-hash-implementation-in-x86-assembly</a>:</p>

<pre><code>gcc -O2
    661 Mb/s
clang -O2
    490 Mb/s
zig --release-fast and zig --release-safe
    570 Mb/s
zig
    50 Mb/s
</code></pre>

<p>See <a href="https://www.nayuki.io/page/fast-sha1-hash-implementation-in-x86-assembly">https://www.nayuki.io/page/fast-sha1-hash-implementation-in-x86-assembly</a>
:</p>

<pre><code>gcc -O2
    588 Mb/s
clang -O2
    563 Mb/s
zig --release-fast and zig --release-safe
    610 Mb/s
zig
    21 Mb/s
</code></pre>

<p>In short, zig provides pretty useful tools for writing this sort of
code. We are in the lead against clang (which uses the same LLVM
backend), and we are slower only on md5 with GCC.</p>

<h3>SHA-2 Functions</h3>
<p><a href="https://github.com/zig-lang/zig/pull/687">Marc writes</a>:</p>
<p>We take the fastest time measurement across multiple runs. Tested
across multiple compiler flags, with the best result chosen.</p>

<pre><code>Cpu: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
Gcc: 7.2.1 20171224
Clang: 5.0.1
Zig: 0.1.1.304f6f1d
</code></pre>

<p>See <a href="https://www.nayuki.io/page/fast-sha2-hashes-in-x86-assembly">https://www.nayuki.io/page/fast-sha2-hashes-in-x86-assembly</a>.</p>

<p>-- Sha256</p>

<pre><code>Gcc -O2
    219 Mb/s
Clang -O2
    213 Mb/s
Zig --release-fast
    284 Mb/s
Zig --release-safe
    211 Mb/s
Zig
    6 Mb/s
</code></pre>

<p>-- Sha512</p>

<pre><code>Gcc -O2
    350 Mb/s
Clang -O2
    354 Mb/s
Zig --release-fast
    426 Mb/s
Zig --release-safe
    300 Mb/s
Zig
    11 Mb/s
</code></pre>

<h3>Blake2 Hash Functions</h3>
<p><a href="https://github.com/zig-lang/zig/pull/689">Marc writes</a>:</p>

<p>Blake performance numbers for reference:</p>

<pre><code>Cpu: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
</code></pre>

<p>-- Blake2s</p>

<pre><code>Zig --release-fast
    485 Mb/s
Zig --release-safe
    377 Mb/s
Zig
    11 Mb/s
</code></pre>

<p>-- Blake2b</p>

<pre><code>Zig --release-fast
    616 Mb/s
Zig --release-safe
    573 Mb/s
Zig
    18 Mb/s
</code></pre>

<h3>Sha3 Hashing Functions</h3>
<p><a href="https://github.com/zig-lang/zig/pull/696">Marc writes:</a></p>
<p>These are on the slower side and could be improved. No performance optimizations
have been done yet.</p>

<pre><code>Cpu: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
</code></pre>

<p>-- Sha3-256</p>

<pre><code>Zig --release-fast
    93 Mb/s
Zig --release-safe
    99 Mb/s
Zig
    4 Mb/s
</code></pre>

<p>-- Sha3-512</p>

<pre><code>Zig --release-fast
    49 Mb/s
Zig --release-safe
    54 Mb/s
Zig
    2 Mb/s
</code></pre>

<p>Interestingly, release-safe is producing slightly better code than
release-fast.</p>

<h2>Improvements</h2>
<ul>
  <li>The return type of <code>main</code> can now be <code>void</code>, <code>noreturn</code>, <code>u8</code>, or an error union. <a href="https://github.com/zig-lang/zig/issues/535">#535</a></li>
  <li>Implemented bigint div and rem. <a href="https://github.com/zig-lang/zig/issues/405">#405</a></li>
  <li>Removed coldcc keyword and added <a href="http://ziglang.org/documentation/master/#setCold">@setCold</a>. <a href="https://github.com/zig-lang/zig/issues/661">#661</a></li>
  <li>Renamed "debug safety" to "runtime safety". <a href="https://github.com/zig-lang/zig/issues/437">#437</a></li>
  <li>Updated windows build to use llvm 5.0.1.
    <a href="http://lists.llvm.org/pipermail/llvm-dev/2018-January/120153.html">Reported usability issue regarding diaguids.lib to llvm-dev.</a></li>
  <li>Implemented <code>std.os.selfExePath</code> and <code>std.os.selfExeDirPath</code> for windows.</li>
  <li>Added more test coverage.</li>
  <li>
    The same string literal codegens to the same constant.
    This makes it so that you can send the same string literal
    as a comptime slice and get the same type.
  </li>
</ul>
<p>Zig now supports structs defined inside a function that reference local constants:</p>
<pre><code class="language-zig">const assert = @import("std").debug.assert;

test "struct inside function" {
    const BlockKind = u32;

    const Block = struct {
        kind: BlockKind,
    };

    var block = Block { .kind = 1234 };

    block.kind += 1;

    assert(block.kind == 1235);
}</code></pre>
<p>This fixed <a href="https://github.com/zig-lang/zig/issues/672">#672</a> and <a href="https://github.com/zig-lang/zig/issues/552">#552</a>. However, there is still issue <a href="https://github.com/zig-lang/zig/issues/675">#675</a>: structs inside
functions get named after the function they are in:</p>
<pre><code class="language-zig">test "struct inside function" {
    const Block = struct { kind: u32 };
    @compileLog(@typeName(Block));
}</code></pre>
<p>When executed, this gives:</p>
<pre>
| "struct inside function()"
</pre>
<p>
Moving on, Zig now allows enum tag values to not be in parentheses:
</p>
<pre><code class="language-zig">const EnumWithTagValues = enum(u4) {
    A = 1 &lt;&lt; 0,
    B = 1 &lt;&lt; 1,
    C = 1 &lt;&lt; 2,
    D = 1 &lt;&lt; 3,
};</code></pre>
<p>Previously this required <code>A = (1 &lt;&lt; 0)</code>.</p>

<h2>Bug Fixes</h2>
<ul>
  <li>fix <code>expm1</code> implementation by using <code>@setFloatMode</code> and modular arithmetic</li>
  <li>fix compiler crash related to <code>@alignOf</code></li>
  <li>fix null debug info for 0-length array type. <a href="https://github.com/zig-lang/zig/issues/702">#702</a></li>
  <li>fix compiler not able to rename files into place on windows if the file already existed</li>
  <li>fix crash when switching on enum with 1 field and no switch prongs. <a href="https://github.com/zig-lang/zig/issues/712">#712</a></li>
  <li>fix crash on union-enums with only 1 field. <a href="https://github.com/zig-lang/zig/issues/713">#713</a></li>
  <li>fix crash when align 1 field before self referential align 8 field as slice return type. <a href="https://github.com/zig-lang/zig/issues/723">#723</a></li>
  <li>fix error message mentioning <code>unreachable</code> instead of <code>noreturn</code></li>
  <li>fix std.io.readFileAllocExtra incorrectly returning <code>error.EndOfStream</code></li>
  <li>workaround for microsoft releasing windows SDK with the wrong version</li>
  <li>Found a bug in NewGVN. Disabled it to match clang and filed an llvm bug.</li>
  <li>emit compile error for @panic called at compile time. <a href="https://github.com/zig-lang/zig/issues/706">#706</a></li>
  <li>emit compile error for shifting by negative comptime integer. <a href="https://github.com/zig-lang/zig/issues/698">#698</a></li>
  <li>emit compile error for calling naked function. @ptrCast a naked function first to call it.</li>
  <li>emit compile error for duplicate struct, enum, union fields. <a href="https://github.com/zig-lang/zig/issues/730">#730</a></li>
</ul>

 
 <h2>Thank you contributors!</h2>
 <ul>
   <li><strong>Jimmi Holst Christensen</strong> fixed bitrotted code: <code>std.Rand.scalar</code> and <code>std.endian.swap</code>.
     (See <a href="https://github.com/zig-lang/zig/issues/674">#674</a>)</li>
   <li><strong>Andrea Orru</strong> removed the deprecated Darwin target and added Zen OS target.
     (See <a href="https://github.com/zig-lang/zig/issues/438">#438</a>)</li>
   <li><strong>Andrea Orru</strong> added intrusive linked lists to the standard library.
     (See <a href="https://github.com/zig-lang/zig/issues/680">#680</a>)</li>
   <li><strong>Jimmi Holst Christensen</strong> fixed bigint xor with zero.
     (See <a href="https://github.com/zig-lang/zig/issues/701">#701</a>)</li>
   <li><strong>Jimmi Holst Christensen</strong> implemented windows versions of <code>seekTo</code> and <code>getPos</code>.
     (See <a href="https://github.com/zig-lang/zig/issues/710">#710</a>)</li>
   <li><strong>Jeff Fowler</strong> sent a pull request to GitHub/linguist to add Zig syntax highlighting.
     (See <a href="https://github.com/github/linguist/pull/4005">github/linguist/pull/4005</a>)</li>
   <li><strong>Jeff Fowler</strong> Added Zig 0.1.1 to Homebrew.
     (See <a href="https://github.com/Homebrew/homebrew-core/pull/23189">Homebrew/homebrew-core/pull/23189</a>)</li>
 </ul>

   <h2>Thank you financial supporters!</h2>

<p>
Special thanks to those who <a href="https://github.com/users/andrewrk/sponsorship">donate monthly</a>. We're now at $207 of the $3,000 goal.
</p>

<ul>
  <li>Lauren Chavis</li>
  <li>Andrea Orru</li>
  <li>Adrian Sinclair</li>
  <li>David Joseph</li>
  <li>jeff kelley</li>
  <li>Hasen Judy</li>
  <li>Wesley Kelley</li>
  <li>Harry Eakins</li>
  <li>Richard Ohnemus</li>
  <li>Brendon Scheinman</li>
  <li>Martin Schwaighofer</li>
  <li>Matthew </li>
  <li>Mirek Rusin</li>
  <li>Jordan Torbiak</li>
  <li>Pyry Kontio</li>
  <li>Thomas Ballinger</li>
  <li>Peter Ronnquist</li>
  <li>Luke McCarthy</li>
  <li>Robert Paul Herman</li>
  <li>Audun Wilhelmsen</li>
  <li>Marko Mikulicic</li>
  <li>Jimmi Holst Christensen</li>
  <li>Caius </li>
  <li>Don Poor</li>
  <li>Anthony J. Benik</li>
  <li>David Hayden</li>
  <li>Tanner Schultz</li>
  <li>Tyler Philbrick</li>
  <li>Eduard Nicodei</li>
  <li>Christopher A. Butler</li>
  <li>Colleen Silva-Hayden</li>
  <li>Jeremy Larkin</li>
  <li>Rasmus Rønn Nielsen</li>
  <li>Brian Lewis</li>
  <li>Tom Palmer</li>
  <li>Josh McDonald</li>
  <li>Chad Russell</li>
  <li>Alexandra Gillis</li>
  <li>david karapetyan</li>
  <li>Zi He Goh</li>
</ul>

]]></description>
      </item>
      <item>
         <title>Unsafe Zig is Safer Than Unsafe Rust</title>
         <pubDate>Wed, 24 Jan 2018 20:17:36 GMT</pubDate>

         <link>https://andrewkelley.me/post/unsafe-zig-safer-than-unsafe-rust.html</link>
         <guid>https://andrewkelley.me/post/unsafe-zig-safer-than-unsafe-rust.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
  Prism.languages['rust'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|const|var|pub|struct|enum|break|return|continue|if|else|match|while|for|true|false)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|str|static)\b/,
  });
  Prism.languages['llvm'] = Prism.languages.extend('clike', {
    'keyword': /\b(private|constant|declare|define|c|noreturn|nounwind|alloca|br|store|load|getelementptr|and|icmp|eq|zext|call|unreachable|add|target|datalayout|triple|unnamed_addr|align|inbounds|uwtable|sext|ret|ne|phi|global|to|zeroinitializer|shl|or|ult|switch)\b/,
    'number': null,
    'comment': /;.*/,
    'operator': /\w[\w\d_]*:/,
    'regex': /[@%]\.?\w+/,
    'property': /\b(i8|i1|i32|i64|i16|void|label)\*?\b/,
    'punctuation': null,
    'function': null,
  });
</script>
<style type="text/css">
.red {
  color: red;
}
.green {
  color: green;
}
</style>
<h1>Unsafe Zig is Safer than Unsafe Rust</h1>

<p>Consider the following Rust code:</p>

<pre><code class="language-rust">struct Foo {
    a: i32,
    b: i32,
}

fn main() {
    unsafe {
        let mut array: [u8; 1024] = [1; 1024];
        let foo = std::mem::transmute::&lt;&amp;mut u8, &amp;mut Foo&gt;(&amp;mut array[0]);
        foo.a += 1;
    }
}</code></pre>

<p>
This pattern is pretty common if you are <a href="https://gist.github.com/andrewrk/182ace5dee6c4025d8c4b0ca22ca98ca">interacting with Operating System APIs</a>. <a href="https://github.com/andrewrk/libsoundio/blob/fc96baf8130b52ba6fe928e5f629afd55ecc7321/src/alsa.c#L802">Another example</a>.
</p>

<p>Can you spot the problem with the code?</p>

<p>It's pretty subtle, but there is actually undefined behavior going on here. Let's take a look at the LLVM IR:</p>

<pre><code class="language-llvm">define internal void @_ZN4test4main17h916a53db53ad90a1E() unnamed_addr #0 {
start:
  %transmute_temp = alloca %Foo*
  %array = alloca [1024 x i8]
  %0 = getelementptr inbounds [1024 x i8], [1024 x i8]* %array, i32 0, i32 0
  call void @llvm.memset.p0i8.i64(i8* %0, i8 1, i64 1024, i32 1, i1 false)
  br label %bb1

bb1:                                              ; preds = %start
  %1 = getelementptr inbounds [1024 x i8], [1024 x i8]* %array, i64 0, i64 0
  %2 = bitcast %Foo** %transmute_temp to i8**
  store i8* %1, i8** %2, align 8
  %3 = load %Foo*, %Foo** %transmute_temp, !nonnull !1
  br label %bb2

bb2:                                              ; preds = %bb1
  %4 = getelementptr inbounds %Foo, %Foo* %3, i32 0, i32 0
  %5 = load i32, i32* %4
  %6 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %5, i32 1)
  %7 = extractvalue { i32, i1 } %6, 0
  %8 = extractvalue { i32, i1 } %6, 1
  %9 = call i1 @llvm.expect.i1(i1 %8, i1 false)
  br i1 %9, label %panic, label %bb3

bb3:                                              ; preds = %bb2
  %10 = getelementptr inbounds %Foo, %Foo* %3, i32 0, i32 0
  store i32 %7, i32* %10
  ret void

panic:                                            ; preds = %bb2
; call core::panicking::panic
  call void @_ZN4core9panicking5panic17hfecc01813e436969E({ %str_slice, [0 x i8], %str_slice, [0 x i8], i32, [0 x i8], i32, [0 x i8] }* noalias readonly dereferenceable(40) bitcast ({ %str_slice, %str_slice, i32, i32 }* @panic_loc.2 to { %str_slice, [0 x i8], %str_slice, [0 x i8], i32, [0 x i8], i32, [0 x i8] }*))
  unreachable
}</code></pre>

<p>
That's the code for the main function.
This is using rustc version 1.21.0.
Let's zoom in on the problematic parts:
</p>

<pre><code class="language-llvm">  %array = alloca [1024 x i8]
  ; loading foo.a in order to do + 1
  %5 = load i32, i32* %4
  ; storing the result of + 1 into foo.a
  store i32 %7, i32* %10
</code></pre>

<p>
None of these <a href="http://llvm.org/docs/LangRef.html#alloca-instruction">alloca</a>,
<a href="http://llvm.org/docs/LangRef.html#load-instruction">load</a>, or
<a href="http://llvm.org/docs/LangRef.html#store-instruction">store</a>
instructions have alignment attributes on them, so they use the ABI alignment of the respective types.
</p>

<p>
That means the i8 array gets alignment of 1, since the ABI alignment of i8 is 1,
and the load and store instructions get alignment 4, since the ABI alignment of i32 is 4.
This is undefined behavior:
</p>

<blockquote>
the store has undefined behavior if the alignment is not set to a value which is at least the size in bytes of the pointee
</blockquote>
<blockquote>
the load has undefined behavior if the alignment is not set to a value which is at least the size in bytes of the pointee
</blockquote>

<p>
It's a nasty bug, because besides being an easy mistake to make, on some architectures it will only
cause mysterious slowness, while on others it can cause an illegal instruction exception on the CPU.
Regardless, it's undefined behavior, and we are professionals, and so we do not accept undefined
behavior.
</p>
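<p>
For comparison, plain C has the same hazard, and the portable workaround there is to copy through a properly aligned local instead of casting the byte pointer. A minimal sketch (not from the post):
</p>
<pre><code class="language-c">#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

int main(void) {
    unsigned char array[1024];
    memset(array, 1, sizeof array);

    /* int32_t *foo = (int32_t *)&amp;array[0]; -- the same undefined behavior */

    int32_t a; /* a local is correctly aligned for int32_t */
    memcpy(&amp;a, &amp;array[0], sizeof a); /* four 0x01 bytes: 0x01010101 on any endianness */
    a += 1;
    memcpy(&amp;array[0], &amp;a, sizeof a);

    printf("0x%08x\n", (unsigned)a);
    return 0;
}</code></pre>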

<p>
Let's try writing the equivalent code in <a href="http://ziglang.org/">Zig</a>:
</p>

<pre><code class="language-zig">const Foo = struct {
    a: i32,
    b: i32,
};

pub fn main() {
    var array = []u8{1} ** 1024;
    const foo = @ptrCast(&amp;Foo, &amp;array[0]);
    foo.a += 1;
}</code></pre>

<p>And now we compile it:</p>

<pre>/home/andy/tmp/test.zig:8:17: error: cast increases pointer alignment
    const foo = @ptrCast(&amp;Foo, &amp;array[0]);
                ^
/home/andy/tmp/test.zig:8:38: note: '&amp;u8' has alignment 1
    const foo = @ptrCast(&amp;Foo, &amp;array[0]);
                                     ^
/home/andy/tmp/test.zig:8:27: note: '&amp;Foo' has alignment 4
    const foo = @ptrCast(&amp;Foo, &amp;array[0]);
                          ^</pre>

<p>
Zig knows not to compile this code. Here's how to fix it:
</p>

<pre><code class="language-diff">@@ -4,7 +4,7 @@
 };
 
 pub fn main() {
<span class="red">-    var array = []u8{1} ** 1024;</span>
<span class="green">+    var array align(@alignOf(Foo)) = []u8{1} ** 1024;</span>
     const foo = @ptrCast(&amp;Foo, &amp;array[0]);
     foo.a += 1;
 }</code></pre>

<p>
Now it compiles fine. Let's have a look at the LLVM IR:
</p>

<pre><code class="language-llvm">define internal fastcc void @main() unnamed_addr #0 !dbg !8911 {
Entry:
  %array = alloca [1024 x i8], align 4
  %foo = alloca %Foo*, align 8
  %0 = bitcast [1024 x i8]* %array to i8*, !dbg !8923
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* getelementptr inbounds ([1024 x i8], [1024 x i8]* @266, i32 0, i32 0), i64 1024, i32 4, i1 false), !dbg !8923
  call void @llvm.dbg.declare(metadata [1024 x i8]* %array, metadata !8914, metadata !529), !dbg !8923
  %1 = getelementptr inbounds [1024 x i8], [1024 x i8]* %array, i64 0, i64 0, !dbg !8924
  %2 = bitcast i8* %1 to %Foo*, !dbg !8925
  store %Foo* %2, %Foo** %foo, align 8, !dbg !8926
  call void @llvm.dbg.declare(metadata %Foo** %foo, metadata !8916, metadata !529), !dbg !8926
  %3 = load %Foo*, %Foo** %foo, align 8, !dbg !8927
  %4 = getelementptr inbounds %Foo, %Foo* %3, i32 0, i32 0, !dbg !8927
  %5 = load i32, i32* %4, align 4, !dbg !8927
  %6 = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %5, i32 1), !dbg !8929
  %7 = extractvalue { i32, i1 } %6, 0, !dbg !8929
  %8 = extractvalue { i32, i1 } %6, 1, !dbg !8929
  br i1 %8, label %OverflowFail, label %OverflowOk, !dbg !8929

OverflowFail:                                     ; preds = %Entry
  tail call fastcc void @panic(%"[]u8"* @88, %StackTrace* null), !dbg !8929
  unreachable, !dbg !8929

OverflowOk:                                       ; preds = %Entry
  store i32 %7, i32* %4, align 4, !dbg !8929
  ret void, !dbg !8930
}</code></pre>

<p>Zooming in on the relevant parts:</p>

<pre><code class="language-llvm">  %array = alloca [1024 x i8], align 4
  %5 = load i32, i32* %4, align 4, !dbg !8927
  store i32 %7, i32* %4, align 4, !dbg !8929
</code></pre>

<p>
Notice that the alloca, load, and store all agree on the alignment.
</p>

<p>
In Zig the problem of alignment is solved completely; the compiler catches all
possible alignment issues.
In the situation where you need to assert to the compiler that something is more aligned than Zig thinks it is, you can use <a href="http://ziglang.org/documentation/master/#alignCast">@alignCast</a>.
This inserts a cheap safety check in debug mode to make sure the alignment assertion is correct.
</p>
]]></description>
      </item>
      <item>
         <title>Zig: December 2017 in Review</title>
         <pubDate>Wed, 03 Jan 2018 07:23:11 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-december-2017-in-review.html</link>
         <guid>https://andrewkelley.me/post/zig-december-2017-in-review.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
</script>
<h1>Zig: December 2017 in Review</h1>

<p>
I figured since I <a href="https://github.com/users/andrewrk/sponsorship">ask people to donate monthly</a>, 
I will start giving a monthly progress report to provide accountability.
</p>

<p>So here's everything that happened in <a href="http://ziglang.org/">Zig land</a>
in December 2017:</p>

<h2>enum tag types</h2>

<p>You can now specify the tag type for an enum (<a href="https://github.com/zig-lang/zig/issues/305">#305</a>):</p>

<pre><code class="language-zig">const Small2 = enum (u2) {
    One,
    Two,
};</code></pre>

<p>If you specify the tag type for an enum, you can put it in a packed struct:</p>

<pre><code class="language-zig">const A = enum (u3) {
    One,
    Two,
    Three,
    Four,
    One2,
    Two2,
    Three2,
    Four2,
};

const B = enum (u3) {
    One3,
    Two3,
    Three3,
    Four3,
    One23,
    Two23,
    Three23,
    Four23,
};

const C = enum (u2) {
    One4,
    Two4,
    Three4,
    Four4,
};

const BitFieldOfEnums = packed struct {
    a: A,
    b: B,
    c: C,
};

const bit_field_1 = BitFieldOfEnums {
    .a = A.Two,
    .b = B.Three3,
    .c = C.Four4,
};</code></pre>
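<p>
The three tag types above add up to exactly one byte (u3 + u3 + u2 = 8 bits). As an illustration only, here is C code that packs the same three field values by hand; the helper and the bit layout (first field in the least significant bits) are assumptions for the sketch, not a statement of Zig's packed layout:
</p>
<pre><code class="language-c">#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

/* illustration only: pack a u3, a u3, and a u2 into one byte by hand,
   with the first field in the least significant bits (an assumed layout) */
enum A { A_One, A_Two, A_Three, A_Four, A_One2, A_Two2, A_Three2, A_Four2 };
enum B { B_One3, B_Two3, B_Three3, B_Four3, B_One23, B_Two23, B_Three23, B_Four23 };
enum C { C_One4, C_Two4, C_Three4, C_Four4 };

static uint8_t pack(enum A a, enum B b, enum C c) {
    /* a in bits 0-2, b in bits 3-5, c in bits 6-7 */
    return (uint8_t)((a &amp; 0x7) | ((b &amp; 0x7) &lt;&lt; 3) | ((c &amp; 0x3) &lt;&lt; 6));
}

int main(void) {
    /* same field values as bit_field_1 above */
    printf("0x%02x\n", pack(A_Two, B_Three3, C_Four4));
    return 0;
}</code></pre>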

<p>
You can no longer cast from an enum to an arbitrary integer. Instead you must
cast to the enum tag type and vice versa:
</p>

<pre><code class="language-zig">const Small2 = enum (u2) {
    One,
    Two,
};
test "casting enum to its tag type" {
    testCastEnumToTagType(Small2.Two);
}

fn testCastEnumToTagType(value: Small2) {
    assert(u2(value) == 1);
}</code></pre>

<h2>enum tag values</h2>
<p>
Now you can set the tag values of enums:
</p>

<pre><code class="language-zig">const MultipleChoice = enum(u32) {
    A = 20,
    B = 40,
    C = 60,
    D = 1000,
};</code></pre>

<h2>Complete enum and union overhaul</h2>
<p>
Related issue: <a href="https://github.com/zig-lang/zig/issues/618">#618</a>
</p>
<p>
Enums are now a simple mapping between a symbol and a number. They can
no longer contain payloads.
</p>

<p>
Unions have been upgraded and can now accept an enum as an argument:
</p>

<pre><code class="language-zig">const TheTag = enum {A, B, C};
const TheUnion = union(TheTag) { A: i32, B: i32, C: i32 };
test "union field access gives the enum values" {
    assert(TheUnion.A == TheTag.A);
    assert(TheUnion.B == TheTag.B);
    assert(TheUnion.C == TheTag.C);
}</code></pre>

<p>
If you want to auto-create an enum for a union, you can use the <code>enum</code>
keyword like this:
</p>

<pre><code class="language-zig">const TheUnion2 = union(enum) {
    Item1,
    Item2: i32,
};</code></pre>

<p>
You can switch on a union-enum just like you could previously with an
enum:
</p>

<pre><code class="language-zig">const SwitchProngWithVarEnum = union(enum) {
    One: i32,
    Two: f32,
    Meh: void,
};
fn switchProngWithVarFn(a: &amp;const SwitchProngWithVarEnum) {
    switch(*a) {
        SwitchProngWithVarEnum.One =&gt; |x| {
            assert(x == 13);
        },
        SwitchProngWithVarEnum.Two =&gt; |x| {
            assert(x == 13.0);
        },
        SwitchProngWithVarEnum.Meh =&gt; |x| {
            const v: void = x;
        },
    }
}</code></pre>

<p>
However, if you do not give an enum to a union, the tag value is not
visible to the programmer:
</p>

<pre><code class="language-zig">const Payload = union {
    A: i32,
    B: f64,
    C: bool,
};
export fn entry() {
    const a = Payload { .A = 1234 };
    foo(a);
}
fn foo(a: &amp;const Payload) {
    switch (*a) {
        Payload.A =&gt; {},
        else =&gt; unreachable,
    }
}</code></pre>

<pre><code>test.zig:11:13: error: switch on union which has no attached enum
    switch (*a) {
            ^
test.zig:1:17: note: consider 'union(enum)' here
const Payload = union {
                ^
test.zig:12:16: error: container 'Payload' has no member called 'A'
        Payload.A =&gt; {},
               ^</code></pre>

<p>
There is still debug safety though!
</p>

<pre><code class="language-zig">const Foo = union {
    float: f32,
    int: u32,
};

pub fn main() -&gt; %void {
    var f = Foo { .int = 42 };
    bar(&amp;f);
}

fn bar(f: &amp;Foo) {
    f.float = 12.34;
}
</code></pre>

<pre><code>access of inactive union field
lib/zig/std/special/panic.zig:12:35: 0x0000000000203674 in ??? (test)
        @import("std").debug.panic("{}", msg);
                                  ^
test.zig:12:6: 0x0000000000217bd7 in ??? (test)
    f.float = 12.34;
     ^
test.zig:8:8: 0x0000000000217b7c in ??? (test)
    bar(&amp;f);
       ^
Aborted</code></pre>

<p>
However, if you make an <code>extern union</code> to be compatible with C code,
there is no debug safety, just like a C union.
</p>
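<p>
For illustration (a minimal sketch in the Zig syntax of this era, not taken
from the post), an extern union is declared the same way, just with the
<code>extern</code> qualifier:
</p>

<pre><code class="language-zig">// No runtime safety check: reading `float` after writing `int` simply
// reinterprets the bytes, exactly like a C union would.
const CValue = extern union {
    int: i32,
    float: f32,
};</code></pre>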

<p>
Other tidbits:
</p>

<ul>
  <li> <code>@enumTagName</code> is renamed to <a href="http://ziglang.org/documentation/master/#builtin-tagName">@tagName</a></li>
 <li> <code>@EnumTagType</code> is renamed to <a href="http://ziglang.org/documentation/master/#builtin-TagType">@TagType</a>, and it works on both enums and
   union-enums.</li>
 <li>There is no longer an <code>EnumTag</code> type</li>
 <li>It is now an error for enums and unions to have 0 fields. However
   you can still have a struct with 0 fields.</li>
 <li> union values can implicitly cast to enum values when the enum type is
   the tag type of the union and the union value tag is comptime known to
   have a void field type. likewise, enum values can implicitly cast to
   union values. See <a href="https://github.com/zig-lang/zig/issues/642">#642</a>.</li>
</ul>

<pre><code class="language-zig">test "cast tag type of union to union" {
    var x: Value2 = Letter2.B;
    assert(Letter2(x) == Letter2.B);
}
const Letter2 = enum { A, B, C };
const Value2 = union(Letter2) { A: i32, B, C, };

test "implicit cast union to its tag type" {
    var x: Value2 = Letter2.B;
    assert(x == Letter2.B);
    giveMeLetterB(x);
}
fn giveMeLetterB(x: Letter2) {
    assert(x == Value2.B);
}</code></pre>

<h2>Update LLD fork to 5.0.1rc2</h2>

<p>
We have a fork of LLD in the zig project because of several upstream issues, all
of which I have filed bugs for:
</p>

<ul>
  <li>LLD calls exit after a successful link. Patch to fix sent upstream and accepted.
  </li>
  <li>LLD crashes on a linker script with empty sections. Fixed upstream.</li>
  <li>Buggy Mach-O code causes an assertion failure on a simple object. We have a
    hacky workaround for this bug in Zig's fork of LLD, but the workaround is
    not good enough to send upstream. See
    <a href="https://github.com/zig-lang/zig/issues/662">#662</a> for more details.
  </li>
  <li>Mach-O: bad ASM code generated for the __stub_helpers section.
    Thanks to Patricio V. for sending the fix upstream, which has been accepted.</li>
</ul>

  <p>
When LLVM 6.0.0 comes out, Zig will have to keep its fork because of the one
remaining issue, but we can drop all the other patches since they have been accepted
upstream.
  </p>

  <h2>Self-hosted compiler progress</h2>

<p>
The self-hosted compiler effort has begun.
</p>

<p>
So far we have a tokenizer, and an incomplete parser and formatter.
The code uses no recursion and therefore has compile-time
known stack space usage. See <a href="https://github.com/zig-lang/zig/issues/157">#157</a>.
</p>

<p>
The self-hosted compiler works on every supported platform, is built using
the zig build system, tested with <code>zig test</code>, links against LLVM,
and can import <strong>100%</strong> of the LLVM symbols from the LLVM
C-API .h files - <em>even the inline functions</em>.
</p>

<p>
There is one C++ file in Zig which uses the more powerful LLVM C++ API
(for example to create debug information) and exposes a C API. This file
is now shared between the C++ compiler and the self-hosted compiler.
In stage1, we create a static library containing this one file, and then use
that library in both the C++ compiler and the self-hosted compiler.
</p>

<h2>Higher level arg-parsing API</h2>

<p>
It's really a shame that Windows command line parsing requires you to
allocate memory. This means that a cross-platform API for command line
arguments has to handle possible allocation failure because of Windows,
even though on POSIX it can never fail. This led to a command line args
API like this:
</p>

<pre><code class="language-zig">pub fn main() -&gt; %void {
    var arg_it = os.args();
    // skip my own exe name
    _ = arg_it.skip();
    while (arg_it.next(allocator)) |err_or_arg| {
        const arg = %return err_or_arg;
        defer allocator.free(arg);
        // use the arg...
    }
}</code></pre>

<p>
Yikes, a bit cumbersome. I added a higher level API. Now you can call
<code>std.os.argsAlloc</code> and get a <code>%[]const []u8</code>, and you just have to call
<code>std.os.argsFree</code> when you're done with it.
</p>

<pre><code class="language-zig">pub fn main() -&gt; %void {
    const allocator = std.heap.c_allocator;

    const args = %return os.argsAlloc(allocator);
    defer os.argsFree(allocator, args);

    var arg_i: usize = 1;
    while (arg_i &lt; args.len) : (arg_i += 1) {
        const arg = args[arg_i];
        // do something with arg...
    }
}</code></pre>

<p>Better! Single point of failure.</p>

<p>
For now this uses the other API under the hood, but it could be
reimplemented with the same API to do a single allocation.
</p>

<p>
I added a new kind of test to make sure command line argument parsing works.
</p>

<h2>Automatic C-to-Zig translation</h2>

<ul>
  <li> Translation now understands enum tag types.</li>
  <li> I refactored the C macro parsing, and made it understand pointer casting.
   Now some kinds of int-to-ptr code in .h files for embedded programming
   just work:</li>
</ul>

<pre><code class="language-c">#define NRF_GPIO ((NRF_GPIO_Type *) NRF_GPIO_BASE)</code></pre>

<p>Zig now understands this C macro.</p>
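<p>
The translated output isn't shown in the post, but conceptually the macro becomes a
comptime pointer cast along these lines. This is only a hypothetical sketch in the
Zig syntax of the time; <code>NRF_GPIO_Type</code> and <code>NRF_GPIO_BASE</code>
come from the C header:
</p>

<pre><code class="language-zig">// Hypothetical sketch of the translation: the object-like macro becomes
// a constant pointer produced by casting the integer base address.
const NRF_GPIO = @intToPtr(&amp;volatile NRF_GPIO_Type, NRF_GPIO_BASE);</code></pre>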

   <h2>std.mem</h2>

   <ul>
     <li>add <code>aligned</code> functions to Allocator interface
</li>
<li><code>mem.Allocator</code> initializes bytes to undefined. This does nothing in ReleaseFast
  mode. In Debug and ReleaseSafe modes, it initializes bytes to <code>0xaa</code> which helps catch
   memory errors.
   </li>
   <li>add <code>mem.FixedBufferAllocator</code>
 </li>
   </ul>
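<p>
As a sketch of how <code>mem.FixedBufferAllocator</code> is used (patterned on the
fuzz test later in this post, using the syntax of the time; the buffer size and
allocation size are arbitrary):
</p>

<pre><code class="language-zig">var fixed_buffer_mem: [100 * 1024]u8 = undefined;

fn example() {
    // All allocations come out of the fixed buffer; no heap required.
    var fixed_allocator = mem.FixedBufferAllocator.init(fixed_buffer_mem[0..]);
    const bytes = %%fixed_allocator.allocator.alloc(u8, 100);
    defer fixed_allocator.allocator.free(bytes);
}</code></pre>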

 <h2>std.os.ChildProcess</h2>

<p>
I added <code>std.os.ChildProcess.exec</code> for when you want to spawn a child process, wait for it
to complete, and then capture the standard output into a buffer.
</p>

<pre><code class="language-zig">pub fn exec(self: &amp;Builder, argv: []const []const u8) -&gt; []u8 {
    const max_output_size = 100 * 1024;
    const result = os.ChildProcess.exec(self.allocator, argv, null, null, max_output_size) %% |err| {
        std.debug.panic("Unable to spawn {}: {}", argv[0], @errorName(err));
    };
    switch (result.term) {
        os.ChildProcess.Term.Exited =&gt; |code| {
            if (code != 0) {
                warn("The following command exited with error code {}:\n", code);
                printCmd(null, argv);
                warn("stderr:{}\n", result.stderr);
                std.debug.panic("command failed");
            }
            return result.stdout;
        },
        else =&gt; {
            warn("The following command terminated unexpectedly:\n");
            printCmd(null, argv);
            warn("stderr:{}\n", result.stderr);
            std.debug.panic("command failed");
        },
    }
}</code></pre>

<h2>std.sort</h2>

<p>
<a href="https://github.com/Hejsil">Hejsil</a> <a href="https://github.com/zig-lang/zig/issues/657">pointed out</a>
that the quicksort implementation in the standard library failed a simple test case.
</p>

<p>
There was another problem with the implementation of sort in the standard library, 
which is that it used <code>O(n)</code> stack space via recursion. This is fundamentally
insecure, especially if you consider that the length of an array you might want to sort could be
user input. It prevents <a href="https://github.com/zig-lang/zig/issues/157">#157</a>
from working as well.
</p>

<p>
I had a look at
<a href="https://en.wikipedia.org/wiki/Sorting_algorithm#Comparison_of_algorithms">Wikipedia's Comparison of Sorting Algorithms</a> and only one sorting algorithm checked all the boxes:
</p>

<ul>
  <li>Best case <code>O(n)</code> complexity (adaptive sort)</li>
  <li>Average case <code>O(n * log(n))</code> complexity</li>
  <li>Worst case <code>O(n * log(n))</code> complexity</li>
  <li><code>O(1)</code> memory</li>
  <li>Stable sort</li>
</ul>

<p>
And that algorithm is <a href="https://en.wikipedia.org/wiki/Block_sort">Block sort</a>.
</p>

<p>
I found a
<a href="https://github.com/BonzaiThePenguin/WikiSort/blob/master/WikiSort.c">high quality implementation of block sort in C</a>,
which is released into the public domain.
</p>

<p>
I ported the code from C to Zig, integrated it into the standard library, and it passed all tests first try. Amazing.
</p>

<p>
Surely, I thought, there must be some edge case. So I created a simple fuzz tester:
</p>

<pre><code class="language-zig">test "sort fuzz testing" {
    var rng = std.rand.Rand.init(0x12345678);
    const test_case_count = 10;
    var i: usize = 0;
    while (i &lt; test_case_count) : (i += 1) {
        fuzzTest(&amp;rng);
    }
}

var fixed_buffer_mem: [100 * 1024]u8 = undefined;

fn fuzzTest(rng: &amp;std.rand.Rand) {
    const array_size = rng.range(usize, 0, 1000);
    var fixed_allocator = mem.FixedBufferAllocator.init(fixed_buffer_mem[0..]);
    var array = %%fixed_allocator.allocator.alloc(IdAndValue, array_size);
    // populate with random data
    for (array) |*item, index| {
        item.id = index;
        item.value = rng.range(i32, 0, 100);
    }
    sort(IdAndValue, array, cmpByValue);

    var index: usize = 1;
    while (index &lt; array.len) : (index += 1) {
        if (array[index].value == array[index - 1].value) {
            assert(array[index].id &gt; array[index - 1].id);
        } else {
            assert(array[index].value &gt; array[index - 1].value);
        }
    }
}</code></pre>

<p>
This test passed as well. And so I think this problem is solved.
</p>

<h2>@export</h2>

<p>
There is now an <a href="http://ziglang.org/documentation/master/#builtin-export">@export</a> builtin function which can be used in a comptime block
to conditionally export a function:
</p>

<pre><code class="language-zig">const builtin = @import("builtin");

comptime {
    const strong_linkage = builtin.GlobalLinkage.Strong;
    if (builtin.link_libc) {
        @export("main", main, strong_linkage);
    } else if (builtin.os == builtin.Os.windows) {
        @export("WinMainCRTStartup", WinMainCRTStartup, strong_linkage);
    } else {
        @export("_start", _start, strong_linkage);
    }
}</code></pre>

<p>
It can also be used to create aliases:
</p>


<pre><code class="language-zig">const builtin = @import("builtin");
const is_test = builtin.is_test;

comptime {
    const linkage = if (is_test) builtin.GlobalLinkage.Internal else builtin.GlobalLinkage.Weak;
    const strong_linkage = if (is_test) builtin.GlobalLinkage.Internal else builtin.GlobalLinkage.Strong;

    @export("__letf2", @import("comparetf2.zig").__letf2, linkage);
    @export("__getf2", @import("comparetf2.zig").__getf2, linkage);

    if (!is_test) {
        // only create these aliases when not testing
        @export("__cmptf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__eqtf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__lttf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__netf2", @import("comparetf2.zig").__letf2, linkage);
        @export("__gttf2", @import("comparetf2.zig").__getf2, linkage);
    }
}</code></pre>

<p>
Previous export syntax is still allowed. See <a href="https://github.com/zig-lang/zig/issues/462">#462</a> and <a href="https://github.com/zig-lang/zig/issues/420">#420</a>.
</p>

<h2>Labeled loops, blocks, break, and continue, and R.I.P. goto</h2>

<p>
We used to have labels and goto like this:
</p>

<pre><code class="language-zig">export fn entry() {
    label:
    goto label;
}</code></pre>

<p>
Now this does not work, because goto is gone.
</p>

<pre><code>test.zig:2:10: error: expected token ';', found ':'
    label:
         ^</code></pre>

<p>
There are a few reasons to use goto, but all of the use cases are better served
with other zig control flow features:
</p>

<ul>
  <li>cleanup pattern. Use <code>defer</code> and <code>%defer</code> instead.</li>
  <li>goto backward</li>
  <li>goto forward</li>
</ul>

<h3>goto backward</h3>

<pre><code class="language-zig">export fn entry() {
    start_over:

    while (some_condition) {
        // do something...
        goto start_over;
    }
}</code></pre>

<p>
Instead, use a loop!
</p>

<pre><code class="language-zig">export fn entry() {
    outer: while (true) {

        while (some_condition) {
            // do something...
            continue :outer;
        }

        break;
    }
}</code></pre>

<h3>goto forward</h3>

<pre><code class="language-zig">pub fn findSection(elf: &amp;Elf, name: []const u8) -&gt; %?&amp;SectionHeader {
    var file_stream = io.FileInStream.init(elf.in_file);
    const in = &amp;file_stream.stream;

    section_loop: for (elf.section_headers) |*elf_section| {
        if (elf_section.sh_type == SHT_NULL) continue;

        const name_offset = elf.string_section.offset + elf_section.name;
        %return elf.in_file.seekTo(name_offset);

        for (name) |expected_c| {
            const target_c = %return in.readByte();
            if (target_c == 0 or expected_c != target_c) goto next_section;
        }

        {
            const null_byte = %return in.readByte();
            if (null_byte == 0) return elf_section;
        }
next_section:
    }

    return null;
}</code></pre>

<p>
Looks like the use case is jumping to the next iteration of an outer loop:
</p>

<pre><code class="language-zig">pub fn findSection(elf: &amp;Elf, name: []const u8) -&gt; %?&amp;SectionHeader {
    var file_stream = io.FileInStream.init(elf.in_file);
    const in = &amp;file_stream.stream;

    section_loop: for (elf.section_headers) |*elf_section| {
        if (elf_section.sh_type == SHT_NULL) continue;

        const name_offset = elf.string_section.offset + elf_section.name;
        %return elf.in_file.seekTo(name_offset);

        for (name) |expected_c| {
            const target_c = %return in.readByte();
            if (target_c == 0 or expected_c != target_c) continue :section_loop;
        }

        {
            const null_byte = %return in.readByte();
            if (null_byte == 0) return elf_section;
        }
    }

    return null;
}</code></pre>

<p>
You can also break out of arbitrary blocks:
</p>

<pre><code class="language-zig">export fn entry() {
    outer: {

        while (some_condition) {
            // do something...
            break :outer;
        }
    }
}</code></pre>

<p>
This can be used to return a value from a block in the same way you
can return a value from a function:
</p>

<pre><code class="language-zig">export fn entry() {
    const value = init: {
        for (slice) |item| {
            if (item &gt; 100)
                break :init item;
        }
        break :init 0;
    };
}</code></pre>

<p>
Omitting a semicolon no longer causes the value to be returned by the block.
Instead you must use explicit block labels to return a value from a block.
I'm considering a keyword such as <code>result</code> which defaults to the
current block.
</p>

<p>
Removal of goto caused a regression in C-to-Zig translation: switch statements
can no longer be translated. However this code will be resurrected
soon using labeled loops and labeled break instead of goto.
</p>

<p>
See <a href="https://github.com/zig-lang/zig/issues/346">#346</a>, <a href="https://github.com/zig-lang/zig/issues/630">#630</a>, and <a href="https://github.com/zig-lang/zig/issues/629">#629</a>.
</p>

<h2>New IR pass iteration strategy</h2>

<p>
Before:
</p>

<ul>
  <li> IR basic blocks are in arbitrary order</li>
    <li> when doing an IR pass, when a block is encountered, code
    must look at all the instructions in the old basic block,
    determine what blocks are referenced, and queue up those
    old basic blocks first.
    </li>
    <li> This had a bug:</li>
</ul>

<pre><code class="language-zig">while (cond) {
    if (false) { }
    break;
}</code></pre>

<p>
Pretty crazy right? Something as simple as this would crash the compiler.
</p>
<p>
Now:
</p>

<ul>
  <li>IR basic blocks are required to be in an order that guarantees
    they will be referenced by a branch before any instructions
    within are referenced. IR pass 1 is updated to meet this constraint.</li>
  <li>When doing an IR pass, we iterate over old basic blocks
    in the order they appear. Blocks which have not been
    referenced are discarded.</li>
  <li>After the pass is complete, we must iterate again to look
    for old basic blocks which now point to incomplete new
    basic blocks, due to comptime code generation.</li>
  <li>This last part can probably be optimized - most of the time
    we don't need to iterate over the basic block again.</li>
</ul>

    <p>
This improvement deletes a lot of messy code:
    </p>

<pre> 5 files changed, 288 insertions(+), 1243 deletions(-) </pre>

<p>
And it also fixes comptime branches not being respected sometimes:
</p>

<pre><code class="language-zig">export fn entry() {
    while (false) {
        @compileError("bad");
    }
}</code></pre>

<p>Before, this would cause a compile error. Now the while loop respects the
implicitly compile-time-known condition.
</p>

<p>
See <a href="https://github.com/zig-lang/zig/issues/667">#667</a>.
</p>

<h2>Bug Fixes</h2>

<ul>
  <li> fix const and volatile qualifiers being dropped sometimes.
   In the expression <code>&amp;const a.b</code>, the const (and/or volatile)
   qualifiers would be incorrectly dropped. See <a href="https://github.com/zig-lang/zig/issues/655">#655</a>.
  </li>
  <li> fix compiler crash in a nullable if after an if in a switch
 prong of a switch with 2 prongs in an else. See <a href="https://github.com/zig-lang/zig/issues/656">#656</a>.
  </li>
  <li>fix assert when wrapping zero bit type in nullable. See <a href="https://github.com/zig-lang/zig/issues/659">#659</a>.
  </li>
  <li>fix crash when implicitly casting array of len 0 to slice. See <a href="https://github.com/zig-lang/zig/issues/660">#660</a>.
  </li>
  <li>fix endianness of sub-byte integer fields in packed structs. In the future
 packed structs will require specifying endianness. See <a href="https://github.com/zig-lang/zig/issues/307">#307</a>.
  </li>
  <li>fix <code>std.os.path.resolve</code> when the drive is missing.
  </li>
  <li>fix automatically C-translated functions not having debug information.
  </li>
  <li>fix crash when passing union enum with sub-byte field to const slice parameter.
 See <a href="https://github.com/zig-lang/zig/issues/664">#664</a>.
 </li>
</ul>
    
   <h2>Miscellaneous changes</h2>

<ul>
  <li>Rename <code>builtin.is_big_endian</code> to <code>builtin.endian</code>. This is in preparation for
   having endianness be a pointer property, which is related to packed structs.
   See <a href="https://github.com/zig-lang/zig/issues/307">#307</a>.
  </li>
  <li>Tested Zig with LLVM debug mode and fixed some bugs that were causing LLVM
   assertions.
  </li>
  <li>Add
    <a href="http://ziglang.org/documentation/master/#builtin-noInlineCall">@noInlineCall</a>.
    See <a href="https://github.com/zig-lang/zig/issues/640">#640</a>.
    This fixes a crash in <code>--release-safe</code> and <code>--release-fast</code> modes
    where the optimizer inlines everything into <code>_start</code> and
   clobbers the command line argument data.
   If we were able to verify that the user's code never reads
   command line args, we could leave off this "no inline"
   attribute. This might call for a patch to LLVM. It seems like inlining
   into a naked function should correctly bump the stack pointer.
  </li>
  <li>add <code>i29</code> and <code>u29</code> primitive types. <code>u29</code> is the type of alignment,
   so it makes sense for it to be a primitive.
   Probably in the future we'll make any <code>i</code> or <code>u</code> followed by
   digits into a primitive.
  </li>
  <li>add implicit cast from enum tag type of union to const ptr to the union. closes <a href="https://github.com/zig-lang/zig/issues/654">#654</a>
  </li>
  <li>ELF stack traces support <code>DW_AT_ranges</code>, so sometimes when you would see "???"
   you now get a useful stack trace instead.
   </li>
   <li>add <code>std.sort.min</code> and <code>std.sort.max</code> functions
 </li>
 <li><code>std.fmt.bufPrint</code> returns a possible <code>error.BufferTooSmall</code> instead of 
   asserting that the buffer is large enough.
   </li>
   <li>Remove unnecessary inline calls in <code>std.math</code>.
 </li>
 <li><code>zig build</code> now has a <code>--search-prefix</code> option. Any number of search prefixes can be
   specified.
   </li>
   <li>add some utf8 parsing utilities to the standard library.
 </li>
</ul>

 <h2>Thank you contributors!</h2>

 <ul>
   <li><strong>MIURA Masahiro</strong> fixed the color of compiler messages for light-themed terminals.
     (See <a href="https://github.com/zig-lang/zig/issues/644">#644</a>)</li>
   <li><strong>Peter Rönnquist</strong> added format for floating point numbers.
     <code>{.x}</code> where <code>x</code> is the number of decimals. (See <a href="https://github.com/zig-lang/zig/issues/668">#668</a>)</li>
 </ul>

   <h2>Thank you financial supporters!</h2>

<p>
Special thanks to those who <a href="https://github.com/users/andrewrk/sponsorship">donate monthly</a>:
</p>

<ul>
<li>Lauren Chavis</li>
<li>Andrea Orru</li>
<li>Adrian Sinclair</li>
<li>David Joseph</li>
<li>jeff kelley</li>
<li>Hasan Abdul-Rahman</li>
<li>Wesley Kelley</li>
<li>Jordan Torbiak</li>
<li>Richard Ohnemus</li>
<li>Martin Schwaighofer</li>
<li>Matthew </li>
<li>Mirek Rusin</li>
<li>Brendon Scheinman</li>
<li>Pyry Kontio</li>
<li>Thomas Ballinger</li>
<li>Peter Ronnquist</li>
<li>Robert Paul Herman</li>
<li>Audun Wilhelmsen</li>
<li>Marko Mikulicic</li>
<li>Anthony J. Benik</li>
<li>Caius </li>
<li>Tyler Philbrick</li>
<li>Jeremy Larkin</li>
<li>Rasmus Rønn Nielsen</li>
</ul>
]]></description>
      </item>
      <item>
         <title>A Better Way to Implement Bit Fields</title>
         <pubDate>Fri, 17 Feb 2017 00:43:53 GMT</pubDate>

         <link>https://andrewkelley.me/post/a-better-way-to-implement-bit-fields.html</link>
         <guid>https://andrewkelley.me/post/a-better-way-to-implement-bit-fields.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|packed|volatile|export|pub|noalias|inline|struct|enum|goto|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|i13|u1|u2|u3|u4|u5|u7|u12|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
</script>
<h1>A Better Way to Implement Bit-Fields</h1>
<p>
One of the main use cases for the <a href="https://ziglang.org/">Zig Programming Language</a>
is operating system development and embedded development. So, what better way to make sure the
language is suitable than to work on a project in this field?
</p>
<p>
This is why I am creating a
<a href="https://github.com/andrewrk/clashos">4-player arcade game that runs directly on the Raspberry Pi 3 hardware</a>.
The project has only just begun, but already it is revealing important fixes and features in Zig,
and I am updating the compiler to incorporate these things as I work.
</p>
<p>
The next thing I'm working on is adding a USB controller driver, so that I can use a gamepad to
test the game. I noticed when looking at
<a href="https://github.com/Chadderz121/csud/blob/e13b9355d043a9cdd384b335060f1bc0416df61e/include/hcd/dwc/designware20.h#L164">some reference code</a>
that there are large structs full of bit-fields, and these turn out to be extremely convenient:
</p>
<pre>
<code class="language-c">extern volatile struct CoreGlobalRegs {
	volatile struct {
		volatile bool sesreqscs : 1;
		volatile bool sesreq : 1;
		volatile bool vbvalidoven:1;
		volatile bool vbvalidovval:1;
		volatile bool avalidoven:1;
		volatile bool avalidovval:1;
		volatile bool bvalidoven:1;
		volatile bool bvalidovval:1;
		volatile bool hstnegscs:1;
		volatile bool hnpreq:1;
		volatile bool HostSetHnpEnable : 1;
		volatile bool devhnpen:1;
		volatile unsigned _reserved12_15:4;
		volatile bool conidsts:1;
		volatile unsigned dbnctime:1;
		volatile bool ASessionValid : 1;
		volatile bool BSessionValid : 1;
		volatile unsigned OtgVersion : 1;
		volatile unsigned _reserved21:1;
		volatile unsigned multvalidbc:5;
		volatile bool chirpen:1;
		volatile unsigned _reserved28_31:4;
	} __attribute__ ((__packed__)) OtgControl; // +0x0
	volatile struct {
		volatile unsigned _reserved0_1 : 2; // @0
		volatile bool SessionEndDetected : 1; // @2
		volatile unsigned _reserved3_7 : 5; // @3
		volatile bool SessionRequestSuccessStatusChange : 1; // @8
		volatile bool HostNegotiationSuccessStatusChange : 1; // @9
		volatile unsigned _reserved10_16 : 7; // @10
		volatile bool HostNegotiationDetected : 1; // @17
		volatile bool ADeviceTimeoutChange : 1; // @18
		volatile bool DebounceDone : 1; // @19
		volatile unsigned _reserved20_31 : 12; // @20
	} __attribute__ ((__packed__)) OtgInterrupt; // +0x4
// ...</code>
</pre>
<p>
I'll stop there, but this struct goes on for pages and pages. You can see that in C,
bit-fields have special syntax. The fields have a type like normal, and then an extra
colon and a number of bits.
</p>
<p>
Bit-fields in C have acquired a poor reputation in the programming community, for a few reasons:
</p>
<ul>
  <li><a href="https://www.flamingspork.com/blog/2014/11/14/c-bitfields-considered-harmful/">Poor performance due to loading more bytes than necessary</a></li>
  <li><a href="https://gcc.gnu.org/onlinedocs/gcc/Structures-unions-enumerations-and-bit-fields-implementation.html">Largely implementation-defined behavior</a></li>
  <li><a href="http://stackoverflow.com/a/4240989/432">Platform dependence</a></li>
  <li><a href="http://yarchive.net/comp/linux/bitfields.html">Linus Torvalds: bitfields make it harder to work with combinations of flags</a></li>
</ul>
<p>
Many experienced developers have given up on the usefulness of bit-fields and prefer
to manually wrangle their binary data.
</p>
<p>
Some of these problems I can address in Zig, by loading the minimal number of bytes necessary,
defining how bit-fields are laid out in memory, and ensuring that bit-fields work the same
or at least predictably on all platforms. Some of the problems are inherently tricky, such
as the question of how to deal with endianness when a bit-field has more than 8 bits and
crosses a byte boundary.
</p>
<p>
With these things in mind, I set out to implement bit-fields in Zig, and
now I am pleased to announce that work is complete. Zig has bit-fields, and it required
no syntax additions.
</p>
<p>
Zig takes advantage of the fact that types are first-class values at compile-time. There is a
built-in function which returns an integer type with the specified signedness and bit count,
and it can be used to get access to uncommon integer types like this:
</p>
<pre>
<code class="language-zig">const i13 = @intType(true, 13);</code>
</pre>
<p>
These uncommon integer types work like normal integer types. Arithmetic, casting, and overflow
are generalized to work with any integer type.
</p>
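<p>
For example (a small sketch, not from the original post; the function name is
made up for illustration, and the style follows the other snippets below):
</p>

<pre><code class="language-zig">const i13 = @intType(true, 13);

fn addThirteenBit(a: i13, b: i13) i13 {
    // Arithmetic and overflow detection work on i13 just as on i32.
    return a + b;
}</code></pre>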
<p>
These integer types are one component to the way bit-fields are implemented in Zig.
The other component takes advantage of the fact that Zig has 3 different
kinds of <code>struct</code> layouts:
</p>
<ul>
  <li>Default - compiler may re-arrange fields and insert padding</li>
  <li><code>extern</code> - compatible with the target environment C ABI</li>
  <li><code>packed</code> - fields in exact order specified, no padding</li>
</ul>
<p>
In a <code>packed</code> struct, the programmer is directly in charge of the memory layout
of the struct. Fields with integer types take up exactly as many bits as the integer type specifies.
If a field greater than 8 bits is byte-aligned, it is represented in memory with
the endianness of the host. If the field is not byte-aligned, it is represented in memory in
big-endian. Boolean values are represented as 1 bit in packed structs.
</p>
<p>
So that's it. To make a bit-field, you have a packed struct with fields that are
integers with the bit sizes you want.
</p>
<p>
For illustration, here is the above code translated into Zig:
</p>
<pre>
<code class="language-zig">const u1 = @intType(false, 1);
const u12 = @intType(false, 12);

const CoreGlobalRegs = packed struct {
    OtgControl: packed struct {
        sesreqscs: bool,
        sesreq: bool,
        vbvalidoven: bool,
        vbvalidovval: bool,
        avalidoven: bool,
        avalidovval: bool,
        bvalidoven: bool,
        bvalidovval: bool,
        hstnegscs: bool,
        hnpreq: bool,
        HostSetHnpEnable: bool,
        devhnpen: bool,
        _reserved12_15: u4,
        conidsts: bool,
        dbnctime: u1,
        ASessionValid: bool,
        BSessionValid: bool,
        OtgVersion: u1,
        _reserved21: u1,
        multvalidbc: u5,
        chirpen: bool,
        _reserved28_31: u4,
    }, // +0x0
    OtgInterrupt: packed struct {
        _reserved0_1: u2, // @0
        SessionEndDetected: bool, // @2
        _reserved3_7: u5, // @3
        SessionRequestSuccessStatusChange: bool, // @8
        HostNegotiationSuccessStatusChange: bool, // @9
        _reserved10_16: u7, // @10
        HostNegotiationDetected: bool, // @17
        ADeviceTimeoutChange: bool, // @18
        DebounceDone: bool, // @19
        _reserved20_31: u12, // @20
    }, // +0x4
// ...</code>
</pre>
<p>
By the way it's nice to know that the USB protocol has a bit to indicate "a valid oven".
I wonder when that is used.
</p>
<p>
You may notice that we do a bit of setup before declaring the struct by creating
these integer types. Currently, only integer types of size <code>8</code>, <code>16</code>,
<code>32</code>, and <code>64</code> are provided by the compiler globally.
But perhaps more integer types can be provided as primitive types, or perhaps there will
be a standard library file to import and get more integer types.
</p>
<p>
You may also notice that the <code>volatile</code> keyword is gone. This is a different
issue, but Zig handles volatile at the pointer level. So instead of putting the keyword
on every field, you make sure the pointer to the whole struct is volatile, and then
any loads and stores done via the pointer become volatile, and any pointers derived from
the volatile pointer are also volatile.
</p>
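<p>
For comparison, C can express the same idea by qualifying the pointer rather than each field.
Here is a minimal sketch, with a made-up register struct and an ordinary variable standing in
for a memory-mapped register block:
</p>
<pre>
<code class="language-c">#include &lt;assert.h&gt;

struct Regs {
    unsigned ctrl;
    unsigned status;
};

int main(void) {
    struct Regs backing[1] = {{0, 0}};
    /* Qualify the pointer, not the fields: every load and store
       performed through regs is then a volatile access, as is any
       access through a pointer derived from it. */
    volatile struct Regs *regs = backing;
    (*regs).ctrl = 42;             /* volatile store */
    unsigned value = (*regs).ctrl; /* volatile load */
    assert(value == 42);
    return 0;
}</code>
</pre>
<p>
In real embedded code the pointer would instead be initialized with a cast from the
peripheral's address, but the volatile qualifier behaves the same way.
</p>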
<p>
As one point of comparison with C bit-fields, take a look at this code:
</p>
<pre>
<code class="language-c">struct Foo {
    unsigned a : 3;
    unsigned b : 3;
    unsigned c : 2;
} __attribute__ ((__packed__));

struct Foo foo = {1, 2, 3};

unsigned f(void) {
    unsigned *ptr = &amp;foo.b;
    return *ptr;
}</code>
</pre>
<p>
Here we try to take the address of a bit-field, and clang doesn't like that idea so much:
</p>
<pre><code>test.c:10:21: error: address of bit-field requested
    unsigned *ptr = &amp;foo.b;
                    ^~~~~~</code>
</pre>
<p>
Meanwhile, in Zig, this works just fine:
</p>
<pre>
<code class="language-zig">const Foo = packed struct {
    a: u3,
    b: u3,
    c: u2,
};

var foo = Foo {
    .a = 1,
    .b = 2,
    .c = 3,
};

fn f() u3 {
    const ptr = &amp;foo.b;
    return ptr.*;
}</code>
</pre>
<p>
The bit offset and length are carried in the type of the pointer, so you get an error if you
try to pass such a pointer to a function expecting a normal, byte-aligned value:
</p>
<pre>
<code class="language-zig">const BitField = packed struct {
    a: u3,
    b: u3,
    c: u2,
};

fn foo(bit_field: *const BitField) u3 {
    return bar(&amp;bit_field.b);
}

fn bar(x: *const u3) u3 {
    return x.*;
}</code>
</pre>
<p>
In this case the compiler catches the mistake:
</p>
<pre>
<code>./test.zig:8:26: error: expected type '*const u3', found '*:3:6 const u3'
    return bar(&amp;bit_field.b);
                         ^</code>
</pre>
<p>
There are bound to be some edge cases and bugs as I polish this feature, but I am
pleased that it turned out to integrate so cleanly into Zig's minimal design.
</p>
]]></description>
      </item>
      <item>
         <title>Zig: Already More Knowable Than C</title>
         <pubDate>Tue, 14 Feb 2017 04:49:59 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-already-more-knowable-than-c.html</link>
         <guid>https://andrewkelley.me/post/zig-already-more-knowable-than-c.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|goto|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
</script>
<h1>Zig: Already More Knowable Than C</h1>
<p>
There is a nifty article written back in 2015 that made its way onto Hacker News today:
<a href="http://kukuruku.co/hub/programming/i-do-not-know-c">I Do Not Know C</a>.
</p>
<p>
The author presents a delightful set of source code samples; the reader is asked to
determine whether each one is correct and, if so, to predict its output.
If you have not done the exercises, I encourage you to take
a moment to do so.
</p>
<p>
What follows are some of the examples translated into <a href="http://ziglang.org/">Zig</a> for comparison.
The numbers correspond to the numbers from the original post linked above,
but the C code is embedded here for comparison.
</p>
<h2>1. Declaring the same variable twice</h2>
<pre>
<code class="language-c">int i;
int i = 10;</code>
</pre>
<p>
In Zig:
</p>
<pre>
<code class="language-zig">var i = undefined;
var i = 10;</code>
</pre>
<p>
Output:
</p>
<pre>./test.zig:2:1: error: redefinition of 'i'
var i = 10;
^
./test.zig:1:1: note: previous definition is here
var i = undefined;
^</pre>

<h2>2. Null pointer</h2>
<pre>
<code class="language-c">extern void bar(void);
void foo(int *x) {
    int y = *x;  /* (1) */
    if (!x) {    /* (2) */
        return;  /* (3) */
    }
    bar();
}</code>
</pre>
<p>
When this example is translated to Zig, you have to explicitly decide whether the pointer
is nullable. For example, if you used a bare pointer in Zig:
</p>
<pre>
<code class="language-zig">extern fn bar();

export fn foo(x: &amp;c_int) {
    var y = *x;  // (1)
    if (x == null) {     // (2)
        return;    // (3)
    }
    bar();
}</code>
</pre>
<p>
Then it wouldn't even compile:
</p>
<pre>./test.zig:7:11: error: operator not allowed for type '?&amp;c_int'
    if (x == null) {     // (2)
          ^</pre>
<p>
I think this error message can be improved, but even so it prevents a possible
null-related bug here.
</p>
<p>
If the code author makes the pointer nullable, then the natural way to port the
C code to Zig would be:
</p>
<pre>
<code class="language-zig">extern fn bar();

export fn foo(x: ?&amp;c_int) {
    var y = x ?? return;  // (1), (2), (3)
    bar();
}</code>
</pre>
<p>
This compiles to:
</p>
<pre>
<code>0000000000000000 &lt;foo&gt;:
   0:	48 85 ff             	test   %rdi,%rdi
   3:	74 05                	je     a &lt;foo+0xa&gt;
   5:	e9 00 00 00 00       	jmpq   a &lt;foo+0xa&gt;
   a:	c3                   	retq   </code>
</pre>
<p>
This does a null check, returns if null, and otherwise calls bar.
</p>
<p>
Perhaps a more faithful way to port the C code to Zig would be this:
</p>
<pre>
<code class="language-zig">extern fn bar();

fn foo(x: ?&amp;c_int) {
    var y = ??x;  // (1)
    if (x == null) {     // (2)
        return;    // (3)
    }
    bar();
}

pub fn main() -&gt; %void {
    foo(null);
}</code>
</pre>
<p>
The <code>??</code> operator unwraps the nullable value. It asserts
that the value is not null, and returns the value. If the value is null
then the behavior is undefined, just like in C.
</p>
<p>
However, in Zig, undefined behavior causes a crash in debug mode:
</p>
<pre>
<code>$ ./test
attempt to unwrap null
/home/andy/dev/zig/build/lib/zig/std/special/zigrt.zig:16:35: 0x0000000000203395 in ??? (test)
        @import("std").debug.panic("{}", message_ptr[0..message_len]);
                                  ^
/home/andy/tmp/test.zig:4:13: 0x000000000020a61d in ??? (test)
    var y = ??x;  // (1)
            ^
/home/andy/tmp/test.zig:12:8: 0x00000000002046fd in ??? (test)
    foo(null);
       ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:60:21: 0x0000000000203697 in ??? (test)
    return root.main();
                    ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:47:13: 0x0000000000203420 in ??? (test)
    callMain(argc, argv, envp) %% std.os.posix.exit(1);
            ^
/home/andy/dev/zig/build/lib/zig/std/special/bootstrap.zig:34:25: 0x0000000000203290 in ??? (test)
    posixCallMainAndExit()
                        ^
Aborted</code>
</pre>
<p>
This is a half-finished traceback implementation (lots of TODO items to
complete before the
<a href="https://github.com/andrewrk/zig/milestone/1">0.1.0 milestone</a>),
but the point is that Zig detected the undefined behavior and aborted.
</p>
<p>
In release mode, this example invokes undefined behavior just like in C.
To avoid this, programmers are expected to choose one of these options:
</p>
<ul>
  <li>Test code sufficiently in debug mode to catch undefined behavior abuse.</li>
  <li>Utilize the safe-release option which includes the runtime undefined behavior
  safety checks.</li>
</ul>
<h2>5. strlen</h2>
<pre>
<code class="language-c">int my_strlen(const char *x) {
    int res = 0;
    while(*x) {
        res++;
        x++;
    }
    return res;
}</code>
</pre>
<p>
In Zig, pointers generally point to single objects, while <em>slices</em> are used to
refer to ranges of memory. So in practice you wouldn't need a strlen function;
you would use <code>some_bytes.len</code>. But we can port this code over anyway:
</p>
<pre>
<code class="language-zig">export fn my_strlen(x: &amp;const u8) -&gt; c_int {
    var res: c_int = 0;
    while (x[res] != 0) {
        res += 1;
    }
    return res;
}</code>
</pre>
<p>
Here we must use pointer indexing because Zig does not support direct pointer
arithmetic.
</p>
<p>However, this port has a bug: array indices in Zig must have type <code>usize</code>,
while <code>res</code> is a <code>c_int</code>. The compiler catches this problem:</p>
<pre>
<code>./test.zig:3:14: error: expected type 'usize', found 'c_int'
    while (x[res] != 0) {
             ^</code>
</pre>
<h2>6. Print string of bytes backwards</h2>
<pre>
<code class="language-c">#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
int main() {
    const char *str = "hello";
    size_t length = strlen(str);
    size_t i;
    for(i = length - 1; i &gt;= 0; i--) {
        putchar(str[i]);
    }
    putchar('\n');
    return 0;
}</code>
</pre>
<p>
Ported to Zig:
</p>
<pre>
<code class="language-zig">const c = @cImport({
    @cInclude("stdio.h");
    @cInclude("string.h");
});

export fn main(argc: c_int, argv: &amp;&amp;u8) -&gt; c_int {
    const str = c"hello";
    const length: c.size_t = c.strlen(str);
    var i: c.size_t = length - 1;
    while (i &gt;= 0) : (i -= 1) {
        _ = c.putchar(str[i]);
    }
    _ = c.putchar('\n');
    return 0;
}</code>
</pre>
<p>It compiles fine, but since <code>i</code> is unsigned, <code>i -= 1</code> eventually
wraps below zero, and the debug-mode safety check catches it at run time:</p>
<pre>
<code>integer overflow
test.zig:10:25: 0x000000000020346d in ??? (test)
    while (i &gt;= 0) : (i -= 1) {
                        ^
Aborted</code>
</pre>
<h2>8. Weirdo syntax</h2>
<pre>
<code class="language-c">#include &lt;stdio.h&gt;
int main() {
    int array[] = { 0, 1, 2 };
    printf("%d %d %d\n", 10, (5, array[1, 2]), 10);
}</code>
</pre>
<p>
There is no way to express this code in Zig. Good riddance.
</p>
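<p>
For readers puzzled by that C snippet: the comma operator is doing the work. It evaluates
its left operand, discards the result, and yields the right operand, so
<code>array[1, 2]</code> is just <code>array[2]</code>. A small illustration in plain C:
</p>
<pre>
<code class="language-c">#include &lt;assert.h&gt;

int main(void) {
    int array[] = { 0, 1, 2 };
    /* Inside the brackets, "1, 2" is the comma operator: evaluate 1,
       discard it, yield 2.  So array[1, 2] is array[2]. */
    assert(array[1, 2] == 2);
    /* Likewise (5, array[1, 2]) evaluates 5, discards it, and yields
       array[2], so the original printf prints "10 2 10". */
    assert((5, array[1, 2]) == 2);
    return 0;
}</code>
</pre>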

<h2>9. Unsigned overflow</h2>
<pre>
<code class="language-c">unsigned int add(unsigned int a, unsigned int b) {
    return a + b;
}</code>
</pre>
<p>In Zig:</p>
<pre>
<code class="language-zig">const io = @import("std").io;

export fn add(a: c_uint, b: c_uint) -&gt; c_uint {
    return a + b;
}

pub fn main() -&gt; %void {
    %%io.stdout.printf("{}\n", add(@maxValue(c_uint), 1));
}</code>
</pre>
<p>Output:</p>
<pre>
<code>$ ./test
integer overflow
test.zig:4:14: 0x00000000002032b4 in ??? (test)
    return a + b;
             ^
test.zig:8:35: 0x0000000000204747 in ??? (test)
    %%io.stdout.printf("{}\n", add(@maxValue(c_uint), 1));
                                  ^
lib/zig/std/special/bootstrap.zig:60:21: 0x00000000002036d7 in ??? (test)
    return root.main();
                    ^
lib/zig/std/special/bootstrap.zig:47:13: 0x0000000000203460 in ??? (test)
    callMain(argc, argv, envp) %% std.os.posix.exit(1);
            ^
lib/zig/std/special/bootstrap.zig:34:25: 0x00000000002032d0 in ??? (test)
    posixCallMainAndExit()
                        ^
Aborted</code>
</pre>
<p>
The <code>+</code> operator asserts that there will be no overflow.
If you want twos complement wraparound behavior, that is possible
with the <code>+%</code> operator instead:
</p>
<pre>
<code class="language-zig">export fn add(a: c_uint, b: c_uint) -&gt; c_uint {
    return a +% b;
}</code>
</pre>
<p>
Now the output is:
</p>
<pre>
<code>$ ./test 
0</code>
</pre>

<h2>10. Signed overflow</h2>
<pre>
<code class="language-c">int add(int a, int b) {
    return a + b;
}</code>
</pre>
<p>In C signed and unsigned integer overflow work differently. In Zig,
they work the same. <code>+</code> asserts that no overflow occurs,
and <code>+%</code> performs twos complement wraparound behavior.</p>
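<p>
The contrast with C is worth spelling out: in C, unsigned arithmetic is defined to wrap
around modulo a power of two, while signed overflow is undefined behavior. A short C sketch
of the defined, unsigned half:
</p>
<pre>
<code class="language-c">#include &lt;assert.h&gt;
#include &lt;limits.h&gt;

int main(void) {
    /* Unsigned overflow is well-defined in C: it wraps around. */
    unsigned a = UINT_MAX;
    assert(a + 1u == 0u);

    /* Signed overflow (e.g. INT_MAX + 1) would be undefined behavior,
       which is why this program does not attempt it. */
    return 0;
}</code>
</pre>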

<h2>11. Negation overflow</h2>
<pre>
<code class="language-c">int neg(int a) {
    return -a;
}</code>
</pre>
<p>By now you can probably predict how this works in Zig.</p>
<pre>
<code class="language-zig">const io = @import("std").io;

export fn neg(a: c_int) -&gt; c_int {
    return -a;
}

pub fn main() -&gt; %void {
    %%io.stdout.printf("{}\n", neg(@minValue(c_int)));
}</code>
</pre>
<p>Output:</p>
<pre>
<code>$ ./test
integer overflow
test.zig:4:12: 0x00000000002032b0 in ??? (test)
    return -a;
           ^
test.zig:8:35: 0x0000000000204742 in ??? (test)
    %%io.stdout.printf("{}\n", neg(@minValue(c_int)));
                                  ^
lib/zig/std/special/bootstrap.zig:60:21: 0x00000000002036d7 in ??? (test)
    return root.main();
                    ^
lib/zig/std/special/bootstrap.zig:47:13: 0x0000000000203460 in ??? (test)
    callMain(argc, argv, envp) %% std.os.posix.exit(1);
            ^
lib/zig/std/special/bootstrap.zig:34:25: 0x00000000002032d0 in ??? (test)
    posixCallMainAndExit()
                        ^
Aborted</code>
</pre>
<p>The <code>-%</code> wraparound variant of the negation operator works
here too:
</p>
<pre>
<code class="language-zig">export fn neg(a: c_int) -&gt; c_int {
    return -%a;
}</code>
</pre>
<p>Output:</p>
<pre>
<code>$ ./test 
-2147483648</code>
</pre>

<h2>12. Division overflow</h2>
<pre>
<code class="language-c">int div(int a, int b) {
    assert(b != 0);
    return a / b;
}</code>
</pre>
<p>Different operation, same deal.</p>
<pre>
<code class="language-zig">const io = @import("std").io;

fn div(a: i32, b: i32) -&gt; i32 {
    return a / b;
}

pub fn main() -&gt; %void {
    %%io.stdout.printf("{}\n", div(@minValue(i32), -1));
}</code>
</pre>
<p>First of all, Zig doesn't let us do this operation because
it's unclear whether we want floored division or truncated division:</p>
<pre><code>test.zig:4:14: error: division with 'i32' and 'i32': signed integers must use @divTrunc, @divFloor, or @divExact
    return a / b;
             ^</code></pre>
<p>Some languages, such as C, use truncated division, while others, such as Python, use floored division.
Zig makes the programmer choose explicitly.</p>
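<p>
To make the difference concrete, here is a small C sketch. C's <code>/</code> truncates
toward zero; the floored result is recovered by hand for this one case:
</p>
<pre>
<code class="language-c">#include &lt;assert.h&gt;

int main(void) {
    /* Truncated division rounds toward zero. */
    assert(-7 / 2 == -3);
    assert(-7 % 2 == -1);

    /* Floored division rounds toward negative infinity, so -7
       divided by 2 gives -4.  With a positive divisor, adjust the
       truncated quotient down when the remainder is negative: */
    int q = -7 / 2;
    if (-7 % 2 == -1) {
        q -= 1;
    }
    assert(q == -4);
    return 0;
}</code>
</pre>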
<pre><code class="language-zig">fn div(a: i32, b: i32) -&gt; i32 {
    return @divTrunc(a, b);
}</code></pre>
<p>Output:</p>
<pre><code>$ ./test
integer overflow
test.zig:4:12: 0x000000000020a683 in ??? (test)
    return @divTrunc(a, b);
           ^
test.zig:8:35: 0x0000000000204707 in ??? (test)
    %%io.stdout.printf("{}\n", div(@minValue(i32), -1));
                                  ^
lib/zig/std/special/bootstrap.zig:60:21: 0x0000000000203697 in ??? (test)
    return root.main();
                    ^
lib/zig/std/special/bootstrap.zig:47:13: 0x0000000000203420 in ??? (test)
    callMain(argc, argv, envp) %% std.os.posix.exit(1);
            ^
lib/zig/std/special/bootstrap.zig:34:25: 0x0000000000203290 in ??? (test)
    posixCallMainAndExit()
                        ^
Aborted</code></pre>
<p>
Notably, if you execute the division operation at compile time,
the overflow becomes a compile error (same for the other operations):
</p>
<pre>
<code class="language-zig">    %%io.stdout.printf("{}\n", comptime div(@minValue(i32), -1));</code>
</pre>
<p>Output:</p>
<pre>
<code>./test.zig:4:14: error: operation caused overflow
    return a / b;
             ^
./test.zig:8:44: note: called from here
    %%io.stdout.printf("{}\n", comptime div(@minValue(i32), -1));
                                           ^</code>
</pre>

<h2>Conclusion</h2>
<p>
Zig is on track to boot out C as the simple, straightforward way to write system code.
</p>
]]></description>
      </item>
      <item>
         <title>Zig Programming Language Blurs the Line Between Compile-Time and Run-Time</title>
         <pubDate>Mon, 30 Jan 2017 08:19:59 GMT</pubDate>

         <link>https://andrewkelley.me/post/zig-programming-language-blurs-line-compile-time-run-time.html</link>
         <guid>https://andrewkelley.me/post/zig-programming-language-blurs-line-compile-time-run-time.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|goto|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|comptime|or|and)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
  Prism.languages['rust'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|const|var|pub|struct|enum|break|return|continue|if|else|match|while|for|true|false)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|str|static)\b/,
  });
</script>
<h1>Zig Programming Language Blurs the Line Between Compile-Time and Run-Time</h1>
<p>
Zig places importance on the concept of whether an expression is known at compile-time.
There are a few different places this concept is used, and these building blocks are used
to keep the language small, readable, and powerful.
</p>
<ul>
  <li><a href="#introducing-compile-time-concept">Introducing the Compile-Time Concept</a>
    <ul>
      <li><a href="#compile-time-parameters">Compile-time parameters</a></li>
      <li><a href="#compile-time-variables">Compile-time variables</a></li>
      <li><a href="#compile-time-expressions">Compile-time expressions</a></li>
    </ul>
  </li>
  <li><a href="#generic-data-structures">Generic Data Structures</a></li>
  <li><a href="#case-study-printf">Case Study: printf in C, Rust, and Zig</a></li>
  <li><a href="#conclusion">Conclusion</a></li>
</ul>
<h3 id="introducing-compile-time-concept">Introducing the Compile-Time Concept</h3>
<h4 id="compile-time-parameters">Compile-Time Parameters</h4>
<p>
Compile-time parameters are how Zig implements generics. They provide compile-time duck typing
that works mostly the same way C++ template parameters do. Example:
</p>
<pre>
<code class="language-zig">fn max(comptime T: type, a: T, b: T) -&gt; T {
    if (a &gt; b) a else b
}
fn gimmeTheBiggerFloat(a: f32, b: f32) -&gt; f32 {
    max(f32, a, b)
}
fn gimmeTheBiggerInteger(a: u64, b: u64) -&gt; u64 {
    max(u64, a, b)
}</code>
</pre>
<p>
In Zig, types are first-class citizens. They can be assigned to variables, passed as parameters to functions,
and returned from functions. However, they can only be used in expressions which are known at <em>compile-time</em>,
which is why the parameter <code>T</code> in the above snippet must be marked with <code>comptime</code>.
</p>
<p>
A <code>comptime</code> parameter means that:
</p>
<ul>
  <li>At the callsite, the value must be known at compile-time, or it is a compile error.</li>
  <li>In the function definition, the value is known at compile-time.</li>
</ul>
<p>
For example, if we were to introduce another function to the above snippet:
</p>
<pre>
<code class="language-zig">fn max(comptime T: type, a: T, b: T) -&gt; T {
    if (a &gt; b) a else b
}
fn letsTryToPassARuntimeType(condition: bool) {
    const result = max(
        if (condition) f32 else u64,
        1234,
        5678);
}</code>
</pre>
<p>
Then we get this result from the compiler:
</p>
<pre>
<code>./test.zig:6:9: error: unable to evaluate constant expression
        if (condition) f32 else u64,
        ^</code>
</pre>
<p>
This is an error because the programmer attempted to pass a value only known at run-time
to a function which expects a value known at compile-time.
</p>
<p>
Another way to get an error is to pass a type that fails the type checker when the
function is analyzed. This is what it means to have <em>compile-time duck typing</em>.
</p>
<p>
For example:
</p>
<pre>
<code class="language-zig">fn max(comptime T: type, a: T, b: T) -&gt; T {
    if (a &gt; b) a else b
}
fn letsTryToCompareBools(a: bool, b: bool) -&gt; bool {
    max(bool, a, b)
}</code>
</pre>
<p>
The code produces this error message:
</p>
<pre>
<code>./test.zig:2:11: error: operator not allowed for type 'bool'
    if (a &gt; b) a else b
          ^
./test.zig:5:8: note: called from here
    max(bool, a, b)
       ^</code>
</pre>
<p>
On the flip side, inside the function definition with the <code>comptime</code> parameter, the
value is known at compile-time. This means that we actually could make this work for the bool type
if we wanted to:
</p>
<pre>
<code class="language-zig">fn max(comptime T: type, a: T, b: T) -&gt; T {
    if (T == bool) {
        return a or b;
    } else if (a &gt; b) {
        return a;
    } else {
        return b;
    }
}
fn letsTryToCompareBools(a: bool, b: bool) -&gt; bool {
    max(bool, a, b)
}</code>
</pre>
<p>
This works because Zig implicitly inlines <code>if</code> expressions when the condition
is known at compile-time, and the compiler guarantees that it will skip analysis of
the branch not taken.
</p>
<p>
This means that the actual function generated for <code>max</code> in this situation looks like
this:
</p>
<pre>
<code class="language-zig">fn max(a: bool, b: bool) -&gt; bool {
    return a or b;
}</code>
</pre>
<p>
All the code that dealt with compile-time known values is eliminated and we are left with only
the necessary run-time code to accomplish the task.
</p>
<p>
This works the same way for <code>switch</code> expressions - they are implicitly inlined
when the target expression is compile-time known.
</p>
<h4 id="compile-time-variables">Compile-Time Variables</h4>
<p>
In Zig, the programmer can label variables as <code>comptime</code>. This guarantees to the compiler
that every load and store of the variable is performed at compile-time. Any violation of this results in a
compile error.
</p>
<p>
This combined with the fact that we can <code>inline</code> loops allows us to write
a function which is partially evaluated at compile-time and partially at run-time.
</p>
<p>
For example:
</p>
<pre>
<code class="language-zig">const CmdFn = struct {
    name: []const u8,
    func: fn(i32) -&gt; i32,
};

const cmd_fns = []CmdFn{
    CmdFn {.name = "one", .func = one},
    CmdFn {.name = "two", .func = two},
    CmdFn {.name = "three", .func = three},
};
fn one(value: i32) -&gt; i32 { value + 1 }
fn two(value: i32) -&gt; i32 { value + 2 }
fn three(value: i32) -&gt; i32 { value + 3 }

fn performFn(comptime prefix_char: u8, start_value: i32) -&gt; i32 {
    var result: i32 = start_value;
    comptime var i = 0;
    inline while (i &lt; cmd_fns.len) : (i += 1) {
        if (cmd_fns[i].name[0] == prefix_char) {
            result = cmd_fns[i].func(result);
        }
    }
    return result;
}

fn testPerformFn() {
    @setFnTest(this);

    assert(performFn('t', 1) == 6);
    assert(performFn('o', 0) == 1);
    assert(performFn('w', 99) == 99);
}

fn assert(ok: bool) {
    if (!ok) unreachable;
}</code>
</pre>
<p>
This example is a bit contrived, because the compile-time evaluation component is unnecessary;
this code would work fine if it was all done at run-time. But it does end up generating
different code. In this example, the function <code>performFn</code> is generated three different times,
for the different values of <code>prefix_char</code> provided:
</p>
<pre>
<code class="language-zig">// From the line:
// assert(performFn('t', 1) == 6);
fn performFn(start_value: i32) -&gt; i32 {
    var result: i32 = start_value;
    result = two(result);
    result = three(result);
    return result;
}

// From the line:
// assert(performFn('o', 0) == 1);
fn performFn(start_value: i32) -&gt; i32 {
    var result: i32 = start_value;
    result = one(result);
    return result;
}

// From the line:
// assert(performFn('w', 99) == 99);
fn performFn(start_value: i32) -&gt; i32 {
    var result: i32 = start_value;
    return result;
}</code>
</pre>
<p>
Note that this happens even in a debug build; in a release build these generated functions still
pass through rigorous LLVM optimizations. The important thing to note, however, is not that this
is a way to write more optimized code, but that it is a way to make sure that what <em>should</em> happen
at compile-time, <em>does</em> happen at compile-time. This catches more errors and as demonstrated
later in this article, allows expressiveness that in other languages requires using macros,
generated code, or a preprocessor to accomplish.
</p>
<h4 id="compile-time-expressions">Compile-Time Expressions</h4>
<p>
In Zig, it matters whether a given expression is known at compile-time or run-time. A programmer can
use a <code>comptime</code> expression to guarantee that the expression will be evaluated at compile-time.
If this cannot be accomplished, the compiler will emit an error. For example:
</p>
<pre>
<code class="language-zig">extern fn exit() -&gt; unreachable;

fn foo() {
    comptime {
        exit();
    }
}</code>
</pre>
<pre>
<code>./test.zig:5:9: error: unable to evaluate constant expression
        exit();
        ^</code>
</pre>
<p>
It doesn't make sense that a program could call <code>exit()</code> (or any other external function)
at compile-time, so this is a compile error. However, a <code>comptime</code> expression does much
more than sometimes cause a compile error.
</p>
<p>
Within a <code>comptime</code> expression:
</p>
<ul>
  <li>All variables are <code>comptime</code> variables.</li>
  <li>All <code>if</code>, <code>while</code>, <code>for</code>, <code>switch</code>, and <code>goto</code>
    expressions are evaluated at compile-time, or emit a compile error if this is not possible.</li>
  <li>All function calls cause the compiler to interpret the function at compile-time, emitting a
    compile error if the function tries to do something that has global run-time side effects.</li>
</ul>
<p>
This means that a programmer can create a function which is called both at compile-time and run-time, with
no modification to the function required.
</p>
<p>
Let's look at an example:
</p>
<pre>
<code class="language-zig">fn fibonacci(index: u32) -&gt; u32 {
    if (index &lt; 2) return index;
    return fibonacci(index - 1) + fibonacci(index - 2);
}

fn testFibonacci() {
    @setFnTest(this);

    // test fibonacci at run-time
    assert(fibonacci(7) == 13);

    // test fibonacci at compile-time
    comptime {
        assert(fibonacci(7) == 13);
    }
}

fn assert(ok: bool) {
    if (!ok) unreachable;
}</code>
</pre>
<pre>
<code>$ zig test test.zig
Test 1/1 testFibonacci...OK</code>
</pre>
<p>
Imagine if we had forgotten the base case of the recursive function and tried to run the tests:
</p>
<pre>
<code class="language-zig">fn fibonacci(index: u32) -&gt; u32 {
    //if (index &lt; 2) return index;
    return fibonacci(index - 1) + fibonacci(index - 2);
}

fn testFibonacci() {
    @setFnTest(this);

    comptime {
        assert(fibonacci(7) == 13);
    }
}

fn assert(ok: bool) {
    if (!ok) unreachable;
}</code>
</pre>
<pre>
<code>$ zig test test.zig
./test.zig:3:28: error: operation caused overflow
    return fibonacci(index - 1) + fibonacci(index - 2);
                           ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:14:25: note: called from here
        assert(fibonacci(7) == 13);
                        ^</code>
</pre>
<p>
The compiler produces an error which is a stack trace from trying to evaluate the
function at compile-time.
</p>
<p>
Luckily, we used an unsigned integer, and so when we tried to subtract 1 from 0, it triggered
undefined behavior, which is always a compile error if the compiler knows it happened.
But what would have happened if we used a signed integer?
</p>
<pre>
<code class="language-zig">fn fibonacci(index: i32) -&gt; i32 {
    //if (index &lt; 2) return index;
    return fibonacci(index - 1) + fibonacci(index - 2);
}

fn testFibonacci() {
    @setFnTest(this);

    comptime {
        assert(fibonacci(7) == 13);
    }
}

fn assert(ok: bool) {
    if (!ok) unreachable;
}</code>
</pre>
<pre>
<code>./test.zig:3:21: error: evaluation exceeded 1000 backwards branches
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^
./test.zig:3:21: note: called from here
    return fibonacci(index - 1) + fibonacci(index - 2);
                    ^</code>
</pre>
<p>
The compiler noticed that evaluating this function at compile-time exceeded its
limit on backward branches, so it emitted a compile error and gave up. If the programmer wants to increase
the budget for compile-time computation, they can use a built-in function called
<code>@setEvalBranchQuota</code> to change the default number 1000 to something else.
</p>
<p>
What if we fix the base case, but put the wrong value in the <code>assert</code> line?
</p>
<pre>
<code class="language-zig">comptime {
    assert(fibonacci(7) == 99999);
}</code>
</pre>
<pre>
<code>./test.zig:15:14: error: unable to evaluate constant expression
    if (!ok) unreachable;
             ^
./test.zig:10:15: note: called from here
        assert(fibonacci(7) == 99999);
              ^</code>
</pre>
<p>
What happened is Zig started interpreting the <code>assert</code> function with the
parameter <code>ok</code> set to <code>false</code>. When the interpreter hit
<code>unreachable</code> it emitted a compile error, because reaching unreachable
code is undefined behavior, and undefined behavior causes a compile error if it is detected
at compile-time.
</p>

<p>
In the global scope (outside of any function), all expressions are implicitly
<code>comptime</code> expressions. This means that we can use functions to
initialize complex static data. For example:
</p>
<pre>
<code class="language-zig">const first_25_primes = firstNPrimes(25);
const sum_of_first_25_primes = sum(first_25_primes);

fn firstNPrimes(comptime n: usize) -&gt; [n]i32 {
    var prime_list: [n]i32 = undefined;
    var next_index: usize = 0;
    var test_number: i32 = 2;
    while (next_index &lt; prime_list.len) : (test_number += 1) {
        var test_prime_index: usize = 0;
        var is_prime = true;
        while (test_prime_index &lt; next_index) : (test_prime_index += 1) {
            if (test_number % prime_list[test_prime_index] == 0) {
                is_prime = false;
                break;
            }
        }
        if (is_prime) {
            prime_list[next_index] = test_number;
            next_index += 1;
        }
    }
    return prime_list;
}

fn sum(numbers: []i32) -&gt; i32 {
    var result: i32 = 0;
    for (numbers) |x| {
        result += x;
    }
    return result;
}</code>
</pre>
<p>
When we compile this program, Zig generates the constants
with the answer pre-computed. Here are the lines from the generated LLVM IR:
</p>
<pre>
<code>@0 = internal unnamed_addr constant [25 x i32] [i32 2, i32 3, i32 5, i32 7, i32 11, i32 13, i32 17, i32 19, i32 23, i32 29, i32 31, i32 37, i32 41, i32 43, i32 47, i32 53, i32 59, i32 61, i32 67, i32 71, i32 73, i32 79, i32 83, i32 89, i32 97]
@1 = internal unnamed_addr constant i32 1060</code>
</pre>
<p>
Note that we did not have to do anything special with the syntax of these functions. For example,
we could call the <code>sum</code> function as is with a slice of numbers whose length and values were
only known at run-time.
</p>
<h3 id="generic-data-structures">Generic Data Structures</h3>
<p>
Zig uses these capabilities to implement generic data structures without introducing any
special-case syntax. If you followed along so far, you may already know how to create a
generic data structure.
</p>
<p>
Here is an example of a generic <code>List</code> data structure, that we will instantiate with
the type <code>i32</code>. Whereas in C++ or Rust we would refer to the instantiated type as
<code>List&lt;i32&gt;</code>, in Zig we refer to the type as <code>List(i32)</code>.
</p>
<pre>
<code class="language-zig">fn List(comptime T: type) -&gt; type {
    struct {
        items: []T,
        len: usize,
    }
}</code>
</pre>
<p>
That's it. It's a function that returns an anonymous <code>struct</code>. For the purposes of error messages
and debugging, Zig infers the name <code>"List(i32)"</code> from the name and parameters of the
function invoked to create the anonymous struct.
</p>
<p>
To keep the language small and uniform, all aggregate types in Zig are anonymous. To give a type
a name, we assign it to a constant:
</p>
<pre>
<code class="language-zig">const Node = struct {
    next: &amp;Node,
    name: []u8,
};</code>
</pre>
<p>
This works because all top level declarations are order-independent, and as long as there isn't
an actual infinite regression, values can refer to themselves, directly or indirectly. In this case,
<code>Node</code> refers to itself as a pointer, which is not actually an infinite regression, so
it works fine.
</p>
<h3 id="case-study-printf">Case Study: printf in C, Rust, and Zig</h3>
<p>
Putting all of this together, let's compare how <code>printf</code> works in C, Rust, and Zig.
</p>
<p>
Here's how <code>printf</code> works in C:
</p>
<pre>
<code class="language-c">#include &lt;stdio.h&gt;

static const int a_number = 1234;
static const char * a_string = "foobar";

int main(int argc, char **argv) {
    fprintf(stderr, "here is a string: '%s' here is a number: %d\n", a_string, a_number);
    return 0;
}</code>
</pre>
<pre>
<code>here is a string: 'foobar' here is a number: 1234</code>
</pre>
<p>
What happens here is the <code>printf</code> implementation iterates over the format string
at run-time, and when it encounters a format specifier such as <code>%d</code> it looks at
the next argument which is passed in an architecture-specific way, interprets the argument as
a type depending on the format specifier, and attempts to print it. If the types are incorrect
or not enough arguments are passed, undefined behavior occurs - it may crash, print garbage
data, or access invalid memory.
</p>
<p>
Luckily, the compiler defines an attribute that you can use like this:
</p>
<pre>
<code class="language-c">__attribute__ ((format (printf, x, y)));</code>
</pre>
<p>
Where x and y are the 1-based indexes of the argument parameters that correspond to the
format string and the first var args parameter, respectively.
</p>
<p>
This attribute adds type checking to the function it decorates, to prevent the above problems,
and the <code>printf</code> function from <code>stdio.h</code> has this attribute on it, so these
problems are solved.
</p>
<p>
But what if you want to invent your own format string syntax and have the compiler check
it for you?
</p>
<p>
You can't.
</p>
<p>
That's how it works in C. It is hard-coded into the compiler. If you wanted to write your own
format string printing code and have it checked by the compiler, you would have to use the
preprocessor or metaprogramming - generate C code as output from some other code.
</p>
<p>
Zig is a programming language which is intended to replace C. We can do better than this.
</p>
<p>
Here's the equivalent program in Zig:
</p>

<pre>
<code class="language-zig">const io = @import("std").io;

const a_number: i32 = 1234;
const a_string = "foobar";

pub fn main(args: [][]u8) -&gt; %void {
    %%io.stderr.printf("here is a string: '{}' here is a number: {}\n", a_string, a_number);
}</code>
</pre>
<pre>
<code>here is a string: 'foobar' here is a number: 1234</code>
</pre>

<p>
Let's crack open the implementation of this and see how it works:
</p>

<pre>
<code class="language-zig">/// Calls print and then flushes the buffer.
pub fn printf(self: &amp;OutStream, comptime format: []const u8, args: ...) -&gt; %void {
    const State = enum {
        Start,
        OpenBrace,
        CloseBrace,
    };

    comptime var start_index: usize = 0;
    comptime var state = State.Start;
    comptime var next_arg: usize = 0;

    inline for (format) |c, i| {
        switch (state) {
            State.Start =&gt; switch (c) {
                '{' =&gt; {
                    if (start_index &lt; i) %return self.write(format[start_index...i]);
                    state = State.OpenBrace;
                },
                '}' =&gt; {
                    if (start_index &lt; i) %return self.write(format[start_index...i]);
                    state = State.CloseBrace;
                },
                else =&gt; {},
            },
            State.OpenBrace =&gt; switch (c) {
                '{' =&gt; {
                    state = State.Start;
                    start_index = i;
                },
                '}' =&gt; {
                    %return self.printValue(args[next_arg]);
                    next_arg += 1;
                    state = State.Start;
                    start_index = i + 1;
                },
                else =&gt; @compileError("Unknown format character: " ++ c),
            },
            State.CloseBrace =&gt; switch (c) {
                '}' =&gt; {
                    state = State.Start;
                    start_index = i;
                },
                else =&gt; @compileError("Single '}' encountered in format string"),
            },
        }
    }
    comptime {
        if (args.len != next_arg) {
            @compileError("Unused arguments");
        }
        if (state != State.Start) {
            @compileError("Incomplete format string: " ++ format);
        }
    }
    if (start_index &lt; format.len) {
        %return self.write(format[start_index...format.len]);
    }
    %return self.flush();
}</code>
</pre>
<p>
This is a proof of concept implementation; it will gain more formatting capabilities before
Zig reaches its first release.
</p>
<p>
Note that this is not hard-coded into the Zig compiler; this is userland code in the standard library.
</p>
<p>
When this function is analyzed from our example code above, Zig partially evaluates the function
and emits a function that actually looks like this:
</p>
<pre>
<code class="language-zig">pub fn printf(self: &amp;OutStream, arg0: []const u8, arg1: i32) -&gt; %void {
    %return self.write("here is a string: '");
    %return self.printValue(arg0);
    %return self.write("' here is a number: ");
    %return self.printValue(arg1);
    %return self.write("\n");
    %return self.flush();
}</code>
</pre>
<p>
<code>printValue</code> is a function that takes a parameter of any type, and does different things depending
on the type:
</p>
<pre>
<code class="language-zig">pub fn printValue(self: &amp;OutStream, value: var) -&gt; %void {
    const T = @typeOf(value);
    if (@isInteger(T)) {
        return self.printInt(T, value);
    } else if (@isFloat(T)) {
        return self.printFloat(T, value);
    } else if (@canImplicitCast([]const u8, value)) {
        const casted_value = ([]const u8)(value);
        return self.write(casted_value);
    } else {
        @compileError("Unable to print type '" ++ @typeName(T) ++ "'");
    }
}</code>
</pre>
<p>
And now, what happens if we give too many arguments to <code>printf</code>?
</p>
<pre>
<code class="language-zig">%%io.stdout.printf("here is a string: '{}' here is a number: {}\n",
        a_string, a_number, a_number);</code>
</pre>
<pre>
<code>.../std/io.zig:147:17: error: Unused arguments
                @compileError("Unused arguments");
                ^
./test.zig:7:23: note: called from here
    %%io.stdout.printf("here is a string: '{}' here is a number: {}\n",
                      ^</code>
</pre>
<p>
Zig gives programmers the tools needed to protect themselves against their own mistakes.
</p>
<p>
Let's take a look at how <a href="https://www.rust-lang.org/en-US/">Rust</a> handles this
problem. Here's the equivalent program:
</p>
<pre>
<code class="language-rust">const A_NUMBER: i32 = 1234;
const A_STRING: &amp;'static str = "foobar";

fn main() {
    print!("here is a string: '{}' here is a number: {}\n",
        A_STRING, A_NUMBER);
}</code>
</pre>
<pre>
<code>here is a string: 'foobar' here is a number: 1234</code>
</pre>
<p>
<code>print!</code>, as evidenced by the exclamation point, is a macro. Here is the definition:
</p>
<pre>
<code class="language-rust">#[macro_export]
#[stable(feature = "rust1", since = "1.0.0")]
#[allow_internal_unstable]
macro_rules! print {
    ($($arg:tt)*) =&gt; ($crate::io::_print(format_args!($($arg)*)));
}
#[stable(feature = "rust1", since = "1.0.0")]
#[macro_export]
macro_rules! format_args { ($fmt:expr, $($args:tt)*) =&gt; ({
/* compiler built-in */
}) }</code>
</pre>
<p>
Rust accomplishes the syntax that one would want from a var args print implementation, but
it requires using a macro to do so.
</p>
<p>
Macros have some limitations. For example, in this case, if you move the format string to
a global variable, the Rust example can no longer compile:
</p>
<pre>
<code class="language-rust">const A_NUMBER: i32 = 1234;
const A_STRING: &amp;'static str = "foobar";
const FMT: &amp;'static str = "here is a string: '{}' here is a number: {}\n";

fn main() {
    print!(FMT, A_STRING, A_NUMBER);
}</code>
</pre>
<pre>
<code>error: format argument must be a string literal.
 --&gt; test.rs:6:12
  |
6 |     print!(FMT, A_STRING, A_NUMBER);
  |            ^^^</code>
</pre>
<p>
On the other hand, Zig doesn't care whether the format argument is a string literal,
only that it is a compile-time known value that is implicitly castable to a <code>[]const u8</code>:
</p>
<pre>
<code class="language-zig">const io = @import("std").io;

const a_number: i32 = 1234;
const a_string = "foobar";
const fmt = "here is a string: '{}' here is a number: {}\n";

pub fn main(args: [][]u8) -&gt; %void {
    %%io.stderr.printf(fmt, a_string, a_number);
}</code>
</pre>
<p>
This works fine.
</p>
<p>
A macro is a reasonable solution to this problem, but it comes at the cost of readability. From
<a href="https://doc.rust-lang.org/beta/book/macros.html">Rust's own documentation</a>:
</p>
<blockquote>
  <p>
The drawback is that macro-based code can be harder to understand, because fewer of the built-in rules apply. Like an ordinary function, a well-behaved macro can be used without understanding its implementation. However, it can be difficult to design a well-behaved macro! Additionally, compiler errors in macro code are harder to interpret, because they describe problems in the expanded code, not the source-level form that developers use.
  </p>
  <p>
These drawbacks make macros something of a "feature of last resort". That’s not to say that macros are bad; they are part of Rust because sometimes they’re needed for truly concise, well-abstracted code. Just keep this tradeoff in mind.
  </p>
</blockquote>
<p>
One of the goals of Zig is to avoid these drawbacks while still providing enough of the power that
macros provide in order to make them unnecessary.
</p>
<p>
There is another thing I noticed, and I hope someone from the Rust community can correct me if I'm wrong,
but it looks like Rust also special cased <code>format_args!</code> in the compiler by making it a built-in.
If my understanding is correct, this would make Zig stand out as the only language of the three mentioned here
which does not special case string formatting in the compiler and instead exposes enough power to accomplish this
task in userland.
</p>
<p>
But more importantly, it does so without introducing another language on top of Zig, such as
a macro language or a preprocessor language. It's Zig all the way down.
</p>

<h3 id="conclusion">Conclusion</h3>
<p>
Thank you for following along and checking out what I've been working on lately.
</p>
<p>
As always, I welcome discussion, criticism, and users. Please keep in mind that this is alpha software;
I am working toward a first beta release, but the project is not there yet.
</p>
]]></description>
      </item>
      <item>
         <title>Troubleshooting a Zig Regression with apitrace</title>
         <pubDate>Tue, 17 Jan 2017 23:25:58 GMT</pubDate>

         <link>https://andrewkelley.me/post/troubleshooting-zig-regression-apitrace.html</link>
         <guid>https://andrewkelley.me/post/troubleshooting-zig-regression-apitrace.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|import|cImport|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|goto|break|return|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|unreachable|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong)\b/,
  });
</script>
<h1>Troubleshooting a Zig Regression with apitrace</h1>
<p>
The past three months I have spent rewriting <a href="http://ziglang.org/">Zig</a> internals.
</p>
<p>
Previously, the compiler looked like this:
</p>
<p>
Source → Tokenize → Abstract Syntax Tree → Semantic Analysis → LLVM Codegen → Binary
</p>
<p>
Now the compiler looks like this:
</p>
<p>
Source → Tokenize → Abstract Syntax Tree → Intermediate Representation Code →
Evaluation and Analysis → Intermediate Representation Code → LLVM Codegen → Binary
</p>
<p>
This was a significant amount of work:
</p>
<pre>
92 files changed, 22307 insertions(+), 15357 deletions(-)
</pre>
<p>
It took a while to get all 361 tests passing before I could merge back into master.
</p>
<p>
Part of my testing process is making sure this
<a href="https://github.com/andrewrk/tetris">Tetris game</a>
continues to work. Here's a screenshot of it working, from master branch:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/troubleshooting-zig-regression-apitrace/tetris-working.png">
<p>
Unfortunately, after my changes to Zig, the game looked like this:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/troubleshooting-zig-regression-apitrace/tetris-not-working.png">
<p>
So, I ran both versions of the game with <a href="http://apitrace.github.io/">apitrace</a>.
This resulted in a handy diff:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/troubleshooting-zig-regression-apitrace/apitrace-diff.png">
<p>
Here, it looks like both programs are issuing the same OpenGL commands except for different values to
<code>glUniform4fv</code>. Aha! Let's go see what's going on there.
</p>
<p>
After investigating this, it turned out that the <code>glUniform4fv</code> was simply for the piece color
and since the game uses a random number for each piece, the two instances of the game started with different
pieces.
</p>
<p>
So, I made a small change to the Random Number Generator code...
</p>
<p>
Before:
</p>
<pre>
<code class="language-zig">fn getRandomSeed() -&gt; %u32 {
    var seed : u32 = undefined;
    const seed_bytes = (&amp;u8)(&amp;seed)[0...4];
    %return std.os.getRandomBytes(seed_bytes);
    return seed;
 }</code>
</pre>
<p>
After:
</p>
<pre>
<code class="language-zig">fn getRandomSeed() -&gt; %u32 {
    return 4;
 }</code>
</pre>
<p>
After
<a href="http://xkcd.com/221/">this change</a>,
the <code>glUniform4fv</code> commands were sending the same data. Therefore, the difference
<strong>must</strong> be in the "data blob" parameters sent in initialization.
</p>
<p>
This led me to scrutinize this code:
</p>
<pre>
<code class="language-zig">const rect_2d_vertexes = [][3]c.GLfloat {
    []c.GLfloat{0.0, 0.0, 0.0},
    []c.GLfloat{0.0, 1.0, 0.0},
    []c.GLfloat{1.0, 0.0, 0.0},
    []c.GLfloat{1.0, 1.0, 0.0},
};
c.glGenBuffers(1, &amp;sg.rect_2d_vertex_buffer);
c.glBindBuffer(c.GL_ARRAY_BUFFER, sg.rect_2d_vertex_buffer);
c.glBufferData(c.GL_ARRAY_BUFFER, 4 * 3 * @sizeOf(c.GLfloat), (&amp;c_void)(&amp;rect_2d_vertexes[0][0]), c.GL_STATIC_DRAW);</code>
</pre>
<p>
I discovered the problem was <code>&amp;rect_2d_vertexes[0][0]</code>.
The compiler noticed that <code>rect_2d_vertexes</code> was a compile-time constant and therefore
generated the 2D array data structure as static data. It therefore evaluated <code>&amp;rect_2d_vertexes[0][0]</code>
as a compile-time known expression as well.
</p>
<p>
The problem was that each element in <code>rect_2d_vertexes</code> referenced another array.
The compile-time constant generation code emitted an independent array for the inner arrays, whereas
we are expecting the pointer to point to a 2D array that contains all the data contiguously.
</p>
<p>
So I updated the data structure of constant arrays to refer to their parents, added a test case to cover the change,
and now the tetris game works again. Huzzah!
</p>
]]></description>
      </item>
      <item>
         <title>Introduction to the Zig Programming Language</title>
         <pubDate>Mon, 08 Feb 2016 16:07:35 GMT</pubDate>

         <link>https://andrewkelley.me/post/intro-to-zig.html</link>
         <guid>https://andrewkelley.me/post/intro-to-zig.html</guid>
         <description><![CDATA[<script>
  Prism.languages['zig'] = Prism.languages.extend('clike', {
    'keyword': /\b(fn|const|var|extern|volatile|export|pub|noalias|inline|struct|enum|goto|break|return|orelse|continue|asm|defer|if|else|switch|while|for|null|undefined|true|false|use|unreachable|try|catch)\b/,
    'property': /\b(bool|i8|u8|i16|u16|i32|u32|i64|u64|isize|usize|f32|f64|void|type|error|c_short|c_ushort|c_int|c_uint|c_long|c_ulong|c_longlong|c_ulonglong|anyerror|noreturn)\b/,
  });
</script>
<h1>Introduction to the Zig Programming Language</h1>
<p>
The past few months I took a break from working on
<a href="http://genesisdaw.org/">Genesis Digital Audio Workstation</a>
to work, instead, on creating a
<a href="https://ziglang.org/">new programming language</a>.
</p>
<p>
I am nothing if not ambitious, and my goal is to create a new programming
language that is <em>more pragmatic than C</em>. This is like trying to be
more evil than the devil himself.
</p>
<p>
So, in order, these are the priorities of Zig:
</p>
<ol>
  <li><strong>Pragmatic</strong>: At the end of the day, all that really matters is
    whether the language helped you do what you were trying to do better than any other
    language.
  </li>
  <li><strong>Optimal</strong>: The most natural way to write a program should result
    in top-of-the-line runtime performance, equivalent to or better than C. In places
    where performance is king, the optimal code should be clearly expressible.
  </li>
  <li><strong>Safe</strong>: Optimality may be sitting in the driver's seat, but
    safety is sitting in the passenger's seat, wearing its seatbelt, and asking nicely
    for the other passengers to do the same.
  </li>
  <li><strong>Readable</strong>: Zig prioritizes reading code over writing it.
    Avoid complicated syntax. Generally there should be a canonical way to do
    everything.
  </li>
</ol>
<h2 id="toc">Table of Contents</h2>
<ol>
  <li><a href="#toc">Table of Contents</a></li>
  <li><a href="#design-decisions">Design Decisions</a>
    <ol>
    <li><a href="#debug-release">Widely Diverging Debug and Release Builds</a></li>
    <li><a href="#c-abi">Complete C ABI Compatibility</a></li>
    <li><a href="#maybe-type">Optional Type Instead of Null Pointer</a></li>
    <li><a href="#errors">Errors</a></li>
    <li><a href="#stdlib">Alternate Standard Library</a></li>
    <li><a href="#preprocessor-alternatives">Alternatives to the Preprocessor</a></li>
    </ol>
  </li>
  <li><a href="#tetris">Milestone: Tetris Implemented in Zig</a></li>
  <li><a href="#resources">Resources</a></li>
</ol>
<h2 id="design-decisions">Design Decisions</h2>
<h3 id="debug-release">Widely Diverging Debug and Release Builds</h3>
<p>
  Zig has the concept of a <em>debug build</em> vs a <em>release build</em>.
  Here is a comparison of priorities for debug mode vs release mode:
</p>
<table>
  <tr>
    <th></th>
    <th style="width: 44%;">Debug Mode</th>
    <th>Release Mode</th>
  </tr>
  <tr>
    <th>Time Spent Compiling</th>
    <td>
      Code must compile fast. Use all manner of caching, shared objects,
      multithreading, whatever must be done in order to produce a binary
      as soon as possible.
    </td>
    <td>
      Making a release build could take orders of magnitude longer than
      a debug build and that is acceptable.
    </td>
  </tr>
  <tr>
    <th>Runtime Performance</th>
    <td>
      Could be an order of magnitude slower than a release build and that is
      acceptable.
    </td>
    <td>
      Optimal performance. Aggressive optimizations. Take the time needed
      to produce highly efficient code. No compromises here.
    </td>
  </tr>
  <tr>
    <th>Undefined Behavior</th>
    <td>
      What <em>would</em> be undefined behavior in a release build, is defined
      behavior in a debug build, and that is for the runtime to trap. That is,
      crash. This includes things like array bounds checking, integer overflow,
      reaching unreachable code. Not all undefined behavior can be caught, but
      a comfortably large amount can.
    </td>
    <td>
      Undefined behavior in release mode has unspecified consequences, and this
      lets the optimizer produce optimal code.
    </td>
  </tr>
</table>
<p>
  The build mode is available to the source code via the expression
  <code>@import("builtin").mode</code>.
</p>
<p>
Note: Since this blog post, Zig has gained <a href="https://ziglang.org/documentation/master/#Build-Mode">two more release modes</a>:
</p>
<ul>
  <li>Release Safe</li>
  <li>Release Small</li>
</ul>

<h3 id="c-abi">Complete C ABI Compatibility</h3>
<p>
Part of being pragmatic is recognizing C's existing success. Interop
with C is crucial. Zig embraces C like the mean older brother who you are a little
afraid of but you still want to like you and be your friend.
</p>
<p>
In Zig, functions look like this:
</p>
<pre><code class="language-zig">fn doSomething() {
    // ...
}</code></pre>
<p>
The compiler is free to inline this function, change its parameters,
and otherwise do whatever it wants, since this is an internal function.
However if you decide to export it:
</p>
<pre><code class="language-zig">export fn doSomething() {
    // ...
}</code></pre>
<p>
Now this function has the C ABI, and the name shows up in the symbol table
verbatim. Likewise, you can declare an external function prototype:
</p>
<pre><code class="language-zig">extern fn puts(s: [*]const u8) c_int;</code></pre>
<p>
In Zig, like in C, you typically do not create a "wrapper" or "bindings" to
a library, you just use it. But if you had to type out or generate all the
extern function prototypes, this would be a binding. That is why Zig has the ability
to parse .h files:
</p>
<pre><code class="language-zig">use @cImport({
    @cInclude("stdio.h");
});</code></pre>
<p>
This exposes all the symbols in stdio.h - including the <code>#define</code> statements -
to the zig program, and then you can call <code>puts</code> or <code>printf</code> just like
you would in C.
</p>
<p>
One of Zig's use cases is
<a href="http://tiehuis.github.io/iterative-replacement-of-c-with-zig">slowly transitioning a large C project to Zig</a>.
Zig can produce simple .o files for linking against other .o files, and it can
also generate .h files based on what you export. So you could write part of your
application in C and part in Zig, link all the .o files together and everything
plays nicely with each other.
</p>
<h3 id="maybe-type">Optional Type Instead of Null Pointer</h3>
<p>
One area that Zig provides safety without compromising efficiency or
readability is with the optional type.
</p>
<p>
The question mark symbolizes the optional type. You can convert a type to an optional
type by putting a question mark in front of it, like this:
</p>
<pre><code class="language-zig">// normal integer
const normal_int: i32 = 1234;

// optional integer
const optional_int: ?i32 = 5678;</code></pre>
<p>
Now the variable <code>optional_int</code> could be an <code>i32</code>, or <code>null</code>.
</p>
<p>
Instead of integers, let's talk about pointers. Null references are the source of many runtime
exceptions, and even stand accused of being
<a href="https://www.lucidchart.com/techblog/2015/08/31/the-worst-mistake-of-computer-science/">the worst mistake of computer science</a>.
</p>
<p>Zig does not have them.</p>
<p>
Instead, you can use an optional pointer. This secretly compiles down to a normal pointer,
since we know we can use 0 as the null value for the optional type. But the compiler
can check your work and make sure you don't assign null to something that can't be null.
</p>
<p>
Typically the downside of not having null is that it makes the code more verbose to
write. But, let's compare some equivalent C code and Zig code.
</p>
<p>
Task: call malloc, if the result is null, return null.
</p>
<p>C code</p>
<pre><code class="language-c">// malloc prototype included for reference
void *malloc(size_t size);

struct Foo *do_a_thing(void) {
    char *ptr = malloc(1234);
    if (!ptr) return NULL;
    // ...
}</code></pre>
<p>Zig code</p>
<pre><code class="language-zig">// malloc prototype included for reference
extern fn malloc(size: size_t) ?[*]u8;

fn doAThing() ?*Foo {
    const ptr = malloc(1234) orelse return null;
    // ...
}</code></pre>
<p>
  Here, Zig is at least as convenient as C, if not more so. And, the type of "ptr"
  is <code>[*]u8</code> <em>not</em> <code>?[*]u8</code>. The <code>orelse</code> operator
  unwrapped the maybe type and therefore <code>ptr</code> is guaranteed to be non-null everywhere
  it is used in the function.
</p>
<p>
  The other form of checking against NULL you might see looks like this:
</p>
<pre><code class="language-c">void do_a_thing(struct Foo *foo) {
    // do some stuff

    if (foo) {
        do_something_with_foo(foo);
    }

    // do some stuff
}</code></pre>
<p>
  In Zig you can accomplish the same thing:
</p>
<pre><code class="language-zig">fn doAThing(optional_foo: ?*Foo) {
    // do some stuff

    if (optional_foo) |foo| {
      doSomethingWithFoo(foo);
    }

    // do some stuff
}</code></pre>
<p>
Once again, the notable thing here is that inside the if block,
<code>foo</code> is no longer an optional pointer, it is a pointer, which
cannot be null.
</p>
<p>
One benefit to this is that functions which take pointers as arguments can
be annotated with the "nonnull" attribute - <code>__attribute__((nonnull))</code> in
<a href="https://gcc.gnu.org/onlinedocs/gcc-4.0.0/gcc/Function-Attributes.html">GCC</a>.
The optimizer can sometimes make better decisions knowing that pointer arguments
cannot be null.
</p>
<p>
Note: when this blog post was written, Zig did not distinguish between
Single Item Pointers and Unknown Length Pointers. You can
<a href="https://ziglang.org/documentation/master/#Pointers">read about this in the documentation</a>.
</p>

<h3 id="errors">Errors</h3>
<p>
One of the distinguishing features of Zig is its exception handling strategy.
</p>
<p>
Zig introduces two primitive types:
</p>
<ul>
  <li>Error Sets</li>
  <li>Error Unions</li>
</ul>
<p>
An error set can be declared like this:
</p>
<pre><code class="language-zig">const FileOpenError = error {
  FileNotFound,
  OutOfMemory,
  UnexpectedToken,
};</code></pre>
<p>
An error set is a lot like an enum, except errors from different error sets
which share a name are defined to have the same numerical value. So each
error name has a globally unique integer associated with it. The integer value
0 is reserved.
</p>
<p>
You can refer to these error values with field access syntax, such as
<code>FileOpenError.FileNotFound</code>. There is syntactic sugar for creating an ad-hoc
error set and referring to one of its errors: <code>error.SomethingBroke</code>. This
is equivalent to <code>error{SomethingBroke}.SomethingBroke</code>.
</p>
<p>
In the same way that pointers cannot be null, an error set value is always an error.
</p>
<pre><code class="language-zig">const err = error.FileNotFound;</code></pre>
<p>
Most of the time you will not find yourself using an error set type. Instead,
likely you will be using the error union type. Error unions are created with
the binary operator <code>!</code>, with the error set on the left and any other
type on the right: <code>ErrorSet!OtherType</code>.
</p>
<p>
Here is a function to parse a string into a 64-bit integer:
</p>
<pre><code class="language-zig">const ParseError = error {
    InvalidChar,
    Overflow,
};

pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
    var x: u64 = 0;

    for (buf) |c| {
        const digit = charToDigit(c);

        if (digit &gt;= radix) {
            return error.InvalidChar;
        }

        // x *= radix
        if (@mulWithOverflow(u64, x, radix, &amp;x)) {
            return error.Overflow;
        }

        // x += digit
        if (@addWithOverflow(u64, x, digit, &amp;x)) {
            return error.Overflow;
        }
    }

    return x;
}</code></pre>
<p>
Notice the return type is <code>ParseError!u64</code>. This means that the function
either returns an unsigned 64 bit integer, or one of the <code>ParseError</code> errors.
</p>
<p>
Within the function definition, you can see some return statements that return
an error set value, and at the bottom a return statement that returns a <code>u64</code>.
Both types implicitly cast to <code>ParseError!u64</code>.
</p>
<p>
Note: this blog post was written before Zig had the concept of
<a href="https://ziglang.org/documentation/master/#Error-Set-Type">Error Sets</a> vs
<a href="https://ziglang.org/documentation/master/#The-Global-Error-Set">anyerror</a>, and
before Zig had <a href="https://ziglang.org/documentation/master/#Inferred-Error-Sets">Error Set Inference</a>.
Most functions in Zig can rely on error set inference, which would make the prototype of <code>parseU64</code> 
look like this:
</p>
<pre><code class="language-zig">pub fn parseU64(buf: []const u8, radix: u8) !u64 {
    ...</code></pre>
<p>
What it looks like to use this function varies depending on what you're
trying to do. One of the following:
</p>
<ul>
  <li>You want to provide a default value if it returned an error.</li>
  <li>If it returned an error then you want to return the same error.</li>
  <li>You know with complete certainty it will not return an error, so want to unconditionally unwrap it.</li>
  <li>You want to take a different action for each possible error.</li>
</ul>
<p>If you want to provide a default value, you can use the <code>catch</code> expression:</p>
<pre><code class="language-zig">fn doAThing(str: []u8) void {
    const number = parseU64(str, 10) catch 13;
    // ...
}</code></pre>
<p>
In this code, <code>number</code> will be equal to the successfully parsed number, or
a default value of 13. The type of the right-hand side of the <code>catch</code> expression must
match the unwrapped error union type, or be of type <code>noreturn</code>.
</p>
<p>Let's say you wanted to return the error if you got one, otherwise continue with the
function logic:</p>
<pre><code class="language-zig">fn doAThing(str: []u8) !void {
    const number = parseU64(str, 10) catch |err| return err;
    // ...
}</code></pre>
<p>
  There is a shortcut for this. The <code>try</code> expression:
</p>
<pre><code class="language-zig">fn doAThing(str: []u8) !void {
    const number = try parseU64(str, 10);
    // ...
}</code></pre>
<p>
<code>try</code> evaluates an error union expression. If it is an error, it returns
from the current function with the same error. Otherwise, the expression results in
the unwrapped value.
</p>
<p>
  Maybe you know with complete certainty that an expression will never be an error.
  In this case you can do this:
</p>
<pre><code class="language-zig">const number = parseU64("1234", 10) catch unreachable;</code></pre>
<p>
Here we know for sure that "1234" will parse successfully. So we put the
<code>unreachable</code> keyword on the right hand side. <code>unreachable</code> generates
a panic in debug mode and undefined behavior in release mode. So, while we're debugging the
application, if there <em>was</em> a surprise error here, the application would crash
appropriately.
</p>
<p>There is no syntactic shortcut for <code>catch unreachable</code>. This encourages programmers
to think carefully before using it.</p>
<p>
Finally, you may want to take a different action for every situation. For that, we have
<code>if</code> combined with <code>switch</code>:
</p>
<pre><code class="language-zig">fn doAThing(str: []u8) void {
    if (parseU64(str, 10)) |number| {
        doSomethingWithNumber(number);
    } else |err| switch (err) {
        error.Overflow =&gt; {
            // handle overflow...
        },
        // we promise that InvalidChar won't happen (or crash in debug mode if it does)
        error.InvalidChar =&gt; unreachable,
    }
}</code></pre>
<p>
The important thing to note here is that if <code>parseU64</code> is modified to return a different
set of errors, Zig will emit compile errors for handling impossible error codes, and for not handling
possible error codes.
</p>
<p>
The other component to error handling is defer statements.
In addition to an unconditional <code>defer</code>, Zig has <code>errdefer</code>,
which evaluates the deferred expression on the block's exit path, but only if
the block is exited by returning an error.
</p>
<p>
Example:
</p>
<pre><code class="language-zig">fn createFoo(param: i32) !Foo {
    const foo = try tryToAllocateFoo();
    // now we have allocated foo. we need to free it if the function fails.
    // but we want to return it if the function succeeds.
    errdefer deallocateFoo(foo);

    const tmp_buf = allocateTmpBuffer() orelse return error.OutOfMemory;
    // tmp_buf is truly a temporary resource, and we for sure want to clean it up
    // before this block leaves scope
    defer deallocateTmpBuffer(tmp_buf);

    if (param &gt; 1337) return error.InvalidParam;

    // here the errdefer will not run since we're returning success from the function.
    // but the defer will run!
    return foo;
}</code></pre>
<p>
The neat thing about this is that you get robust error handling without
the verbosity and cognitive overhead of trying to make sure every exit path
is covered. The deallocation code always directly follows the allocation code.
</p>
<p>
A couple of other tidbits about error handling:
</p>
<ul>
  <li>These primitives give enough expressiveness that it's completely practical
      that failing to check for an error is a compile error. If you really want
      to ignore the error, you can use <code>catch unreachable</code> and
      get the added benefit of crashing in debug mode if your assumption was wrong.
  </li>
  <li>
    Since Zig understands error types, it can pre-weight branches in favor of
    errors not occurring. Just a small optimization benefit that is not available
    in other languages.
  </li>
  <li>
    There are no C++ style exceptions or stack unwinding or anything fancy like that.
    Zig simply makes it convenient to pass error codes around.
  </li>
</ul>

<h3 id="stdlib">Alternate Standard Library</h3>
<p>
Part of the Zig project is providing an alternative to libc.
</p>
<p>
libc has a lot of useful stuff in it, but it also has
<a href="https://gcc.gnu.org/ml/gcc/1998-12/msg00083.html">cruft</a>.
Since we're starting fresh here, we can create a new API without some
of the mistakes of the 70s still haunting us, and with our 20-20 hindsight.
</p>
<p>
Further, calling dynamically linked functions is
<a href="http://ewontfix.com/18/">slow</a>. Zig's philosophy is that compiling
against the standard library in source form is worth it. In C this would be
called Link Time Optimization - where you generate Intermediate Representation
instead of machine code and then do another compile step at link time. In Zig,
we skip the middle man, and create a single compilation unit with everything
in it, then run the optimizations.
</p>
<p>
So, you can choose to link against libc and take advantage of it, or you can
choose to ignore it and use the Zig standard library instead. Note, however,
that virtually every C library you depend on probably also depends on libc, which
drags libc as a dependency into your project. Using libc is still a first
class use case for Zig.
</p>
<h3 id="preprocessor-alternatives">Alternatives to the Preprocessor</h3>
<p>
The C preprocessor is extremely powerful. Maybe a little <em>too</em> powerful.
</p>
<p>
The problem with the preprocessor is that it turns one language into
two languages that don't know about each other.
</p>
<p>
Here are some examples of where the preprocessor messes things up:
</p>
<ul>
  <li>The compiler cannot catch even simple syntax errors in code that is
    excluded via <code>#ifdef</code>.
  </li>
  <li>
    IDEs cannot implement a function, variable, or field renaming feature that
    works correctly. Among other mistakes, it will miss renaming things that are
    in code excluded via <code>#ifdef</code>.
  </li>
  <li>
    Preprocessor defines do not show up in debug symbols by default.
  </li>
  <li>
    <code>#include</code> is the single biggest contributor to slow compile times in both C and C++.
  </li>
  <li>
    Preprocessor defines are problematic for bindings generators for other languages.
  </li>
</ul>
<p>
Regardless of the flaws, we C programmers find ourselves using the preprocessor
because it provides necessary features, such as conditional compilation,
a constant that can be used for array sizes, and generics.
</p>
<p>
Zig plans to provide better alternatives to solve these problems. For example,
the constant expression evaluator of Zig allows you to do this:
</p>
<pre><code class="language-zig">const array_len = 10 * 2 + 1;
const Foo = struct {
    array: [array_len]i32,
};</code></pre>
<p>
This is not an amazing concept, but it eliminates one use case for <code>#define</code>.
</p>
<p>
Next, conditional compilation. In Zig, compilation variables are available
via <code>@import("builtin")</code>.
</p>
<p>
The declarations available in this import evaluate to constant expressions.
You can write normal code using these constants:
</p>
<pre><code class="language-zig">const builtin = @import("builtin");
fn doSomething() void {
    if (builtin.mode == builtin.Mode.ReleaseFast) {
        // do the release behavior
    } else {
        // do the debug behavior
    }
}</code></pre>
<p>
This is
<a href="zig-programming-language-blurs-line-compile-time-run-time.html">guaranteed to leave out the if statement when the code is generated</a>.
</p>
<p>
One use case for conditional compilation is demonstrated in
<a href="http://libsound.io/">libsoundio</a>:
</p>
<pre><code class="language-c">static const enum SoundIoBackend available_backends[] = {
#ifdef SOUNDIO_HAVE_JACK
    SoundIoBackendJack,
#endif
#ifdef SOUNDIO_HAVE_PULSEAUDIO
    SoundIoBackendPulseAudio,
#endif
#ifdef SOUNDIO_HAVE_ALSA
    SoundIoBackendAlsa,
#endif
#ifdef SOUNDIO_HAVE_COREAUDIO
    SoundIoBackendCoreAudio,
#endif
#ifdef SOUNDIO_HAVE_WASAPI
    SoundIoBackendWasapi,
#endif
    SoundIoBackendDummy,
};</code></pre>
<p>
Here, we want a statically sized array to have different contents depending on
whether we have certain libraries present.
</p>
<p>
In Zig, it would look something like this:
</p>
<pre><code class="language-zig">const opts = @import("build_options");
const available_backends =
    (if (opts.have_jack)
        []SoundIoBackend{SoundIoBackend.Jack}
    else
        []SoundIoBackend{})
    ++
    (if (opts.have_pulse_audio)
        []SoundIoBackend{SoundIoBackend.PulseAudio}
    else
        []SoundIoBackend{})
    ++
    (if (opts.have_alsa)
        []SoundIoBackend{SoundIoBackend.Alsa}
    else
        []SoundIoBackend{})
    ++
    (if (opts.have_core_audio)
        []SoundIoBackend{SoundIoBackend.CoreAudio}
    else
        []SoundIoBackend{})
    ++
    (if (opts.have_wasapi)
        []SoundIoBackend{SoundIoBackend.Wasapi}
    else
        []SoundIoBackend{})
    ++
    []SoundIoBackend{SoundIoBackend.Dummy};
</code></pre>
<p>
Here we take advantage of the compile-time array concatenation operator, <code>++</code>.
It's a bit more verbose than the C equivalent, but the important thing is that it's
one language, not two.
</p>
<p>
Finally, generics.
<a href="zig-programming-language-blurs-line-compile-time-run-time.html">Zig implements generics by allowing programmers to mark
parameters to functions as known at compile-time</a>.
</p>
<h2 id="tetris">Milestone: Tetris Implemented in Zig</h2>
<p>
This past week I achieved a fun milestone: a fully playable Tetris clone
implemented in Zig, with the help of libc,
<a href="http://www.glfw.org/">GLFW</a>, and
<a href="http://www.libpng.org/pub/png/libpng.html">libpng</a>.
</p>
<p>
If you're using Linux on the x86_64 architecture, which is currently the
only supported target, you could
<a href="https://ziglang.org/download/">download a Zig build</a>
and then
<a href="https://github.com/andrewrk/tetris#building-and-running">build this Tetris game</a>.
</p>
<p>
Otherwise, here's a video of me demoing it:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/AiintPutWrE" frameborder="0" allowfullscreen></iframe>

<h2 id="resources">Resources</h2>
<p>
If you are interested in the language, feel free to participate.
</p>
<ul>
  <li><a href="https://ziglang.org/">Home Page</a></li>
  <li><strong>Source code and issue tracker</strong>:
    <a href="https://github.com/ziglang/zig">https://github.com/ziglang/zig</a>
  </li>
  <li><strong>IRC channel</strong>: <code>#zig</code> on Freenode</li>
  <li><strong>Financial Support</strong>: <a href="https://github.com/users/andrewrk/sponsorship">Become a sponsor</a></li>
  <li><a href="https://ziglang.org/documentation/master/">Official Documentation</a></li>
</ul>
]]></description>
      </item>
      <item>
         <title>Turn Your Raspberry Pi into a Music Player Server</title>
         <pubDate>Fri, 20 Jun 2014 00:58:08 GMT</pubDate>

         <link>https://andrewkelley.me/post/raspberry-pi-music-player-server.html</link>
         <guid>https://andrewkelley.me/post/raspberry-pi-music-player-server.html</guid>
         <description><![CDATA[<h1>Turn Your Raspberry Pi Into a Music Player Server</h1>
<p>
A few months ago I published
<a href="quest-build-ultimate-music-player.html">My Quest to Build the Ultimate Music Player</a>,
where I described some of the trials and tribulations that led to
<a href="http://groovebasin.com/">Groove Basin</a>, an open-source
music player server that I've been building off and on for almost 4 years.
</p>
<p>
It ships with a web-based client, which looks like this:
</p>
<a href="http://superjoe.s3.amazonaws.com/blog-files/raspberry-pi-music-player-server/groovebasin-screenshot.png">
<img src="http://superjoe.s3.amazonaws.com/blog-files/raspberry-pi-music-player-server/groovebasin-screenshot.png" alt="Groove Basin screenshot">
</a>
<p>
You can also tinker with the <a href="http://demo.groovebasin.com/">live demo version</a>.
</p>
<p>
If you install this on a Raspberry Pi, you can attach speakers to it and use it as a music
player which you can control remotely, and you can remotely listen to your music by pressing
the "stream" button in the browser.
</p>
<p>
Before I get into it, however, I would like to point out that if you're deciding whether or
not to get a Raspberry Pi, the answer is <strong>no</strong>.
The Raspberry Pi is overhyped - this is why I'm
writing this guide - and there are much better alternatives. I'll mention one here,
the <a href="http://beagleboard.org/Products/BeagleBone+Black">Beagle Bone Black</a>.
Update: another good one is the
<a href="http://www.solid-run.com/products/hummingboard/">Hummingboard</a>.
</p>
<p>
Why you should get this instead:
</p>
<ul>
  <li>Faster CPU and Memory
    <ul>
      <li>1GHz processor instead of 700MHz</li>
      <li>DDR3 instead of SDRAM</li>
    </ul>
  </li>
  <li>
    <p>
   It can run Debian or Ubuntu armhf directly instead of having to run something like
   Raspbian. It's silly that Raspbian exists when there is already an armhf port of Debian.
   If you just install normal armhf Ubuntu on the Beagle Bone Black then this entire guide
   is unnecessary and you can just do
    </p>

   <pre>
# apt-add-repository ppa:andrewrk/libgroove
# apt-get update
# apt-get install libgroove-dev</pre>

   <p>
   And presto, you're done.
   </p>

   <p>
   In fact, libgroove is in
   <a href="http://packages.qa.debian.org/libg/libgroove.html">Debian Testing</a> and
   <a href="https://launchpad.net/ubuntu/utopic/+source/libgroove">Ubuntu Utopic Unicorn</a>,
   so in a year or so when these distributions are updated,
   you won't even need to add the extra PPA.
   </p>
  </li>
  <li>
   Debian <a href="https://wiki.debian.org/RaspberryPi">officially recommends <em>against</em> the Raspberry Pi</a>, notably because there is non-free software required to run it. Debian
   specifically endorses the Beagle Bone Black.
  </li>
</ul>

<p>
If you are like me, and you unfortunately purchased a Raspberry Pi before you became
educated about better options, then you'll have to jump through some hoops to get
this working. This article will hold your hand and guide you through all the hoops so
that you don't have to waste time figuring it out yourself.
</p>
<p>
I'll start this guide at the point where you have a fresh Raspberry Pi and don't even
have an operating system yet. If you're past this point then
<a href="#install-groove-basin">skip ahead</a>.
</p>
<h2 id="table-of-contents">Table of Contents</h2>
<ol>
  <li><a href="#table-of-contents">Table of Contents</a></li>
  <li><a href="#install-raspbian">Install Raspbian and get SSH Access</a></li>
  <li><a href="#install-groove-basin">Install Groove Basin</a></li>
</ol>
<h2 id="install-raspbian">Install Raspbian and get SSH Access</h2>
<p>
Head over to the
<a href="http://www.raspberrypi.org/downloads/">Raspberry Pi downloads page</a>
and grab the Raspbian Debian Wheezy torrent (or download directly if you're not l33t).
</p>
<p>
Unzip to get the .img file out and flash it to the biggest SD card you have. You'll want
lots of room for music!
</p>
<p>
I'm on Ubuntu - all I had to do was right-click on the .img file in nautilus,
Open With, Disk Image Writer:
</p>
<img alt="Disk Image Writer" src="http://superjoe.s3.amazonaws.com/blog-files/raspberry-pi-music-player-server/disk-image-writer.png">
<p>
I'm sure there are plenty of ways to get the job done; this was easiest for me.
</p>
<p>
Once that's done, find a keyboard, monitor, and HDMI cable so that you can see
what you're doing. Our goal is to get SSH access going as soon as possible so
that we can work on the Pi without plugging things into it other than the power
and network cables.
</p>
<p>
Once the Pi boots up for the first time it gives you a menu of things you can do.
Here's what I did:
</p>
<ul>
  <li>Expand the file system to fit the full SD card.</li>
  <li>Set pi user password.</li>
  <li>Advanced Options, enable SSH server.</li>
</ul>
<p>
Now let's set it up so that it always binds to the same IP address when it boots up.
These are the settings I used, obviously you should tweak them to your network's needs:
</p>
<pre>sudo vi /etc/network/interfaces</pre>
<p>
Replace <code>iface eth0 inet dhcp</code> with:
</p>
<pre>
auto eth0
iface eth0 inet static
    address 192.168.1.99
    netmask 255.255.255.0
    gateway 192.168.1.1</pre>

<p>
Now we can unplug the TV and keyboard, we won't be needing this junk anymore.
Plug that Raspberry Pi into your network and power it on!
</p>
<p>
On your normal computer that you're used to using, you can now ssh to the Pi,
something like this:
</p>
<pre>ssh pi@192.168.1.99</pre>
<p>
I like to put an entry in my <code>~/.ssh/config</code> file like this:
</p>
<pre>
host pi
hostname 192.168.1.99
user pi</pre>
<p>
This still makes you type in your password each time, but we can fix that:
</p>
<pre>ssh-copy-id pi</pre>
<p>
Now connecting to the Pi is as simple as <code>ssh pi</code>.
</p>
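<p>
Note that <code>ssh-copy-id</code> assumes you already have a key pair on your machine. If you don't, generate one first. A sketch (the output path here is just for illustration; omit <code>-f</code> to accept the default location under <code>~/.ssh/</code>):
</p>

```shell
# Generate an RSA key pair with an empty passphrase at an illustrative path.
ssh-keygen -t rsa -b 2048 -N "" -f ./pi_demo_key -q
# Two files are created: the private key, and the .pub public key
# that ssh-copy-id appends to the Pi's authorized_keys.
ls ./pi_demo_key ./pi_demo_key.pub
```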
<p>
The first thing to do here is update all the outdated packages to the latest.
</p>
<pre>
$ sudo apt-get update
$ sudo apt-get dist-upgrade</pre>
<p>
Hmm, what's that I see there?
</p>
<pre>
...
Unpacking replacement raspberrypi-bootloader
...
</pre>
<p>
Bootloader replaced huh? Better reboot to make sure that still works.
</p>
<p>
Alright, at this point we are able to ssh into our Raspberry Pi and all the packages
that come installed are fully updated.
</p>
<h2 id="install-groove-basin">Install Groove Basin</h2>
<p>
First let's install some packages:
</p>
<pre>$ sudo apt-get install htop vim git cmake screen</pre>
<p>
I recommend that you do this work in something like
<a href="http://www.gnu.org/software/screen/">screen</a> or
<a href="http://tmux.sourceforge.net/">tmux</a> so that if the connection is dropped,
the commands we're running will continue. Also this allows us to disconnect and go
do something else while the Pi crunches numbers.
</p>
<p>
I'm going to explain how to do this one step at a time for clarity. However, note that
there are essentially 3 compilations that will take a very long time. If you want to
start those 3 in parallel and then walk away from the computer for 8 hours or so, you
can begin them all now and skip around this guide as needed. Those 3 things are the
<code>make</code> steps of:
</p>
<ul>
  <li>SDL2</li>
  <li>libav</li>
  <li>Node.js</li>
</ul>
<h3>libgroove, Part 1</h3>
<p>
Get the libgroove source code and create a build folder inside of the source:
</p>
<pre>
$ cd
$ git clone https://github.com/andrewrk/libgroove
$ cd libgroove
$ mkdir build
$ cd build</pre>
<p>
Let's build in debug mode so that if we happen upon any errors we can get a useful stack trace.
</p>
<p>
Note: you can skip the following step, which takes a minute or two to complete;
this particular command is just for your information, and I have reproduced the
output below:
</p>
<pre>$ cmake .. -DCMAKE_BUILD_TYPE=Debug</pre>
<pre>
Installation Summary
--------------------
* Install Directory            : /usr/local
* Build libgroove              : missing dependencies
* Build libgrooveplayer        : missing dependencies
* Build libgrooveloudness      : missing dependencies
* Build libgroovefingerprinter : yes

Bundled Dependencies
--------------------
* SDL2                         : ready to build
* libav                        : missing dependencies, see below
* libebur128                   : ready to build

System Dependencies
-------------------
* C99 Compiler                 : OK
* threads                      : OK
* SDL2                         : not found - will use bundled version
* ebur128                      : not found - will use bundled version
* chromaprint                  : not found
* libavformat                  : not found - will use bundled version
* libavcodec                   : not found - will use bundled version
* libavfilter                  : not found - will use bundled version
* libavutil                    : not found - will use bundled version
* yasm                         : not found
* bzip2                        : not found
* mp3lame                      : not found
* zlib                         : OK</pre>

<p>
It's missing these libraries:
</p>
<ul>
  <li>chromaprint</li>
  <li>libebur128</li>
  <li>SDL2</li>
  <li>libav</li>
</ul>
<p>
We could let libgroove install with the bundled dependencies, but it will be easier to
just install those dependencies on the system first. Let's do that.
</p>
<h3>chromaprint</h3>
<p>
Luckily chromaprint is in the repository already:
</p>
<pre>$ sudo apt-get install libchromaprint-dev</pre>
<h3>libebur128</h3>
<p>
Next we compile the easy one, libebur128.
</p>
<pre>
$ cd
$ git clone https://github.com/jiixyj/libebur128
$ cd libebur128
$ mkdir build
$ cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Debug</pre>
<p>
Oops, looks like we're missing a dependency:
</p>
<pre>
-- checking for module 'speexdsp'
--   package 'speexdsp' not found</pre>

<p>
Better install that.
</p>
<pre>$ sudo apt-get install libspeexdsp-dev</pre>
<p>Let's try that configure line again:</p>
<pre>$ cmake .. -DCMAKE_BUILD_TYPE=Debug</pre>
<pre>
-- checking for module 'speexdsp'
--   found speexdsp, version 1.2rc1
-- speexdsp library dirs: 
-- speexdsp cflags: 
-- speexdsp include dirs: 
-- speexdsp libraries: speexdsp
-- speexdsp ldflags: 
-- status          found / disabled --
-- queue.h:        yes     using system copy of queue.h
-- speexdsp:       yes     no 
-- not building tests, set ENABLE_TESTS to ON to enable
-- Configuring done
-- Generating done
-- Build files have been written to: /home/pi/libebur128/build</pre>

<p>
That's better.
</p>
<p>
Now compile and then install the code:
</p>
<pre>
$ make
$ sudo make install</pre>

<pre>
[ 50%] Built target ebur128
[100%] Built target ebur128_static
Install the project...
-- Install configuration: "Debug"
-- Up-to-date: /usr/local/include/ebur128.h
-- Installing: /usr/local/lib/arm-linux-gnueabihf/libebur128.so.1.0.1
-- Installing: /usr/local/lib/arm-linux-gnueabihf/libebur128.so.1
-- Installing: /usr/local/lib/arm-linux-gnueabihf/libebur128.so
-- Installing: /usr/local/lib/arm-linux-gnueabihf/libebur128.a</pre>

<p>
Argh, it put the library files in <code>/usr/local/lib/arm-linux-gnueabihf/</code>
due to a <a href="https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=739876">bug</a>
in the Debian cmake package.
</p>

<p>
Let's hack around that:
</p>
<pre>
$ sudo mv /usr/local/lib/arm-linux-gnueabihf/* /usr/local/lib/
$ sudo rmdir /usr/local/lib/arm-linux-gnueabihf
$ sudo ldconfig</pre>

<h3>SDL2</h3>
<p>
Next let's get SDL2 going. What's wrong with the SDL that comes with the Raspberry Pi?
Well, it's version 1.2, and libgroove depends on version 2.
</p>

<p>
Find the URL for the source code of the latest SDL2 version on the
<a href="https://www.libsdl.org/download-2.0.php">SDL download page</a>.
</p>

<pre>
$ cd
$ wget https://www.libsdl.org/release/SDL2-2.0.3.tar.gz
$ tar xvf SDL2-2.0.3.tar.gz 
$ cd SDL2-2.0.3/</pre>

<p>
SDL needs an audio backend to work, so we install that now, before the configure command:
</p>

<pre>$ sudo apt-get install libasound2-dev</pre>

<p>
We only need the audio features of SDL2 and in fact some of the video stuff can cause
compilation problems. So we'll disable all the features we don't need when we configure.
</p>

<pre>
$ ./configure --enable-audio --disable-video --disable-render --disable-events --disable-joystick --disable-haptic --disable-power --disable-file --disable-timers --disable-loadso --disable-cpuinfo</pre>

<p>
This could take a while, but when it's done you should see a line like this:
</p>

<pre>Audio drivers   : disk dummy oss alsa(dynamic)</pre>

<p>
It's important that you have an audio driver other than disk, dummy, and oss.
</p>

<pre>$ make</pre>

<p>
Find something to do, this is going to take upwards of an hour to complete.
Or if you want to slap your poor Raspberry Pi into submission, this would be
the time to skip around in this guide and get libav compiling at the same time.
But again, I'm going to pretend that you're doing this sequentially and let you
deal with thinking about how to skip around the article.
</p>

<p>
So at this point that long compilation process succeeded and we're ready to install SDL2:
</p>

<pre>$ sudo make install</pre>

<h3>libav</h3>

<p>Let's go back to your home folder (or wherever you decided to do this):</p>

<pre>$ cd</pre>

<p>
Grab the URL to the latest libav 10 release from the
<a href="http://www.libav.org/download.html">libav downloads page</a>.
</p>

<pre>
$ wget http://www.libav.org/releases/libav-10.1.tar.gz
$ tar xvf libav-10.1.tar.gz 
$ cd libav-10.1/</pre>

<p>
Let's get some prerequisites out of the way and then start configuring:
</p>

<pre>
$ sudo apt-get install libmp3lame-dev libvorbis-dev
$ ./configure --enable-shared --enable-debug --disable-static --enable-gpl --enable-libmp3lame --enable-libvorbis</pre>

<p>
Again the Pi is going to have to work <em>really hard</em> to complete this configure command,
especially if you're simultaneously compiling SDL2, so don't worry if it takes a minute or two.
</p>

<pre>$ make</pre>

<p>
This is going to take upwards of 8 hours. Seriously, I'd start this one and then go to bed.
If you're trying to start all the compilations simultaneously, you might also want to start
Node.js compiling as well.
</p>

<p>After libav compilation succeeds:</p>

<pre>$ sudo make install</pre>

<p>
Now we've finally finished installing libgroove's dependencies and we can finally move on to
installing libgroove itself.
</p>

<h3>libgroove, Part 2</h3>

<p>
So at this point, you've waited a very long time and the Pi has successfully finished
compiling libav and SDL2, and you have installed both of them. If this is not true, then
you need to figure out why and fix it before progressing with this guide.
</p>

<pre>
$ sudo ldconfig
$ cd ~/libgroove/build/
$ cmake .. -DCMAKE_BUILD_TYPE=Debug</pre>
<pre>
Installation Summary
--------------------
* Install Directory            : /usr/local
* Build libgroove              : yes
* Build libgrooveplayer        : yes
* Build libgrooveloudness      : yes
* Build libgroovefingerprinter : yes

Bundled Dependencies
--------------------
* SDL2                         : using system library
* libav                        : using system libraries
* libebur128                   : using system library

System Dependencies
-------------------
* C99 Compiler                 : OK
* threads                      : OK
* SDL2                         : OK
* ebur128                      : OK
* chromaprint                  : OK
* libavformat                  : OK
* libavcodec                   : OK
* libavfilter                  : OK
* libavutil                    : OK</pre>

<p>
Ah that output looks much better than before.
</p>

<pre>
$ make
</pre>

<p>
This make should be relatively quick.
</p>

<pre>
$ sudo make install
$ sudo ldconfig</pre>

<p>
At this point we have the necessary libraries installed:
</p>

<pre>$ ls /usr/local/lib/</pre>
<pre>
libavcodec.so           libebur128.a                     libgroove.so.4
libavcodec.so.55        libebur128.so                    libgroove.so.4.1.0
libavcodec.so.55.34.1   libebur128.so.1                  libSDL2-2.0.so.0
libavdevice.so          libebur128.so.1.0.1              libSDL2-2.0.so.0.2.1
libavdevice.so.54       libgroove.a                      libSDL2.a
libavdevice.so.54.0.0   libgroovefingerprinter.a         libSDL2.la
libavfilter.so          libgroovefingerprinter.so        libSDL2main.a
libavfilter.so.4        libgroovefingerprinter.so.4      libSDL2.so
libavfilter.so.4.2.0    libgroovefingerprinter.so.4.1.0  libSDL2_test.a
libavformat.so          libgrooveloudness.a              libswscale.so
libavformat.so.55       libgrooveloudness.so             libswscale.so.2
libavformat.so.55.12.0  libgrooveloudness.so.4           libswscale.so.2.1.2
libavresample.so        libgrooveloudness.so.4.1.0       pkgconfig
libavresample.so.1      libgrooveplayer.a                python2.7
libavresample.so.1.1.0  libgrooveplayer.so               python3.2
libavutil.so            libgrooveplayer.so.4             site_ruby
libavutil.so.53         libgrooveplayer.so.4.1.0
libavutil.so.53.3.0     libgroove.so</pre>

<h3>Node.js</h3>
<p>
Now we need
<a href="http://nodejs.org">Node.js</a>. Get the latest stable source code from the
<a href="http://nodejs.org/download/">downloads page</a>.
</p>

<pre>
$ cd
$ wget http://nodejs.org/dist/v0.10.29/node-v0.10.29.tar.gz
$ tar xvf node-v0.10.29.tar.gz 
$ cd node-v0.10.29/
$ ./configure
$ make</pre>

<p>
This compilation process will take several hours.
</p>
<p>
Once it's done:
</p>

<pre>$ sudo make install</pre>

<h3>Groove Basin</h3>
<p>
Now it is time to install Groove Basin, the music player server.
</p>

<pre>
$ cd
$ git clone https://github.com/andrewrk/groovebasin
$ cd groovebasin/
$ npm run build</pre>

<p>
This step can take several minutes - it downloads and compiles Groove Basin's dependencies.
</p>

<p>
Let's make the music directory if we don't already have one.
</p>

<pre>$ mkdir ~/music/</pre>

<p>
Copy all your music there at this point if you have any.
</p>

<pre>$ node lib/server.js</pre>

<p>
Now you should be up and running. If you want to change configuration options,
kill the server with Ctrl+C and edit <code>config.js</code>.
</p>
<p>
Enjoy! Feel free to follow
<a href="https://github.com/andrewrk/groovebasin">Groove Basin on GitHub</a>,
<a href="https://github.com/andrewrk/groovebasin/issues/new">file a bug report or feature request</a>, or join <code>#libgroove</code> on irc.freenode.org to discuss or get help troubleshooting.
</p>
<p>
Pull requests are welcome, especially ones that make <a href="https://github.com/andrewrk/groovebasin.com">groovebasin.com</a> look nicer.
</p>
]]></description>
      </item>
      <item>
         <title>Laptop Review - Bonobo Extreme</title>
         <pubDate>Thu, 12 Jun 2014 23:25:14 GMT</pubDate>

         <link>https://andrewkelley.me/post/laptop-review-bonobo-extreme.html</link>
         <guid>https://andrewkelley.me/post/laptop-review-bonobo-extreme.html</guid>
         <description><![CDATA[<h1>Bonobo Extreme Review</h1>
<p>
I used to have a
<a href="http://www.dell.com/us/business/p/xps-13-linux/pd">Dell Inspiron XPS 13 Developer Edition</a>.
It was fine, but I needed something with a better graphics card because I wanted to learn
some of the new <a href="http://www.opengl.org/sdk/docs/man/">OpenGL</a> features by prototyping
a <a href="https://www.youtube.com/watch?v=1bqlZaXzh4c">3D space flight simulator game</a>. 
</p>
<p>
I decided to buy the <a href="https://system76.com/laptops/model/bonx8">Bonobo Extreme</a> laptop
which comes with <a href="http://www.ubuntu.com/">Ubuntu</a> pre-installed.
</p>
<p>
This thing is a monster. It's huge, bulky, and heavy. Check out the power brick:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/laptop-review-bonobo-extreme/bonox-brick.jpg">
<p>
Yes, that is a standard IEC 60320 C13 power supply cord going into the laptop brick.
</p>
<p>
Here are the peripherals:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/laptop-review-bonobo-extreme/bonox-peripherals-left.jpg">
<br>
<img src="http://superjoe.s3.amazonaws.com/blog-files/laptop-review-bonobo-extreme/bonox-peripherals-right.jpg">
<br>
<img src="http://superjoe.s3.amazonaws.com/blog-files/laptop-review-bonobo-extreme/bonox-back.jpg">
<h2>Compliments</h2>
<ul>
  <li>Plenty fast and performant. Handles any game I've thrown at it, no problem. <code>make -j8</code> like a boss.</li>
  <li>The Ubuntu logo on the super key instead of Windows logo is pretty fun.</li>
  <li>Drivers and everything work out of the box. This is <em>huge</em>.</li>
  <li><strong>Battery life</strong> is surprisingly good for this beast of a machine.
      I can get upwards of 4 hours easily. Of course this goes down when I run
      the CPU or GPU hot.
  </li>
  <li>The keyboard button to turn off the LCD display is surprisingly handy.</li>
  <li>The built-in speakers are loud and sound decent for a laptop.</li>
</ul>
<h2>Criticism</h2>
<h3>Driver/Hardware</h3>
<ul>
  <li><strong>Dead pixels</strong> after 1 month of use. Now I'm 8 months in and I have an entire column of about 20
      dead pixels in a row. Based on reading the
      <a href="http://ubuntuforums.org/forumdisplay.php?f=341">support forum</a> this is normal.
  </li>
  <li>
    Sometimes it tries to connect to a wired network when there is no network cable plugged in.
    This may be an Ubuntu bug.
  </li>
  <li>
    The power cable is not a perfect fit. It feels like you must jam it in there
    and it's not clear when it is "in".
  </li>
  <li>
    Every two weeks or so, the system has a complete hardware failure and powers off.
    It's not due to heat, it's a weird hardware problem. And it's not only me; my
    roommate has the same laptop and it happens to him too.
  </li>
</ul>
<h3>Keyboard</h3>
<img src="http://superjoe.s3.amazonaws.com/blog-files/laptop-review-bonobo-extreme/bonox-keyboard.jpg">
<ul>
  <li>
  It's not possible to press Shift+Home or Shift+End, which are extremely common and
  handy shortcuts when editing text. I'd rather give up the entire numpad than give up
  Shift+Home and Shift+End.
  </li>
  <li>
  It took me a long time to not accidentally press spacebar with my left thumb when I
  wanted to press Alt. Spacebar goes pretty far to the left.
  </li>
  <li>
  The key between Spacebar and Right Alt is completely useless. It displays a pipe and
  backslash on it ("| \") but it actually inserts a "&lt;" or a "&gt;" depending on
  whether I press shift. None of these 4 things are more convenient than the canonical
  position of the respective key it replaces. I'd rather have the traditional key that
  pops the context menu for the widget that is in focus.
  </li>
  <li>
  There is a button to put the computer to sleep which is between Mute and Volume Down. I
  press it by accident sometimes.
  </li>
</ul>

<h3>Touchpad</h3>
<ul>
  <li>It is difficult to click without moving the cursor.</li>
  <li>The difference between left click and right click is subtle. It's very
  easy to do the wrong one.
  </li>
</ul>

<h2>Curiosity</h2>
<p>
I noticed that the HDD LED flashes whenever I toggle numlock. I wonder why that is.
It doesn't do that for caps lock.
</p>

<h2>Conclusion</h2>
<p>
Lots of room for improvement. I think it's overpriced given the problems that it has.
Getting a powerful machine for which you know for sure all the drivers will
work with Linux is nothing to sneeze at. However, it's probably worth it to spend your
money on a system for which the hardware has undergone more quality assurance.
</p>
]]></description>
      </item>
      <item>
         <title>My Quest to Build the Ultimate Music Player</title>
         <pubDate>Tue, 22 Apr 2014 17:39:04 GMT</pubDate>

         <link>https://andrewkelley.me/post/quest-build-ultimate-music-player.html</link>
         <guid>https://andrewkelley.me/post/quest-build-ultimate-music-player.html</guid>
         <description><![CDATA[<h1>My Quest to Build the Ultimate Music Player</h1>
<p>
Over the past few years, I have been slowly but surely building my own music player.
It's been a wild ride. The codebase has radically changed several times, but is always
converging on a better music listening experience.
</p>
<p>
In this article my goal is to take you along for the ride.
</p>
<p>
<a href="https://github.com/andrewrk/groovebasin">See the project on GitHub</a>
</p>
<h2 id="toc">Table of Contents</h2>
<ol>
  <li><a href="#toc">Table of Contents</a></li>
  <li><a href="#amarok14">I &lt;3 Amarok 1.4</a></li>
  <li><a href="#loudness-intro">A Short Explanation of Loudness Compensation</a></li>
  <li><a href="#amarok14-shortcomings">Shortcomings of Amarok 1.4</a></li>
  <li><a href="#laundry">My Laundry List of Music Player Features</a></li>
  <li><a href="#partybeat">"PartyBeat"</a></li>
  <li><a href="#tech">Fumbling Around with Technology</a></li>
  <li><a href="#growing">Growing Pains</a></li>
  <li><a href="#rejecting-mpd">Rejecting MPD</a></li>
  <li><a href="#building-a-backend">Building a Music Player Backend</a></li>
  <li><a href="#packaging">Packaging</a></li>
  <li><a href="#libav-contributions">Contributing to libav</a></li>
  <li><a href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="amarok14">I &lt;3 Amarok 1.4</h2>
<p>
Back in 2009, my music player of choice was Amarok 1.4.
This was by far the best music player I had ever used on Windows, Mac, or Linux,
especially when combined with the wonderful
<a href="http://kde-apps.org/content/show.php?content=26073">ReplayGain plugin</a>.
Here's a screenshot:
</p>
<img src="http://s3.amazonaws.com/superjoe/blog-files/quest-build-ultimate-music-player/amarok-1.4-screenshot.png">
<p>
One way you can tell how much people loved this music player is by looking at the comments
on the release blog articles for Amarok <em>2.0</em>, which rebuilt the player from scratch and
took it in a completely different direction. Arguably, they should have picked a different
project name and logo. Look at some of these comments, how angry and vitriolic they are:
<a href="http://amarok.kde.org/en/releases/2.0">2.0</a>
<a href="http://amarok.kde.org/en/releases/2.1">2.1</a>
<a href="http://amarok.kde.org/en/releases/2.2">2.2</a>
</p>

<p>
Even now, 4 years later, the project is at version 2.8 and the release is titled
<a href="http://amarok.kde.org/en/releases/2.8">"Return To The Origin"</a>:
</p>

<blockquote>Amarok 2.8 is titled "Return To The Origin" as we are bringing back the polish that many users loved from the original 1.x series!</blockquote>

<p>
That should give you an idea of how much respect Amarok 1.4 commanded.
</p>

<p>
Even so, it was not perfect. Notably, the ReplayGain plugin I mentioned above had several
shortcomings. Before I get into that, however, let me take a detour and explain what
ReplayGain, or more generally, loudness compensation, is and some of its implications.
</p>

<h2 id="loudness-intro">A Short Explanation of Loudness Compensation</h2>
<p>
Have you ever seen this 2-minute video explaining the Loudness War?
</p>
<iframe width="420" height="315" src="https://www.youtube.com/embed/3Gmex_4hreQ" frameborder="0" allowfullscreen></iframe>
<p>
The video demonstrates a trend in digital audio mastering where songs are
highly compressed to sound louder, and how this can compromise the integrity
of the music.
</p>
<p>
While thinking about building a music player, we're not going to make
moral judgments about whether or not compression is ruining music for
everybody. If users want to listen to highly compressed music, that's a
valid use case. So we have to consider a music library which contains both
compressed songs and dynamic songs.
</p>
<p>
Here is a song called <em>The Happiest Days of Our Lives</em> by Pink Floyd, mastered in 1979:
</p>
<img src="http://s3.amazonaws.com/superjoe/blog-files/quest-build-ultimate-music-player/pink-floyd-waveform.png">
<p>
Here is a song called <em>Saying Sorry</em> by Hawthorne Heights, mastered in 2006:
</p>
<img src="http://s3.amazonaws.com/superjoe/blog-files/quest-build-ultimate-music-player/saying-sorry-waveform.png">
<p>
It is immediately obvious that the second one is much louder than the first.
So what happens when they are played one after the other in a music player?
</p>
<p>
When the quieter song comes on first, the user reaches for the volume knob to
turn it up so they can hear. Oops. When the next song begins, a surge of
adrenaline shoots through the user's body as they scramble to turn the volume
down. This goes beyond poor usability; this problem can cause hearing loss.
</p>
<p>
The solution is to analyze each song before playing it to figure out how "loud"
it sounds to humans. Then the music player adjusts the playback volume of each
track to compensate for the perceived loudness. This way, the user does not
have to adjust the volume for each track that comes on.
</p>
<p>
The idea is simple enough, but it poses a few subtle challenges.
</p>
<p>
For one, the loudness of an individual track might be different than the
loudness of the album as a whole. A complete loudness compensation solution
has to take this into account, both during scanning and playback.
</p>
<p>
An even trickier problem is avoiding <strong>clipping</strong>. Music is composed of samples
which have a fixed range. For example in floating point format, samples can be
between -1.0 and 1.0. Even quiet songs usually have some samples which peak
at 1.0, for example on the drums. But we need to turn the volume up on these
quiet songs to make them sound as loud as the highly compressed ones.
</p>
<p>
If we naïvely increased the volume on such a song, we would end up with
something like this:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/500px-Clipping.svg.png">
<p>
The grey bars above the red lines represent clipping. This causes distortion
and generally sounds awful.
</p>
<p>
The solution is not to increase the volume of the quiet song, but to
<em>decrease</em> the volume of the loud song. In order to do this, we
introduce an amount called <strong>pre-gain</strong>. All songs are turned down by this
amount, which gives us the <strong>headroom</strong> we need to turn the quieter ones
back up.
</p>
<p>
It's not a perfect solution though.
</p>
<p>
The lower the pre-gain, the quieter the music player will sound compared to other
applications on the computer. The higher the pre-gain, the more likely it is that
there is not enough headroom to bring a quiet song up far enough.
</p>
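<p>
To make this concrete, here is a minimal sketch in JavaScript of how a player might combine
a scanner's suggested gain with a pre-gain while preventing clipping. The -6&nbsp;dB pre-gain
value and the function names are illustrative assumptions, not what any particular player does.
</p>

```javascript
// Sketch: combine a ReplayGain-style suggested gain (in dB) with a
// fixed pre-gain, clamped so the track's peak sample never clips.
// The -6 dB pre-gain is an arbitrary value chosen for illustration.
const PRE_GAIN_DB = -6;

// Convert decibels to a linear amplitude multiplier.
function dbToFloat(db) {
  return Math.pow(10, db / 20);
}

// trackGainDb: the scanner's suggested adjustment for this track.
// trackPeak: the loudest sample in the track, as a fraction of full scale.
function playbackVolume(trackGainDb, trackPeak) {
  let multiplier = dbToFloat(trackGainDb + PRE_GAIN_DB);
  // If the adjustment would push the peak past full scale, back off
  // just enough to avoid clipping.
  if (trackPeak * multiplier > 1.0) {
    multiplier = 1.0 / trackPeak;
  }
  return multiplier;
}
```

<p>
A quiet track with headroom gets boosted; a loud track that would clip gets turned down instead.
</p>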
<p>
The 
<a href="http://wiki.hydrogenaudio.org/index.php?title=ReplayGain_1.0_specification#Clipping_prevention">ReplayGain 1.0 Specification</a>
outlines this in more detail.
</p>
<p>
In 2010, the European Broadcasting Union introduced a new standard called
<a href="https://tech.ebu.ch/loudness">R128</a>. This standard outlines a
strategy for analyzing media and determining how loud it is. There is a
<a href="http://wiki.hydrogenaudio.org/index.php?title=Talk:ReplayGain_2.0_specification">motion</a>
to make ReplayGain 2.0 use this standard.
</p>
<p>
I recommend this excellent Introduction to EBU R128 by Florian Camerer:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/iuEtQqC-Sqo" frameborder="0" allowfullscreen></iframe>
<h2 id="amarok14-shortcomings">Shortcomings of Amarok 1.4</h2>
<p>
As much as I loved Amarok 1.4, it did not even attempt to address these
loudness issues. There is no built-in loudness compensation.
</p>
<p>
The ReplayGain plugin I mentioned earlier was great, but it was limited in
usefulness:
</p>
<ul>
  <li>
  It had to scan every time the playlist updated; it didn't cache the data.
  </li>
  <li>
  Each format that you wanted to scan had a different command-line utility
  which had to be installed. This means that the set of songs that Amarok 1.4
  could play was completely different than the set of songs that it could scan.
  </li>
  <li>
  It applied the volume changes on a gradient instead of instantly, and timing
  was not precise. This means that it might erroneously turn up the loudness
  far too high in the transition time to the next track. This behavior was
  distracting and sometimes ear-piercingly painful.
  </li>
  <li>
  You had to manually decide between track and album mode. This is a pointless
  chore that the music player should do automatically. Here's a simple algorithm:
    <ul>
      <li>
        If the previous item in the playlist is the previous item from the
        same album, or the next item in the playlist is the next item from the
        same album, use the album ReplayGain information.
      </li>
      <li>
        Otherwise, use the track ReplayGain information.
      </li>
    </ul>
  </li>
</ul>
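<p>
That album-versus-track decision is easy to sketch in JavaScript. The
<code>albumId</code> and <code>trackNumber</code> field names here are hypothetical:
</p>

```javascript
// Sketch of the album-vs-track ReplayGain decision described above.
// `albumId` and `trackNumber` are hypothetical field names.
function selectGainMode(playlist, index) {
  const song = playlist[index];
  const prev = playlist[index - 1];
  const next = playlist[index + 1];
  const prevIsSameAlbum = prev != null && prev.albumId === song.albumId &&
    prev.trackNumber === song.trackNumber - 1;
  const nextIsSameAlbum = next != null && next.albumId === song.albumId &&
    next.trackNumber === song.trackNumber + 1;
  // Use album gain only when the playlist preserves album order here.
  return (prevIsSameAlbum || nextIsSameAlbum) ? 'album' : 'track';
}
```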
<p>
Aside from the loudness compensation, I had a couple other nits to pick:
</p>
<ul>
  <li>
  Dynamic Mode was a useful feature that could continually play random songs
  from the library. But the random selection was too random; it would often
  queue the same song within a short period of time.
  </li>
  <li>
  If a song's duration tag was incorrect, or if it was a variable bitrate MP3,
  the song would seemingly end before it had actually reached the end. In other
  words, the reported duration was incorrect and seeking was broken.
  </li>
</ul>
<p>
I've spent some time criticizing, now let me be more constructive and actually specify some features
that I think music players should have.
</p>
<h2 id="laundry">My Laundry List of Music Player Features</h2>

<h4>Loudness Compensation using the same scanner as decoder</h4>
<p>
This is absolutely crucial. If you want to solve the loudness compensation problem,
the set of songs which you can decode and play back <em>must</em> be the same set of songs
which you can scan for loudness. I should never have to manually adjust the volume because
a different song or album came on.
</p>
<p>
Ideally, loudness scanning should occur lazily when items are added to the
play queue and then the generated values should be saved so that the loudness
scanning would not have to be repeated.
</p>
<h4>Do not trust duration tags</h4>
<p>
A music player already must scan songs to determine loudness compensation values.
At the same time, it should determine the <em>true</em> duration of the file and use that
information instead of a tag which could be wrong.
</p>
<h4>If my friends come over, they can control the music playback</h4>
<p>
Friends should be able to upload and download music, as well as queue, skip, pause, and play.
</p>
<h4>Ability to listen to my music library even when I'm not home</h4>
<p>
I should be able to run the music player on my home computer and listen to a real-time
stream from work, for example.
</p>
<h4>Gapless Playback</h4>
<p>
Many albums are created in order to be a listening experience that transcends tracks. When
listening to an album, songs should play seamlessly and without volume changes at the seams.
This means that loudness scanning must automatically take into account albums.
</p>
<h4>Robust codec support</h4>
<p>
You know how when you need to play some obscure video format, you can always rely on
<a href="http://www.videolan.org/vlc/index.html">VLC</a> to play it? That must be true
for the ultimate music player as well. A music player must be able to play music.
If you don't have a wide range of codecs supported, you don't have a music player.
</p>
<h4>Keyboard Shortcuts for Everything</h4>
<p>
I should be allowed to never touch the mouse when operating the music player.
</p>
<h4>Clean up my messy files</h4>
<p>
One thing that Amarok 1.4 got right is library organization. It offered a powerful
way to specify the canonical location for a music file, and then it had an option to
organize properly tagged music files into the correct file location.
</p>
<p>
I don't remember the exact format, but you could specify a format something like this:
</p>
<pre>
%artist/%album/%track %title%extension
</pre>
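<p>
As a sketch, that substitution could look like this in JavaScript; the tag names are
assumptions based on the example pattern:
</p>

```javascript
// Toy version of format-string substitution for library organization.
// Supported tag names are assumed from the example pattern above.
function organizePath(format, song) {
  return format.replace(/%(artist|album|track|title|extension)/g,
    (_, tag) => String(song[tag]));
}
```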
<h4>Filter Search</h4>
<p>
There should be a text box where I can type search terms and instantly see
the search results live. And it should ignore diacritics.
For example, I could type "jonsi ik" and match the song <em>Boy Lilikoi</em> by Jónsi.
</p>
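<p>
A sketch of diacritic-insensitive matching in modern JavaScript: decompose each string to
NFD and strip the combining marks before comparing. The function names are mine:
</p>

```javascript
// Fold away diacritics: decompose to NFD, then drop combining marks.
function fold(s) {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();
}

// A song matches if every whitespace-separated term appears in its title.
function matches(query, title) {
  const folded = fold(title);
  return query.split(/\s+/).every(term => folded.includes(fold(term)));
}
```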
<h4>Playlist Mode that Automatically Queues Songs</h4>
<p>
Some names for this feature are:
</p>
<ul>
  <li>Dynamic Mode</li>
  <li>Party Mode</li>
  <li>DJ Mode</li>
</ul>
<p>
The idea is that it automatically queues songs - kind of like a real-time shuffle - so that
you don't have to manually decide what to listen to.
</p>
<p>
One common flaw found in many players is using a truly random algorithm. With true randomness,
it will frequently occur that a song which has recently been randomly chosen will be
randomly chosen again.
</p>
<p>
A more sophisticated algorithm weights songs by how long it has been since they were last
queued. Any song remains <em>possible</em> to queue, but songs that have not been
queued recently are much more likely to be chosen. <em>Queue date</em> is used rather than
<em>play date</em> because if a song is queued and the user skips it, that should still
count in the weight against it being chosen again.
</p>
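<p>
That weighting scheme can be sketched as follows; the <code>lastQueued</code> field and the
linear weighting are illustrative assumptions:
</p>

```javascript
// Weighted random selection: each song's weight is the time since it
// was last queued, so recently queued songs are unlikely but never
// impossible. `random` is injectable to make the sketch testable.
function pickWeighted(songs, now, random = Math.random) {
  const weights = songs.map(s => Math.max(now - s.lastQueued, 1));
  const total = weights.reduce((a, b) => a + b, 0);
  let r = random() * total;
  for (let i = 0; i < songs.length; i++) {
    r -= weights[i];
    if (r < 0) return songs[i];
  }
  return songs[songs.length - 1]; // guard against floating point edge cases
}
```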


<h2 id="partybeat">"PartyBeat"</h2>
<p>
It would be a long time before my wishlist of features would become a reality. Meanwhile,
back in college my buddy made a
<a href="https://github.com/royvandewater/partybeat">fun little project</a>
which served as a music player that multiple people could control at the same time
with a web interface. He installed it in my and my roommate's apartment, and the three of us
used it in our apartment as a shared jukebox; anarchy deciding what we would listen to
while we worked on our respective jobs, homework, or projects. We dubbed it "PartyBeat".
</p>

<p>
Here's a screenshot:
</p>
<img src="http://s3.amazonaws.com/superjoe/blog-files/quest-build-ultimate-music-player/partybeat-screenshot.png" alt="">

<p>
You might recognize that UI scheme - it is the
<a href="http://jqueryui.com/themeroller/#!zThemeParams=5d00000100f405000000000000003d8888d844329a8dfe02723de3e5701cc8cb8a0f4166c8f00304d318f2e435fc4b1a60d3060e2d83cad1a27ba4a195c0fdfa6106759866da03494897d32e163c077e94ad990b79693595f02c7d6aaf376f48bc4f93539f1debbdf1780ed214b39f6748c7b573d421fee062f48204a18c1b26257fb226d0648500cb0c23e67232df268d61dac818f59e901c624d3b8108c88253b9a193212a6d3e32f24e4bc27cef59ebfb0c1a8e74578b7f4518439ae6db3f583eb30e285467c5b8653d95420138655a9eab8cc18fa73c21f63d2d5846f9cd5484ef5fff6dd1841a214559944c64a5205d2cbc8553d9d6854b6974de93ec0d3f9ce36440fd405fe46fb43788bd677db8f1fb10ef32accf296607f1f6e759ecb2d1634ebba7a9d5faa212b209822da727ee49bfcdeea8c259bdef74551a1683acf3eef0bb1eb5d406f57fb6681a89b896577b3422535576bea00c42a646e4591e6cb6594d01aa535d9dcb6168d2a367a04eeb9fc7049e16a2a7c76ec027e93976579318875d6ded8fdd916b95191d21fe266fd9e81e3a62e221f2a6d35593cf37ce7d3131adc748c5a4164ab9680a1282158f3700bbd07a1900d4e500e1b1b9176e04d142f838df6a965b9dde8e9b4e0c094931b5d424e5267d3b7b5be3ced768b676cc785c5032ddb8c0df548a5222afe5d2655c7ecb4c32dfb7bfffa14872cc">Trontastic JQuery UI theme</a>.
</p>

<p>
This project used <a href="https://www.djangoproject.com/">Django</a> and
<a href="https://xmms2.org/wiki/Main_Page">xmms2</a> and had a bug-ridden, barren, and clunky web-based user interface. It was so lacking compared to my usual Amarok 1.4 experience,
yet somehow I could not give up the "shared jukebox" aspect. It was simply too fun to listen
to music together, regardless of the interface.  
</p>

<p>
So finally I decided to build the ultimate music player. It would still have a web-based
interface, but it would behave like a native application - both in feel and in responsiveness.
It should be nice enough that even when you want to listen alone it would still be your go-to
player of choice.
</p>

<h2 id="tech">Fumbling Around with Technology</h2>

<p>
In early 2011 I started investigating what technology to use to build this thing.
I knew that I wanted a backend which could decode many audio formats, do gapless playback,
and provide some kind of interface for a web server to control it.
</p>

<p>
I tinkered a bit with <a href="http://qt-project.org/">Qt</a> and the
<a href="http://phonon.kde.org/">Phonon</a> framework, but I didn't get as far as
having a web interface controlling it.
</p>

<p>
Eventually I stumbled upon <a href="http://www.musicpd.org/">Music Player Daemon</a>.
At the time this seemed like a perfect fit, especially since the XMMS2 wiki admitted
that if they had known that MPD existed when they started the project, they would
probably have just used it. MPD is a service - it has a config file which tells it,
among other things, the location of your music library, and then it runs in the
background, listening on a port (typically 6600), where you can issue commands via
<a href="http://www.musicpd.org/doc/protocol/">the protocol</a>
telling it to pause, play, skip, queue, unqueue, and all that jazz.
</p>

<p>
The first iteration of "PartyBeat2" was a small
<a href="https://wiki.python.org/moin/Python2orPython3">Python 3</a>
server which was merely a proxy between the client-side JavaScript code and MPD, as well
as a file server to serve the client-side HTML and JavaScript.
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/proxy-diagram.png">
<p>
At this point I had a
basic proof-of-concept. However, progress slowed for a few months as I embarked on a
<a href="http://andrewkelley.me/post/jmt.html">12-day hiking trip</a> followed immediately
by the first day of work at Amazon, my first out-of-college job.
</p>
<p>
After a short hiatus, I revisited the project. This was right when
<a href="http://socket.io/">socket.io</a> was getting a
lot of hype, and it seemed like the perfect fit for my design. Also I had just given
<a href="http://coffeescript.org/">Coffee-Script</a> a real chance after snubbing it
initially. So I ported over the proxy/file server to
<a href="http://nodejs.org/">Node.js</a> and got a prototype working:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/groove-basin-screenshot-1.png">
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/groove-basin-screenshot-2.png">
<p>
I even did some design drawings on paper:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/gb-plans.jpg">
<p>
A week of iterating later, I had the basics of a user interface, and a name:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/groove-basin-ui.png">
<p>
I named it <strong>Groove Basin</strong>, after the 
<a href="http://s3k.ocremix.org/php/media.php">Sonic the Hedgehog 3 Azure Lake Remix</a> by
<a href="https://soundcloud.com/rayza">Rayza</a>.
As homage to the original project, I picked a JQuery UI theme for the UI, except this time
I chose 
<a href="http://jqueryui.com/themeroller/#!zThemeParams=5d00000100ea05000000000000003d8888d844329a8dfe02723de3e5700bbb34ecf36ce5959f380e613cafa997b39424a52ffc947ae6386d03dcb468a5a7815c8be751cbeb85f52384d8af826a6f2d5c641d90f69837073c156b7dc24847588f6b14a9a8c7dc301e44abe007e94f6e6c92f078aab5be3c4abbf3879228b2a48115b05dacab0962bd9dd50fdaa46f85079285dec07c941276a7d79d1e2858c68ee7737077d76eb62904eeba7420d8e2f0756bb05a5cb873fc3f1db179474d8bde7707611e362f807baa12d96cf4a00cfea47a4a170c167797d3281d6ff47a7c6f852bfdaa702a5524648e1e61e406f77cc5169eec9274af71300eb525d128c2d8aba556d61caefe4b2abaf55ef4b2f70f1edaf2c6daaa85f8c304bc6d43d2433caea8364ea7276fff6617e84b89c13fa1ee02fcf6c8fa464132c7c76f30fb087543e0128a2043f4c04f39e0d649707cef843ee859101dc7411d8d8398b6653ee74859c3dd5b5721d07516d326cb7e0ffde75882a2710bc22f461c210884265e2956c093e42656b76ea338ff9792c14edc072c1415a7ef705f7e4de64cb0cc36a10be95c054a33d08e34e2f90860411c8268bce234650d33ad74d3bdb18cb0e4a1feeaadf1388700395929d0410402353eddf5f6c5ae12555c25e135ba103f22a28bddfef880b025c6b1fe7ff433">Dot Luv</a>.
</p>

<h2 id="growing">Growing Pains</h2>
<p>
Progress continued off and on over the period of about a year.
As the feature set grew larger and more solidified, the server which used to only be a
proxy and file server took on more and more responsibilities.
</p>
<p>
Dynamic Mode required that the server watch the main playlist and add songs to the end
and remove them from the beginning as playback continued from one track to the next.
</p>
<p>
<a href="http://last.fm">Last.fm</a> scrobbling required that the server send scrobbles
even when the client was not connected.
</p>
<p>
As the server took on more responsibilities, it began to make less and less sense for it
to directly proxy the MPD protocol for the client.
</p>
<p>
Meanwhile, I became unsatisfied with Coffee-Script. At the risk of digressing too much,
some of the issues I had with it were:
</p>
<h4>Error-prone variable scoping</h4>
<p>
If you accidentally name a local variable the same as one in an outer scope, you mutate
the value of the outer one
rather than shadowing it. For example if you <code>path = require('path')</code> and
then later innocently use <code>path</code> as the name of a local variable, you're
in for a wild ride:
</p>
<pre>
<code class="language-coffeescript">path = require('path')

# ...

makePathName = (song) -&gt;
  path = song.title
  if song.artist
    path = song.artist + '/' + path
  return path

# ...

basename = makePathName(title: 'hello')
path.join(musicDir, basename) # TypeError: Object hello has no method 'join'</code>
</pre>
<h4>Messy and inefficient output code</h4>
<p>
The JavaScript that Coffee-Script produces is frankly quite ugly. Specifically, it does
not reuse temporary variables, so you end up with <code>_len</code>, <code>_len1</code>,
<code>_len2</code>, and so on:
</p>

<pre>
<code class="language-coffeescript">arr = [1, 2, 3]
f = -&gt;
  for x in arr
    console.log x
  for x in arr
    console.log x
  for x in arr
    console.log x</code>
</pre>

<p>
Produces:
</p>

<pre>
<code class="language-javascript">(function() {
  var arr, f;

  arr = [1, 2, 3];

  f = function() {
    var x, _i, _j, _k, _len, _len1, _len2, _results;
    for (_i = 0, _len = arr.length; _i &lt; _len; _i++) {
      x = arr[_i];
      console.log(x);
    }
    for (_j = 0, _len1 = arr.length; _j &lt; _len1; _j++) {
      x = arr[_j];
      console.log(x);
    }
    _results = [];
    for (_k = 0, _len2 = arr.length; _k &lt; _len2; _k++) {
      x = arr[_k];
      _results.push(console.log(x));
    }
    return _results;
  };

}).call(this);</code>
</pre>

<p>
If you look closely at that output JavaScript code, you'll notice something even
more annoying. Every function returns a value unless you explicitly put a
<code>return</code> statement at the end. This can have surprising side effects.
In our example code, Coffee-Script decided to put the output of <code>console.log</code>
into an array. This is a
<a href="https://github.com/jashkenas/coffee-script/issues/2477">controversial feature</a>,
but it is not going to change.
</p>

<h4>Inability to declare functions</h4>
<p>
In Coffee-Script you cannot
declare a function; you can only assign an anonymous function to a variable. This
makes it impossible to do what I consider to be the
<a href="js-callback-organization.html">cleanest organization of callback code</a>.
</p>

<h3>Playing with coco</h3>
<p>
What I should have done at this time is gone back to plain old JavaScript. But I was
still seduced by the features that compile-to-js languages bring to the table. So
instead I switched the codebase over to <a href="https://github.com/satyr/coco">coco</a>.
This project
<a href="https://github.com/satyr/coco/wiki/wtfcs">solved some of Coffee-Script's problems</a>
including all the ones I listed above, and
<a href="https://github.com/satyr">satyr</a> seemed to have a better understanding of
compiler design than
<a href="https://github.com/jashkenas">jashkenas</a>
given that coco ran <a href="https://github.com/satyr/coco/wiki/improvements#size--speed">twice as fast</a>.
</p>
<p>
coco lasted about a year before I removed it. satyr started taking coco in a
pretty wild direction, adding things like <code>c = a-b</code> compiling to
<code>c = aB</code> rather than <code>c = a - b</code>. The syntax became so
complicated that if you made a typo in the source code, you had more of a chance
of introducing a subtle bug than of introducing a syntax error.
</p>
<p>
In the end, though, the biggest factor - and this goes for Coffee-Script, coco,
and any other compile-to-js language - is that it alienates possible contributors.
</p>
<p>
As I gained more experience
with Node.js, I realized that most developers used JavaScript directly instead
of a compile-to-js language.
By using a language that significantly fewer people were familiar
with, I made Groove Basin a less attractive project to contribute to.
</p>
<h2 id="rejecting-mpd">Rejecting MPD</h2>
<p>
All this compile-to-js stuff was meta work; it had no fundamental effect on
how well the music player performed. Meanwhile there lurked a more substantial
problem with the way Groove Basin was designed.
</p>
<p>
At first, MPD seemed like a great choice. It is in most popular Linux distributions'
package repositories, including those for the <a href="http://www.raspberrypi.org/">Raspberry Pi</a>.
It has been around long enough that multiple free iPhone and Android apps are available
to act as controllers. It can play most audio formats. But in the end, there are some
critical issues that prevent it from being the right choice for Groove Basin.
</p>
<h3>Demands control of your music library</h3>
<p>
The only way to play a song is to add it to the library, and then queue it. If you want
to implement your own music database and use MPD as a simple playlist for playback, you
still have to keep MPD's library up to date too.
</p>
<h3>ReplayGain support is laughable</h3>
<p>
MPD supports only APEv2 ReplayGain tags. This must be some kind of joke, because obviously
not every format supports APEv2 tags, and most ReplayGain scanners actually write to
ID3v2 tags. But more importantly, MPD misses the entire point. Relying on external
ReplayGain scanning makes the set of songs you can <em>play</em> different from the
set of songs you can <em>scan</em>. This leads to an inconsistent and rather unpleasant
listening experience. Further, relying on tags makes it impossible to store ReplayGain
data for songs in the library in a container format which does not support tags.
And finally... what the hell? Is the user supposed to set up their own cron job to scan
their music collection? How about the music player app does it for the user, silently,
in the background, no questions asked?
</p>
<h3>No tag editing</h3>
<p>
After demanding control of your music library, MPD provides no way to edit tags of a
song.
</p>
<p>
If you're going to read and write audio tags, the same library should be in charge of both.
Otherwise, like the ReplayGain problem I outlined earlier, you end up with discrepancies in
what you can read and write.
</p>
<h3>Protocol is severely limited and poorly designed</h3>
<p>
MPD is controlled via the <a href="http://musicpd.org/doc/protocol/">MPD Protocol</a>,
which can be used to drive playback and query information. For the most part, all the
information that you need is there. However, the protocol is massively inefficient.
<p>
For one example, consider the use case of a client which wants to keep an index of the
music database. That is, it wants to have an updated copy of all the music metadata, so
that it can do things such as queuing a random track, or allowing the user to quickly
search for a song. In this use case, the MPD client would have to request a copy of
the music database index, and then subscribe to a notification when the database changes.
Once that notification is sent, the client would then have to re-request the entire index
again, an operation which can be upwards of 3MB of data for a music library with 9000 songs.
A better behavior would be if the notification included a delta of exactly what changed,
so that the client could keep their copy updated without that massive payload.
</p>
<p>
This same problem exists with the main playlist - you have to request the entire playlist
instead of receiving a delta of what changed.
</p>
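<p>
For contrast, here is a sketch of what applying a delta-style notification could look like
on the client. The message shape (<code>updated</code> keyed by song id, <code>removed</code>
as a list of ids) is hypothetical, not part of any MPD protocol:
</p>

```javascript
// Apply a hypothetical delta message to a client's local copy of the
// music index, instead of re-downloading the whole index.
function applyDelta(index, delta) {
  for (const [id, song] of Object.entries(delta.updated || {})) {
    index[id] = song; // added or changed entries arrive in full
  }
  for (const id of delta.removed || []) {
    delete index[id]; // deletions carry only the key
  }
  return index;
}
```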
<p>
Another problem with the MPD protocol is that although it is intended to support multiple
concurrent users controlling the same server, it is riddled with race conditions.
</p>
<p>
For example, it packages multiple state updates into one. Consider the status update message:
</p>
<pre>
status
volume: 100
repeat: 0
random: 0
single: 0
consume: 1
playlist: 0
playlistlength: 23
xfade: 0
mixrampdb: 0.000000
mixrampdelay: nan
state: play
song: 10
songid: 0
nextsong: 11
nextsongid: 1
time: 69:224
elapsed: 68.985
bitrate: 192
audio: 44100:24:2
OK
</pre>
<p>
Now imagine that one user adjusts volume while another user toggles the repeat state.
Because volume and repeat state are sent in the same message, at least one user will
receive a status message with incorrect information before receiving a new one with
correct information. In practice, this means that the UI on clients will momentarily
display bogus state when things change which makes clients feel "glitchy".
</p>
<p>
As another example, audio files are indexed by filename. This means that if you rename a file,
every client which had a handle on that file now has an invalid handle. Even after you download
the entire new music library index, there is no way to tell which file got renamed.
</p>
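<p>
A sketch of the alternative: key tracks by a stable identifier and treat the filename as
ordinary mutable metadata, so a rename is just a field update and every client handle
survives it. The <code>Library</code> object here is invented for illustration; it is not
an API that MPD exposes:
</p>
<pre>
<code class="language-javascript">// Hypothetical library index keyed by a stable track id rather
// than by filename.
function Library() {
  this.tracks = {}; // id -> { id, path, title }
  this.nextId = 1;
}
Library.prototype.add = function (path, title) {
  var id = "t-" + this.nextId++;
  this.tracks[id] = { id: id, path: path, title: title };
  return id;
};
Library.prototype.rename = function (id, newPath) {
  var track = this.tracks[id];
  if (!track) throw new Error("unknown track " + id);
  track.path = newPath; // clients holding the id are unaffected
};</code>
</pre>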
<p>
One final example. Consider the use case where the user presses the volume up button several
times quickly.
</p>
<p>
The problem is that you receive an event saying that the status has changed. The only
reasonable response to this is to ask what the new status is. The new status tells us that the volume
is at 3. Now, there are 2 more messages that will arrive soon telling us that the new volume is
4 and then 5. But before they arrive, the UI is updated, the user sees a stale value, and when they
press volume up again, it starts from the stale base of 3 instead of 5.
Consider a simpler alternative which solves this problem. When the client sends a volume update,
the server accepts the message and then only notifies <em>other</em> clients that the volume changed.
</p>
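<p>
That alternative fits in a few lines. The <code>Server</code> object below is invented for
illustration, but it captures the rule: apply the change, then echo it only to the clients
that did not send it, so the sender's UI never jumps backwards:
</p>
<pre>
<code class="language-javascript">// Hypothetical server: broadcast state changes to everyone
// *except* the client that caused them.
function Server() {
  this.volume = 0;
  this.clients = [];
}
Server.prototype.connect = function (client) {
  this.clients.push(client);
};
Server.prototype.setVolume = function (sender, volume) {
  this.volume = volume;
  this.clients.forEach(function (client) {
    if (client !== sender) client.onVolume(volume);
  });
};</code>
</pre>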
<h3>Various audio playback glitches</h3>
<p>
MPD supports something called "stickers". These are simple pieces of data MPD
clients can add to tracks to use for their own purposes. Groove Basin took
advantage of stickers to store "last queued date" on each track in order to
implement its random song selection which favors songs you haven't heard recently.
<a href="https://github.com/thejoshwolfe">Josh</a> discovered that MPD apparently makes two
stupid decisions with regard to stickers:
</p>
<ol>
  <li>It uses the same thread for audio playback as it uses for updating sticker information.</li>
  <li>It somehow is so inefficient at updating sticker information that it causes audio
      playback to skip a few tenths of a second if you try to do it for upwards of 8 songs at once.
  </li>
</ol>
<p>
The end result here was that audio playback would glitch if you queued an album.
</p>
<p>
In addition, there were some basic audio playback issues. Sometimes after unpausing,
audio playback would stutter until the song ended. Sometimes the HTTP stream would
not send audio data quickly enough, so clients had to stop and buffer repeatedly,
and the only fix was disabling and re-enabling the audio output in MPD.
</p>
<p>
Now it is possible to submit patches to MPD to get these bugs fixed. But if I'm going to work on
that layer of the problem, why not put that effort toward a project which better fulfills
Groove Basin's goals?
</p>

<h3>We want more control over the HTTP audio stream</h3>
<p>
In the best case scenario, the music player server would know which of the connected browser clients
are actually streaming music. With MPD in control of the HTTP audio stream, we have no
access to or control over the HTTP request; we can only provide the URL. Also, it requires
running on a separate port from the main web interface, which is its own can of worms.
</p>
<p>
We also want more control over the audio stream. When a song is skipped, for example,
the best thing to do is flush the audio buffer so that the buffer can start filling up
with data from the new song. This is precisely what happens in media players that play
locally on your computer. However, with MPD this is not possible. When you skip a song
you still have to wait for the audio stream to catch up.
</p>
<p>
Having direct control over the HTTP audio stream also enables us to experiment with some
more creative ideas. For example, recently I updated the HTTP stream so that when a client connects,
it receives a burst of 200 KB of encoded audio, followed by a steady stream of exactly as
many bytes per second as the encoded audio contains.
This gives clients enough data to begin playback immediately. You may have noticed this
behavior when you watch a YouTube video - the buffering bar loads very quickly at first, but
then slows down to a crawl once playback begins.
</p>
<p>
In practice, I have observed the delay between connecting to the stream URL and playback beginning
to be anywhere from seemingly instantaneous to 300ms with this method, depending on latency and bandwidth.
Meanwhile, if you use MPD's HTTP audio stream, clients will take upwards of 10 seconds to buffer
audio before playback begins.
</p>

<h2 id="building-a-backend">Building a Music Player Backend</h2>
<p>
So, how hard could it be to build my own music player backend?
Seems like it would be a matter of solving these things:
</p>
<ul>
  <li>Use a robust library for audio decoding. How about the same one that VLC uses?</li>
  <li>Support adding and removing entries on a playlist for gapless playback.</li>
  <li>Support pause, play, and seek.</li>
  <li>Per-playlist-item gain adjustment so that perfect loudness compensation can be
      implemented.
  </li>
  <li>Support loudness scanning to make it easy to implement, for example, ReplayGain.</li>
  <li>Support playback to a sound device chosen at runtime.</li>
  <li>Support transcoding audio into another format so a player can implement, for example,
      HTTP streaming.
  </li>
  <li>Give raw access to decoded audio buffers just in case a player wants to do something
      other than one of the built-in things.
  </li>
  <li>
    Try to get other projects to use it to benefit from code reuse.
    <ul>
      <li>Make the API generic enough to support other music players and other use cases.</li>
      <li>Get it packaged into Debian and Ubuntu.</li>
      <li>Make a blog post about it to increase awareness.</li>
    </ul>
  </li>
</ul>
<p>
After reading up a little bit on the insane open-source soap-opera that was the forking of
<a href="http://www.libav.org/">libav</a> from
<a href="http://www.ffmpeg.org/">ffmpeg</a> (here are two sides to the story:
<a href="http://libav.org/about.html">libav side</a>,
<a href="http://blog.pkh.me/p/13-the-ffmpeg-libav-situation.html">ffmpeg side</a>), I went
with libav simply because it is what is in the
<a href="http://www.debian.org/">Debian</a> and
<a href="http://www.ubuntu.com/">Ubuntu</a> package managers, and one
of my goals is to get this music player backend into their package managers.
</p>
<p>
Several iterations later, I now have
<a href="https://github.com/andrewrk/libgroove">libgroove</a>,
a C library with what I think is a
<a href="https://github.com/andrewrk/libgroove/blob/3.0.7/groove/groove.h">pretty solid API</a>.
How it works:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/libgroove-diagram.png">
<p>
The API user creates a GroovePlaylist which spawns its own thread and is
responsible for decoding audio. The user adds and removes items at will from
this playlist. They can also call pause, play, and seek on the playlist.
As the playlist decodes audio, where does the decoded audio go?
This is where those sinks come in.
</p>
<p>
A <strong>sink</strong> is named after the real-life sink that you would find
in a bathroom or kitchen. Sinks can fill up with water, and unless the water
is drained the sink will continue to fill until it overflows.
Likewise, in audio processing, a sink is an object which collects audio buffers
in a queue.
</p>
<p>
In libgroove, decoded audio is stored in reference-counted buffer objects
and passed to each connected sink. Each sink does whatever processing it needs to
do and then calls "unref" on the buffer. Typically each sink will have its own
thread which hungrily waits for buffers and devours them as fast as possible.
However, the playlist is also decoding audio as fast as possible and pushing
it onto each sink's queue, so it is quite possible for a sink's queue to fill up
faster than the sink can drain it.
When the playlist discovers that all its
sinks are full, it puts its thread to sleep, waiting to be woken up by a sink
which has drained enough. 
</p>
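<p>
The same backpressure scheme can be sketched with a bounded queue whose producer waits
whenever the queue is full. This is an illustration only - libgroove implements it in C
with its own threading, and none of these names are part of its API - but the
sleep-until-drained behavior is the same:
</p>
<pre>
<code class="language-javascript">// Illustrative bounded queue: push() resolves immediately while
// there is room, and otherwise waits until shift() frees a slot -
// the analogue of the decoder thread going to sleep.
class BoundedQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
    this.waiters = []; // producers blocked on a full queue
  }
  async push(item) {
    while (this.items.length >= this.capacity) {
      await new Promise((resolve) => this.waiters.push(resolve));
    }
    this.items.push(item);
  }
  shift() {
    const item = this.items.shift();
    const waiter = this.waiters.shift();
    if (waiter) waiter(); // wake one sleeping producer
    return item;
  }
}</code>
</pre>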
<p>
libgroove provides some higher-level sink types in addition to the basic sink.
Each higher level sink runs in its own thread and is built using the basic sink.
These include:
</p>
<ul>
  <li><strong>playback sink</strong> - opens a sound device and
  sends the decoded audio to it. This sink fills up with <em>events</em>
  that signal when the sink has started playing the next track, or when
  a buffer underflow occurs.
  </li>
  <li><strong>encoder sink</strong> - encodes the audio buffers it receives
  and fills up with encoded audio buffers. These encoded buffers can then
  be written to a file or streamed over the network, for example.
  </li>
  <li><strong>loudness scanner sink</strong> - uses the EBU R 128 standard to 
  detect loudness. This sink fills up with information about each track, including
  loudness, peak, and duration.
  </li>
</ul>
<p>
The API is designed carefully such that even though the primary use case is for a music player backend,
libgroove can be used for other use cases, such as transcoding audio, editing tags, or ReplayGain scanning.
Here is an example of using libgroove for a simple transcoding command-line application:
</p>
<pre>
<code class="language-c">/* transcode one or more files into one output file */

#include &lt;groove/groove.h&gt;
#include &lt;groove/encoder.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;stdlib.h&gt;

static int usage(char *arg0) {
    fprintf(stderr, "Usage: %s file1 [file2 ...] --output outputfile [--bitrate 320] [--format name] [--codec name] [--mime mimetype]\n", arg0);
    return 1;
}

int main(int argc, char * argv[]) {
    // arg parsing
    int bit_rate_k = 320;
    char *format = NULL;
    char *codec = NULL;
    char *mime = NULL;

    char *output_file_name = NULL;

    groove_init();
    atexit(groove_finish);
    groove_set_logging(GROOVE_LOG_INFO);
    struct GroovePlaylist *playlist = groove_playlist_create();

    for (int i = 1; i < argc; i += 1) {
        char *arg = argv[i];
        if (arg[0] == '-' && arg[1] == '-') {
            arg += 2;
            if (i + 1 >= argc) {
                return usage(argv[0]);
            } else if (strcmp(arg, "bitrate") == 0) {
                bit_rate_k = atoi(argv[++i]);
            } else if (strcmp(arg, "format") == 0) {
                format = argv[++i];
            } else if (strcmp(arg, "codec") == 0) {
                codec = argv[++i];
            } else if (strcmp(arg, "mime") == 0) {
                mime = argv[++i];
            } else if (strcmp(arg, "output") == 0) {
                output_file_name = argv[++i];
            } else {
                return usage(argv[0]);
            }
        } else {
            struct GrooveFile *file = groove_file_open(arg);
            if (!file) {
                fprintf(stderr, "Error opening input file %s\n", arg);
                return 1;
            }
            groove_playlist_insert(playlist, file, 1.0, NULL);
        }
    }
    if (!output_file_name)
        return usage(argv[0]);

    struct GrooveEncoder *encoder = groove_encoder_create();
    encoder-&gt;bit_rate = bit_rate_k * 1000;
    encoder-&gt;format_short_name = format;
    encoder-&gt;codec_short_name = codec;
    encoder-&gt;filename = output_file_name;
    encoder-&gt;mime_type = mime;
    if (groove_playlist_count(playlist) == 1) {
        groove_file_audio_format(playlist-&gt;head-&gt;file, &amp;encoder-&gt;target_audio_format);

        // copy metadata
        struct GrooveTag *tag = NULL;
        while((tag = groove_file_metadata_get(playlist-&gt;head-&gt;file, "", tag, 0))) {
            groove_encoder_metadata_set(encoder, groove_tag_key(tag), groove_tag_value(tag), 0);
        }
    }

    if (groove_encoder_attach(encoder, playlist) < 0) {
        fprintf(stderr, "error attaching encoder\n");
        return 1;
    }

    FILE *f = fopen(output_file_name, "wb");
    if (!f) {
        fprintf(stderr, "Error opening output file %s\n", output_file_name);
        return 1;
    }

    struct GrooveBuffer *buffer;

    while (groove_encoder_buffer_get(encoder, &amp;buffer, 1) == GROOVE_BUFFER_YES) {
        fwrite(buffer-&gt;data[0], 1, buffer-&gt;size, f);
        groove_buffer_unref(buffer);
    }

    fclose(f);

    groove_encoder_detach(encoder);
    groove_encoder_destroy(encoder);

    struct GroovePlaylistItem *item = playlist-&gt;head;
    while (item) {
        struct GrooveFile *file = item-&gt;file;
        struct GroovePlaylistItem *next = item-&gt;next;
        groove_playlist_remove(playlist, item);
        groove_file_close(file);
        item = next;
    }
    groove_playlist_destroy(playlist);

    return 0;
}</code>
</pre>
<p>
Note that this code contains no threading. Even so, because of the way libgroove is designed, when this
app is run, one thread will work on decoding the audio while the main thread seen in this code will
work on writing the encoded buffers to disk.
</p>
<p>
Once I had this backend built, I needed to use it in Groove Basin, which you may recall is a Node.js app.
To do this I built a native add-on node module called <a href="https://www.npmjs.org/package/groove">groove</a>.
It uses <a href="https://github.com/joyent/libuv">libuv</a> and <a href="https://code.google.com/p/v8/">v8</a>
to interface between C and Node.js. I wrote the majority of this code at
<a href="https://www.hackerschool.com/">Hacker School</a>, an experience which I highly recommend.
</p>
<p>
With the groove node module complete, the new architecture looked like this:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/groovebasin-design-libgroove.png">
<p>
No longer did Groove Basin need to run a third party server to make everything work -
just a single Node.js application with the correct libraries installed. And now I was in control of the 
audio backend code which meant that I had the power to make everything work exactly like I wanted it to.
</p>
<h2 id="packaging">Packaging</h2>
<p>
Nothing turns away potential users faster than a cumbersome install process. I knew that I had to make
Groove Basin easy to install, so I took several steps to make it so.
</p>
<p>
One thing I did to make libgroove easy to install is bundle some of the harder to find dependencies along
with it. Specifically, libav10, libebur128, and SDL2. This way if the user is on a computer which does not
have those packages readily available, they may still install libgroove.
</p>
<p>
This convenience is less desirable than relying on existing system dependencies, however, so if the
configure script detects system libraries, it happily prefers them.
</p>
<p>
Next, I made a <a href="https://launchpad.net/~andrewrk/+archive/libgroove/">libgroove PPA</a>
for Ubuntu users. This makes installing libgroove as easy as:
</p>
<pre>
sudo apt-add-repository ppa:andrewrk/libgroove
sudo apt-get update
sudo apt-get install libgroove-dev libgrooveplayer-dev libgrooveloudness-dev
</pre>
<p>
Then I <a href="http://lists.alioth.debian.org/pipermail/pkg-multimedia-maintainers/2014-February/036583.html">joined</a> the <a href="https://wiki.debian.org/DebianMultimedia">Debian multimedia packaging team</a>.
This team is dedicated to making Debian a good platform for audio and multimedia work.
They kindly accepted me and coached me while I worked on packaging up libebur128 and libgroove for Debian.
After a few back and forths, a
<a href="https://packages.debian.org/source/jessie/libebur128">libebur128 Debian package</a> is ready to be installed
from <a href="http://www.debian.org/releases/testing/">testing</a>, and a
<a href="https://packages.debian.org/source/experimental/libgroove">libgroove Debian package</a>
can be installed from experimental. Once the
<a href="https://release.debian.org/transitions/html/libav10.html">libav10 transition</a> is complete,
libgroove can be submitted to
<a href="http://www.debian.org/releases/sid/">unstable</a>, where it will move into
testing, and then finally be released to all of Debian!
</p>
<p>
After a few more months of progress, I'd like to package up Groove Basin itself. This way, the entire
installation process could be just an <code>apt-get install</code> away.
</p>
<h2 id="libav-contributions">Contributions to libav</h2>
<p>
While working on libgroove several issues came up which led me to contribute code to
<a href="http://libav.org">libav</a>. The first thing I noticed is that if you asked
libav for an encoder based on the .ogg extension, by default it would use the FLAC
encoder. While it is true that .ogg files can contain FLAC audio, the
<a href="http://xiph.org/">Xiph.org</a> Foundation
<a href="http://wiki.xiph.org/index.php/MIME_Types_and_File_Extensions">recommends</a>
that .ogg only be used for Ogg Vorbis audio files.
</p>
<p>
Unfortunately, the built-in vorbis encoder is considered not to meet quality standards
so the
<a href="http://git.libav.org/?p=libav.git;a=commitdiff;h=b0c2c097e422b9e10a7d856582f8321d28af821e">compromise</a>
that we ended up implementing is to default .ogg to vorbis when libvorbis
is available, and default to FLAC when it is not. Fortunately for Debian and Ubuntu users,
the libav package that is available in the repository does in fact depend on libvorbis.
</p>
<p>
Another time I had to dig into libav code was when I discovered that some of my .wma songs would have
<a href="https://bugzilla.libav.org/show_bug.cgi?id=567">broken, glitchy playback if you seek to the beginning</a>.
Seeking to any other location seemed to work fine.
When I investigated this behavior, I noticed that it occurred in avplay, the command line audio playback tool
that ships with libav. I proposed
<a href="http://git.libav.org/?p=libav.git;a=commitdiff;h=0c082565965258dca143767cc6cb25e38b6e9ea3">a patch</a>
which short-circuited seeking to 0 and skipped all the complicated seeking code.
Janne Grunau not only committed my patch but he took the opportunity to revisit the ASF seeking code, tidy it up,
and fix a bunch of issues.
</p>
<p>
I'm pretty happy about this patch landing in libav, as it makes playable a whole set of my songs
that previously could not be played. A pretty important "feature" for a music player.
</p>
<p>
Finally, while investigating the best way to implement loudness compensation, I realized that simply adjusting
the volume on an audio stream is not enough. If the song is so quiet that the amount we would have to turn
the volume gain up to exceeds 1.0, we would end up with clipping. The solution to this is to detect whether
it is possible that the song could clip and if so, use a compressor instead of a simple volume gain:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/compressor.svg">
<p>
A compressor (a limiter is just a compressor with an extreme ratio) allows us to turn the volume up without clipping. The tradeoff is
that it distorts the audio signal. This is why we prefer a simple volume gain, but fall back to a compressor
if we need to turn the volume up high enough. This is what VLC does when you turn the volume up past 100% into
the red.
</p>
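<p>
The decision itself is simple arithmetic. Given a track's measured loudness, its linear
peak (which the loudness scanner sink reports), and a target loudness, you check whether
the required gain would push the peak past full scale. This helper is a sketch of the
idea, not libgroove's actual API:
</p>
<pre>
<code class="language-javascript">// Sketch of the gain-vs-compressor decision. loudnessDb is the
// track's measured loudness, peak its linear peak (1.0 = full
// scale), and targetDb the loudness we want to normalize to.
function choosePlaybackFilter(loudnessDb, peak, targetDb) {
  var gainDb = targetDb - loudnessDb;
  var gainLinear = Math.pow(10, gainDb / 20);
  if (peak * gainLinear > 1.0) {
    // a plain gain would clip, so use a compressor instead
    return { filter: "compand", gainLinear: gainLinear };
  }
  return { filter: "volume", gainLinear: gainLinear };
}</code>
</pre>
<p>
For example, a quiet track at -36 with a peak of 0.5 would need +13 dB (a factor of about
4.5) to reach a -23 target, which would clip, so it gets the compressor; a loud track at
-15 only needs attenuation, so a plain volume gain is safe.
</p>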
<p>
In order to have this functionality, I
<a href="http://git.libav.org/?p=libav.git;a=commitdiff;h=ba21499648bbffc5518a41dc01a51449b9871088">ported</a>
the "compand" audio filter from
<a href="http://ffmpeg.org">ffmpeg</a>.
Now libgroove has the ability to turn the volume up beyond 100% like VLC, although I don't recommend doing it.
It's much better for sound quality to turn the gain up on your physical speakers.
</p>
<h2 id="conclusion">Conclusion</h2>
<p>
Three years and six months from <code>git init</code>, Groove Basin is still under active development.
Here's what the UI looks like today:
</p>
<img src="http://superjoe.s3.amazonaws.com/blog-files/quest-build-ultimate-music-player/groovebasin-screenshot-latest.png">
<p>
Some of the features that it provides are:
</p>
<ul>
  <li>
    Fast, responsive UI. It feels like a desktop app, not a web app.
  </li>
  <li>
    Dynamic playlist mode which automatically queues random songs, favoring
    songs that have not been queued recently.
  </li>
  <li>
    Drag and drop upload. Drag and drop playlist editing. Rich keyboard
    shortcuts.
  </li>
  <li>
    Lazy multi-core
    <a href="http://tech.ebu.ch/loudness">EBU R128 loudness scanning</a>
    (tags compatible with
    <a href="http://wiki.hydrogenaudio.org/index.php?title=ReplayGain_1.0_specification">ReplayGain</a>)
    and automatic switching between track and album mode.
    <a href="http://www.youtube.com/watch?v=iuEtQqC-Sqo">"Loudness Zen"</a>
  </li>
  <li>
    Streaming support. You can listen to your music library - or share it
    with your friends - even when you are not physically near your home
    speakers.
  </li>
  <li>
    MPD protocol support. This means you already have a selection of
    <a href="http://mpd.wikia.com/wiki/Clients">clients</a>
    which integrate with Groove Basin.
    For example
    <a href="https://github.com/abarisain/dmix">MPDroid</a>.
  </li>
  <li>
    <a href="http://www.last.fm/">Last.fm</a> scrobbling.
  </li>
  <li>
    File system monitoring. Add songs anywhere inside your music directory
    and
    they instantly appear in your library in real time.
  </li>
  <li>
    Supports GrooveBasin Protocol on the same port as MPD Protocol - use the
    `protocolupgrade` command to upgrade.
  </li>
</ul>
<p>
If you like, you can try out the web interface client of Groove Basin on the
<a href="http://demo.groovebasin.com/">live demo</a> site.
It will probably be chaotic and unresponsive if there is a fair amount of traffic to this blog post,
as it's not designed for a large number of anonymous people to use it together; it's more for groups of
10 or fewer people who actually know each other in person.
</p>
<p>
The roadmap moving forward looks like this:
</p>
<ol>
  <li>Tag Editing</li>
  <li>Music library organization</li>
  <li>AcoustID Integration</li>
  <li>Playlists</li>
  <li>User accounts / permissions rehaul</li>
  <li>Event history / chat</li>
  <li>Finalize GrooveBasin protocol spec</li>
</ol>
<p>
Groove Basin still has lots of <a href="https://github.com/andrewrk/groovebasin/issues">issues</a>
but it's already a solid music player and it's only improving over time.
</p>
<p>
At some point I plan to write a tutorial article detailing exactly how to get this application
running on a <a href="http://www.raspberrypi.org/">Raspberry Pi</a>. It's mostly straightforward
but there are enough "gotchas" here and there that I think it could be a useful article.
</p>
<p>
<em>(update 2014-June-19)</em> I have now written this article:
<a href="raspberry-pi-music-player-server.html">Turn Your Raspberry Pi into a Music Player Server</a>
</p>
<p>
Feel free to star or watch the
<a href="https://github.com/andrewrk/groovebasin">Groove Basin GitHub repository</a>
if you want to keep track of progress.
</p>
]]></description>
      </item>
      <item>
         <title>Do Not Use bodyParser with Express.js</title>
         <pubDate>Fri, 06 Sep 2013 22:37:07 GMT</pubDate>

         <link>https://andrewkelley.me/post/do-not-use-bodyparser-with-express-js.html</link>
         <guid>https://andrewkelley.me/post/do-not-use-bodyparser-with-express-js.html</guid>
         <description><![CDATA[<h1>Do Not Use bodyParser with Express.js</h1>
<p>
Note: this post has
<a href="https://github.com/andrewrk/andrewkelley.me/commit/a5cbf4a7815391dbcc57d6f30f8d330a67487167">been edited</a>
to take into account
<a href="https://github.com/visionmedia/">TJ</a>'s
<a href="https://groups.google.com/forum/#!topic/express-js/iP2VyhkypHo">diligent work</a>
in response to this.
</p>
<p>
I came across
<a href="https://plus.google.com/106706438172517329683/posts/4kQiD8L1D36">this Google+ post</a>
mentioning
<a href="http://stackoverflow.com/questions/14612143/node-js-express-framework-security-issues">this StackOverflow post</a>
in which someone is quite wisely asking whether the
<a href="http://expressjs.com/">express.js framework</a> is secure
enough to use for production applications.
</p>
<p>
This reminds me of one "gotcha" in particular that you could be bitten by if
you're not careful.
</p>
<p>
All servers using
<a href="http://expressjs.com/api.html#bodyParser">express.bodyParser</a>
are vulnerable to an attack which creates an unlimited number of temp files
on the server, potentially filling up all the disk space, which is likely
to cause the server to hang.
</p>
<h2>Demonstration</h2>
<p>
This problem is extremely easy to demonstrate. Here's a simple express app:
</p>
<pre>
<code class="language-javascript">var express = require('express');
var app = express();

app.use(express.bodyParser());
app.post('/test', function(req, resp) {
  resp.send('ok');
});

app.listen(9001);</code>
</pre>
<p>
Seems pretty innocuous right?
</p>
<p>
Now check how many temp files you have with something like this:
</p>
<pre>
$ ls /tmp | wc -l
33
</pre>
<p>
Next simulate uploading a multipart form:
</p>
<pre>
$ curl -X POST -F foo=@tmp/somefile.c http://localhost:9001/test
ok
</pre>
<p>
Go back and check our temp file count:
</p>
<pre>
$ ls /tmp | wc -l
34
</pre>
<p>
That's a problem.
</p>
<h2>Solutions</h2>
<h3>Always delete the temp files when you use bodyParser or multipart middleware</h3>
<p>
  You can prevent this attack by always checking whether <code>req.files</code>
  is present for endpoints in which you use
  <code>bodyParser</code> or <code>multipart</code>, and then
  deleting the temp files. Note that this
  is <em>every POST endpoint</em> if you did something like
  <code>app.use(express.bodyParser())</code>.
</p>
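<p>
The cleanup itself looks something like this. It is a sketch using only Node's standard
library; the <code>files[key].path</code> shape follows what bodyParser's multipart
middleware produced at the time, so treat the field layout as an assumption:
</p>
<pre>
<code class="language-javascript">var fs = require('fs');

// Delete every temp file that a multipart upload left on disk.
// You would call this in each POST endpoint that uses bodyParser.
function cleanupTempFiles(files, callback) {
  if (!files) return callback(null);
  var paths = Object.keys(files).map(function (key) {
    return files[key].path; // where the middleware wrote the upload
  });
  var pending = paths.length;
  var done = false;
  if (pending === 0) return callback(null);
  paths.forEach(function (p) {
    fs.unlink(p, function (err) {
      if (done) return;
      if (err) {
        // a file that is already gone is fine; report anything else
        if (err.code !== 'ENOENT') { done = true; return callback(err); }
      }
      if (--pending === 0) { done = true; callback(null); }
    });
  });
}</code>
</pre>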
<p>
This is suboptimal for several reasons:
</p>
<ol>
  <li>It is too easy to forget to do these checks.</li>
  <li>It requires a bunch of ugly cleanup code. Why have code when you could not have code?</li>
  <li>For every POST endpoint on which you use bodyParser, your server is still processing
    every multipart upload that comes its way: creating a temp file, writing it to disk,
    and then deleting it. Why do all that when you don't want to accept uploads?
  </li>
  <li>
  As of express 3.4.0 (connect 2.9.0) bodyParser is deprecated.
  It goes without saying that deprecated things should be avoided.
  </li>
</ol>
<h3>Use a utility such as tmpwatch or reap</h3>
<p>
<a href="https://github.com/jfromaniello">jfromaniello</a>
<a href="https://groups.google.com/d/msg/nodejs/6KOlfk5cpcM/SCJ9jZZfP-UJ">pointed out</a>
that using a utility such as
<a href="http://linux.die.net/man/8/tmpwatch">tmpwatch</a>
can help with this issue.
The idea here is to, for example, schedule tmpwatch as a cron job.
It would remove temp files that have not been accessed in a long enough
period of time.
</p>
<p>
It's usually a good idea to do this for all servers, just in case.
But relying on this to clean up bodyParser's mess still suffers from issue #3
outlined above. Plus, server hard drives are often small, especially when you
didn't realize you were going to have temp files in the first place.
</p>
<p>
If you ran your cron job every 8 hours, for instance, given an HDD with 4 GB
of free space, an attacker would need an Internet connection with
145 KB/s upload bandwidth to crash your server.
</p>
<p>
TJ pointed out that he also has a utility for this purpose called
<a href="https://github.com/visionmedia/reap">reap</a>.
</p>
<h3>Avoid bodyParser and explicitly use the middleware that you need</h3>
<p>
If you want to parse json in your endpoint, use <code>express.json()</code> middleware.
If you want an endpoint that accepts both json and urlencoded bodies, use <code>[express.json(), express.urlencoded()]</code>
for your middleware.
</p>
<p>
If you want users to upload files to your endpoint, you could use <code>express.multipart()</code> and be
sure to clean up all the temp files that are created.
This would still suffer from problem #3 previously mentioned.
</p>
<h3>Use the defer option in the multipart middleware</h3>
<p>
When you create your multipart middleware, you can use the <code>defer</code>
option like this:
</p>
<pre>
<code class="language-javascript">express.multipart({defer: true})</code>
</pre>
<p>
According to the documentation:
</p>
<blockquote>
  defers processing and exposes the multiparty form object as `req.form`.<br>
  `next()` is called without waiting for the form's "end" event.<br>
  This option is useful if you need to bind to the "progress" or "part" events, for example.<br>
</blockquote>
<p>
So if you do this, you will use <a href="https://github.com/andrewrk/node-multiparty/blob/master/README.md#api">multiparty's API</a> directly, with <code>req.form</code>
being an instantiated <code>Form</code> instance.
</p>
<h3>Use an upload parsing module directly</h3>
<p>
<code>bodyParser</code> depends on <code>multipart</code>, which behind the
scenes uses 
<a href="https://github.com/andrewrk/node-multiparty">multiparty</a> to
parse uploads.
</p>
<p>
You can use this module directly to handle the request. In this case you can
look at 
<a href="https://github.com/andrewrk/node-multiparty/blob/master/README.md#api">multiparty's API</a>
and do the right thing.
</p>
<p>
There are also alternatives such as
<a href="https://github.com/mscdex/busboy">busboy</a>,
<a href="https://github.com/chjj/parted">parted</a>,
and
<a href="https://github.com/felixge/node-formidable">formidable</a>.
</p>
]]></description>
      </item>
      <item>
         <title>JavaScript Callbacks are Pretty Okay</title>
         <pubDate>Sat, 17 Aug 2013 04:41:14 GMT</pubDate>

         <link>https://andrewkelley.me/post/js-callback-organization.html</link>
         <guid>https://andrewkelley.me/post/js-callback-organization.html</guid>
         <description><![CDATA[<h1>JavaScript Callbacks are Pretty Okay</h1>
<p>
I've seen a fair amount of
<a href="http://tirania.org/blog/archive/2013/Aug-15.html">callback bashing</a>
on 
<a href="https://news.ycombinator.com/">Hacker News</a> recently.
</p>
<p>
Among the many proposed solutions, one of them strikes me as
particularly clean: "asynchronous wait" or "wait and defer".
<a href="http://maxtaco.github.io/coffee-script/">iced-coffee-script</a> had 
this a while ago.
<a href="http://rzimmerman.github.io/kal/">Kal</a> just debuted with an
identical solution, only differing in syntax.
Supposedly <a href="http://livescript.net/#backcalls">LiveScript supports this with "backcalls"</a>.
</p>
<p>
I must say, "asynchronous wait" or "monads" - whatever you want to call it -
seems like an improvement over callbacks.
But I'm here to say that actually... callbacks are pretty okay.
Further, given how clean callbacks can be, the downsides of using
a compile-to-js language often outweigh the benefits.
</p>
<p>
I have 2 rules of thumb for code organization which makes clean
callback based async code brainless:
</p>
<ol>
  <li>Avoid nontrivial anonymous functions.</li>
  <li>Put all function declarations <em>after</em> the code that actually does things.</li>
</ol>
<h3>Example</h3>
<p>
compile-to-js languages often show examples of deeply nested callback code
to show how it can be refactored in the language. Let's take one from
<a href="http://rzimmerman.github.io/kal/">Kal</a>:
</p>
<pre>
<code class="language-javascript">function getUserFriends(userName, next) {
    db.users.findOne({name:userName}, function (err, user) {
        if (err != null) return next(err);
        db.friends.find({userId:user.id}, function (err, friends) {
            if (err != null) return next(err);
            return next(null, friends);
        });
    });
}</code>
</pre>
<p>
Yikes, that does look a bit nested. But let's apply the first rule and un-nest
both of those anonymous functions.
</p>
<pre>
<code class="language-javascript">function getUserFriends(userName, next) {
    db.users.findOne({name:userName}, foundOne);

    function foundOne(err, user) {
        if (err != null) return next(err);
        db.friends.find({userId:user.id}, foundFriends);
    }

    function foundFriends(err, friends) {
        if (err != null) return next(err);
        return next(null, friends);
    }
}</code>
</pre>
<p>
It's actually longer now, but it's much easier to parse.
When you want to learn what any given function does, you only have to understand
1-2 lines.
For example, <code>getUserFriends</code> really only has 1 line which is the
<code>findOne</code> part.
The rest is a list of function declarations.
Next you probably want to learn what the <code>foundOne</code> function does,
so you jump to it and only have to read 2 lines to know what it does.
Finally you probably want to learn what the <code>foundFriends</code> function
does, so you jump to it, and again, only have to read 2 lines.
</p>
<h3>Example</h3>
<p>
Here's another one taken from Kal (sorry to pick on you
<a href="https://github.com/rzimmerman">rzimmerman</a>; you have good examples):
</p>
<pre>
<code class="language-javascript">var async = require('async');

var getUserFriends = function (userName, next) {
    db.users.findOne({name:userName}, function (err, user) {
        if (err != null) return next(err);
        getFriendsById(user.id, function (err, friends) {
            if (err != null) return next(err);
            if (user.type == 'power user') {
                async.map(friends, getFriendsById, function (err, friendsOfFriends) {
                    for (var i = 0; i &lt; friendsOfFriends.length; i++) {
                        for (var j = 0; j &lt; friendsOfFriends[i].length; j++) {
                            if (friends.indexOf(friendsOfFriends[i][j]) != -1) {
                                friends.push(friendsOfFriends[i][j]);
                            }
                        }
                    }
                    return next(null, friends);
                });
            } else {
                return next(null, friends);
            }
        });
    });
}
var getFriendsById = function (userId, next) {
    db.friends.find({userId:userId}, function (err, friends) {
        if (err != null) return next(err);
        return next(null, friends);
    });
}</code>
</pre>
<p>
Yep, that is an eyesore. Let's see what the 2 rules do.
</p>
<pre>
<code class="language-javascript">var async = require('async');

function getUserFriends(userName, next) {
    db.users.findOne({name:userName}, foundUser);

    function foundUser(err, user) {
        if (err != null) return next(err);
        getFriendsById(user.id, gotFriends);

        function gotFriends(err, friends) {
            if (err != null) return next(err);
            if (user.type == 'power user') {
                async.map(friends, getFriendsById, wtfFriendAction);
            } else {
                return next(null, friends);
            }

            function wtfFriendAction(err, friendsOfFriends) {
                for (var i = 0; i &lt; friendsOfFriends.length; i++) {
                    for (var j = 0; j &lt; friendsOfFriends[i].length; j++) {
                        if (friends.indexOf(friendsOfFriends[i][j]) != -1) {
                            friends.push(friendsOfFriends[i][j]);
                        }
                    }
                }
                return next(null, friends);
            }
        }
    }
}

function getFriendsById(userId, next) {
    db.friends.find({userId:userId}, function (err, friends) {
        if (err != null) return next(err);
        return next(null, friends);
    });
}</code>
</pre>
<p>
Okay, while refactoring that code I realized it made no sense, hence
my naming of <code>wtfFriendAction</code>. But let's run with it.
</p>
<p>
In this refactored code, we've only reduced the maximum nesting by 1 - from 8 to 7.
But consider how much easier it is to follow.
When you look at any given function, there are a few lines of
synchronous code, followed by function declarations.
Exception - I left <code>getFriendsById</code> alone since it is so short.
</p>
<h3 id="compile-to-js">Quick note on the downsides of compile-to-js languages</h3>
<p>First to note - Coffee-Script actually <em>prohibits</em> this kind of code
organization, because all functions are necessarily assignments.
Other compile-to-js languages solve this problem by providing function
declarations.
But all compile-to-js languages have some fundamental problems.
</p>
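<p>
(A quick illustration of why this matters: function declarations are hoisted,
so the code that actually does things can come first and still call functions
declared below it. A function assigned to a variable - the only form
Coffee-Script can emit - must appear before its first use. The names below are
made up for the sake of the example.)
</p>
<pre>
<code class="language-javascript">// works: declarations are hoisted, so the "action" line can come first
fetchGreeting("world", function (err, greeting) {
  console.log(greeting); // "hello, world"
});

function fetchGreeting(name, next) {
  // the callback is declared after the code that uses it, per rule 2
  process.nextTick(buildGreeting);

  function buildGreeting() {
    next(null, "hello, " + name);
  }
}

// fails: a function expression is not hoisted, only its var name is
//   doStuff();                      // TypeError: doStuff is not a function
//   var doStuff = function () {};</code>
</pre>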
<p>
For one, you increase the barrier to contributions to your code.
People are a bazillion times more likely to create a pull request if they
already know the language your module or app is written in.
When you pick an obscure language to write your code in, you alienate
a large number of potential contributors.
</p>
<p>
It gets worse. Most people, when evaluating a module or app, will
quickly scan the source code to see if the implementation looks reasonable.
The depth of analysis may not be too great; people are looking for
obvious problems.
When they see that it's written in another language, it makes the codebase
seem foreign; possibly even untrusted. At the very least it hampers their
ability to judge quality.
</p>
<p>
Finally, it means adding a build step to your code. Often this is not a big deal;
you may already have a build step. But it is an additional moving part in your
project that must be understood by you and any potential contributors, or even
potential users.
</p>
<p>
I probably sound like one of those grumps who scoffed at any programming language
higher-level than assembly.
But I actually really got into Coffee-Script for a while, then switched over
to <a href="https://github.com/satyr/coco/">coco</a> due to it solving some
problems better. I have a <a href="https://github.com/andrewrk/groovebasin">nontrivial music player app</a> written in coco. It was first in JavaScript, then Coffee-Script, then coco.
But I've converted back to pure JavaScript in the active working branch.
The first version of <a href="https://github.com/andrewrk/naught">naught</a>
was written in coco but that's now JavaScript as well.
I learned the hard way about some of these tradeoffs.
</p>
<h3>Conclusion</h3>
<p>
I've outlined 2 simple rules for callback organization that I think make writing
async code in pure JavaScript more than adequate.
To review:
</p>
<ol>
  <li>Avoid nontrivial anonymous functions.</li>
  <li>Function declarations <em>after</em> the code that actually does things.</li>
</ol>
<p>
Both principles aim for the same goal:
A reader of your code should be able to look at the code that actually does
things synchronously all together.
This means placing all the stuff that happens later at the end.
</p>
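<p>
As a sketch, the shape that both rules produce looks like this (the
<code>stepOne</code> and <code>stepTwo</code> helpers are made-up stand-ins
for real async calls like the <code>db</code> methods above):
</p>
<pre>
<code class="language-javascript">// stand-ins for real async I/O, just for illustration
function stepOne(x, next) {
  process.nextTick(function () { next(null, x + 1); });
}
function stepTwo(x, next) {
  process.nextTick(function () { next(null, x * 2); });
}

function doAsyncTask(input, next) {
  // the code that actually does things, all together at the top
  stepOne(input, doneStepOne);

  // everything that happens later, declared below
  function doneStepOne(err, result) {
    if (err != null) return next(err);
    stepTwo(result, doneStepTwo);
  }

  function doneStepTwo(err, result) {
    if (err != null) return next(err);
    return next(null, result);
  }
}

doAsyncTask(3, function (err, result) {
  console.log(result); // 8
});</code>
</pre>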
<p>
I'm not one to hinder progress in the world of programming languages,
but I thought I'd share my perspective on why I still write pure JavaScript
for Node.js and browser apps.
</p>
]]></description>
      </item>
      <item>
         <title>I am not a "JavaScript Developer".</title>
         <pubDate>Wed, 14 Aug 2013 15:29:13 GMT</pubDate>

         <link>https://andrewkelley.me/post/not-a-js-developer.html</link>
         <guid>https://andrewkelley.me/post/not-a-js-developer.html</guid>
         <description><![CDATA[<h1>I am not a "JavaScript Developer".</h1>

<p>
It seems that the tech world is trying to thrust upon me an identity
which I do not hold. GitHub puts "JavaScript" right under my name:
</p>
<a href="https://github.com/andrewrk">
<img style="margin: 20px" src="https://s3.amazonaws.com/superjoe/blog-files/not-a-js-developer/github-js.png">
</a>
<p>
Here are snippets from actual emails I've received from recruiters:
</p>
<blockquote>
  We are currently expanding our Engineering team here in New York City and are
  looking for a <strong>JavaScript Developer</strong> to join our team. I
  think you'd be a great fit.
</blockquote>
<blockquote>
  Description: HTML <strong>5 Javascript Developer</strong><br>
  Looking for an experienced <strong>javascript engineer</strong>...
</blockquote>
<blockquote>
  This is a lead <strong>JavaScript developer</strong> that has a great object oriented
  foundation.
</blockquote>
<blockquote>
  I am also wondering if you would have any interest in learning more about a
  specific Lead <strong>JavaScript Developer</strong> position that I am
  searching on which looks as if it at least might be a good fit for you.
</blockquote>
<p>
  I am not a "JavaScript Developer".
</p>
<p>
I am a generalist.
</p>
<p>
I am most engaged when working on hard problems
that I have not yet wrapped my brain around.
</p>
<p>
Sometimes this involves writing
a web server.<br>
Sometimes it involves drawing designs or trigonometry problems
on paper.<br>
Sometimes it means making a video game in a language I've never used before.<br>
Sometimes it involves creating animations.<br>
Sometimes it means
<a href="jamulator.html">trying to recompile NES games into native executables</a>.<br>
Sometimes it involves using JavaScript.
</p>
<p>
But never does my thought process go like this:
</p>
<ol>
  <li>Hmm. I want to write a JavaScript program.</li>
  <li>What project can I work on so that I can use JavaScript?</li>
  <li>Hmm,
  <a href="http://nodejs.org">Node.js</a>
  looks fun, I think I will create a library to help you create
  <a href="http://minecraft.net">Minecraft</a>
  bots!
  </li>
</ol>
<p>
It's more like this:
</p>
<ol>
  <li>Minecraft is fun. I bet it would be even more fun to write bots to do this work for me.</li>
  <li>(2 days later)
  <a href="https://github.com/andrewrk/mineflayer/tree/e8bb1cddb0be855310eabc3f708c9f0715a17a7d">get a basic graphical client working</a>
  with C++ and
  <a href="http://qt-project.org">Qt</a>
  </li>
  <li>
  Hmm, with this framework in place it would be cool to allow people to write their own
  scripts using a stable API.
  </li>
  <li>
  JavaScript seems like a nice plugin scripting language, and it's built right into Qt!
  </li>
  <li>
  (10 months later)
  <a href="https://github.com/andrewrk/mineflayer/tree/cpp-qt-end">
    Done! The c++ platform runs user-land JavaScript bots and provides a Minecraft API.
  </a>
  </li>
  <li>(1 year later)
  Hmm. This abandoned project could probably be re-done pretty cleanly as a Node.js module.
  </li>
  <li>
  (2 months later)
  <a href="https://github.com/andrewrk/mineflayer">Done! Yep this is pretty nice and clean.</a>
  </li>
</ol>
<p>
Or sometimes it goes like this:
</p>
<ol>
<li>
I want a program to turn any reasonable audio file into a png of its waveform, and it has to be <em>fast</em>.
</li>
<li>
Okay then, I'll use <a href="http://sox.sourceforge.net/">libsox</a>,
<a href="http://www.libpng.org/pub/png/libpng.html">libpng</a>, and C.
</li>
<li>
<a href="https://github.com/andrewrk/waveform">Done! It's nice and fast.</a>
</li>
</ol>
<p>
I've created nontrivial applications in languages including but not limited to:
</p>
<ul>
  <li>ActionScript 1-3</li>
  <li>C</li>
  <li>C++</li>
  <li>C#</li>
  <li>Go</li>
  <li>Java</li>
  <li>Perl</li>
  <li>Python</li>
  <li>Ruby</li>
  <li>VB.net</li>
  <li>Visual Basic 6</li>
</ul>
<p>
I could write you a
<a href="http://en.wikipedia.org/wiki/Genetic_algorithm">genetic algorithm</a>
or implement <a href="http://en.wikipedia.org/wiki/A*_search_algorithm">A* search</a>
in any of these languages without looking up any reference material.
</p>
<p>
I've worked on nontrivial projects using frameworks including but not limited to:
</p>
<ul>
  <li><a href="http://www.gnu.org/software/bison/">Bison</a></li>
  <li><a href="https://www.djangoproject.com/">Django</a></li>
  <li><a href="http://flex.sourceforge.net/">Flex</a></li>
  <li><a href="http://llvm.org/">LLVM</a></li>
  <li><a href="http://www.mongodb.org/">MongoDB</a></li>
  <li><a href="http://www.mysql.com/">MySQL</a></li>
  <li><a href="http://www.ogre3d.org/">Ogre 3D</a></li>
  <li><a href="http://www.postgresql.org/">PostgreSQL</a></li>
  <li><a href="http://www.pygame.org/">PyGame</a></li>
  <li><a href="http://www.pyglet.org/">pyglet</a></li>
  <li><a href="http://redis.io/">Redis</a></li>
  <li><a href="http://www.libsdl.org/">SDL</a></li>
  <li><a href="http://www.sfml-dev.org/">SFML</a></li>
</ul>
<p>
Some of the things I've built include but are not limited to:
</p>
<ul>
  <li>Video Games
  (example
  <a href="http://s3.amazonaws.com/superjoe/temp/pillagers/index.html">1</a>
  <a href="https://s3.amazonaws.com/superjoe/temp/games/tetris/index.htm">2</a>
  <a href="http://pyweek.org/e/superjoe/">3</a>
  )
  </li>
  <li>Artificial Intelligence (example 
  <a href="https://github.com/andrewrk/evo">1</a>
  <a href="https://github.com/andrewrk/tetrisbuster">2</a>
  <a href="https://github.com/andrewrk/mineflayer-navigate">3</a>
  )</li>
  <li>
  Contributions to <a href="http://lmms.sourceforge.net/">lmms</a>,
  a Digital Audio Workstation.
  </li>
  <li>
  Client and server <a href="https://github.com/thejoshwolfe/repatriator">software</a>
  to control a custom hardware specimen viewer
  for entomologists. (<a href="http://www.experts.scival.com/asu/pubDetail.asp?t=pm&id=84864628820&n=Quentin+Duane+Wheeler&u_id=1799&oe_id=1&o_id=">relevant publication</a>)
  </li>
  <li>
  A pascal-to-mips compiler
  </li>
  <li>
  Linux utilities
  </li>
  <li>
  Windows utilities
  </li>
</ul>
<p>
I could go on, but I'm not trying to brag; I'm trying to explain why I am
annoyed at being labeled a "JavaScript Developer".
</p>
<p>
If you want to call me something, it should be "Interesting Thing Developer"
or "Problem Solver". I'll even accept "Software Developer".
</p>
<p>
But I am not a "JavaScript Developer".
</p>
]]></description>
      </item>
      <item>
         <title>7dRTS Game Reviews</title>
         <pubDate>Tue, 30 Jul 2013 09:54:32 GMT</pubDate>

         <link>https://andrewkelley.me/post/7drts-game-reviews.html</link>
         <guid>https://andrewkelley.me/post/7drts-game-reviews.html</guid>
         <description><![CDATA[<h1>7dRTS Game Reviews</h1>
<p>
We all have limited time. This is a guide to help you decide which
<a href="http://www.ludumdare.com/compo/2013/07/05/minild-44-announcement/">7dRTS</a>
games to play, and which you might unfortunately have to skip if you don't have time to
play them all. It is only one person's opinion.
</p>
<h2>I haven't played these yet</h2>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=22584">Power Grab</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24736">XenoCombat</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=2430">Congregation</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24967">Craft and Conquer</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24772">TAvA</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=21441">TANKBOX 30XX</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=9874">War Shape</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24947">Frisk</a>




<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24050">Demon Front, A Hero-centric RTS</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24897">System Command</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20290">3D Chess</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24891">Attrition</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24743">CaptureTheFlag - Mech Style</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=12415">Crime 1930</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24765">Total Knowledge Domination</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=13812">Guardian III: Corruption</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24939">Microspace</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=7542">MountainKing</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24695">o08</a>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=7997">SpaceRTS</a>


<h2>Definitely check these out</h2>
<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=19180">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/19180-shot1.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=19180">Chess</a></h4>
    <p>
    Play Chess, but without taking turns. Frenetic and very fun. Plays in your browser.
    </p>
  </div>
</div>


<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=15341">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/15341-shot0.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=15341">Holo Wars</a></h4>
    <p>
    This game nails it. It's a perfect example of a simple Real Time Strategy game. It gives
    beginner friendly non-obtrusive help at the beginning to hint what the controls are
    and what you should do to get started. The graphics and music are fantastic. 
    </p>
  </div>
</div>

<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24697">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/24697-shot3.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24697">Pillagers!</a></h4>
    <p>
Yes this one is mine and I'm biased. But I do think it is worth your time. It has smooth physics,
sophisticated AI, and fantastic music and sound effects created by
<a href="http://brokensounds.com">Michael Weber</a>. It blurs the line between RTS and shmup.
    </p>
  </div>
</div>
<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24863">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/24863-shot3.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24863">The Great Story of DOTS</a></h4>
    <p>
    Beautifully executed. Gameplay is the right level of difficulty, the tutorial
    in the beginning is perfect, and it explores a different way of commanding
    using hand drawn lines of attack. Plays in your browser.
    </p>
  </div>
</div>

<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20512">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/20512-shot0.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20512">Super Simple Switching Strategist</a></h4>
    <p>
    This Flash game is tastefully minimal. Gameplay is very simple, yet requires
    strategy to win. It presents 4 stages of increasing difficulty where you must flip
    switches that decide what units you will create and where they will go.
    </p>
  </div>
</div>

<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=8791">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/8791-shot1.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=8791">Age of Rice</a></h4>
    <p>
    Short and sweet, this game pulls off a simple real time strategy game that is
    fun to play. Plays in your browser and requires only a few minutes of your time.
    </p>
  </div>
</div>

<div class="media">
  <a class="pull-left" href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=5422">
    <img style="max-width: 200px" class="media-object" src="http://www.ludumdare.com/compo/wp-content/compo2/262546/5422-shot0.png">
  </a>
  <div class="media-body">
    <h4><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=5422">Sons of a Dark Age</a></h4>
    <p>
    This Flash game is similar to
    <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20512">Super Simple Switching Strategist</a>
    except more frenetic, and more dependent upon your ability to click fast enough.
    Still, this solid production is worth the 10 minutes of your time it takes to beat.
    </p>
  </div>
</div>

<h2>Worth looking into if you have enough time</h2>
<dl>
  <dt>
    <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20953">Lode Storm</a> (Windows)
  </dt>
  <dd>
    This game is a little too hard, but with repeated tries it is possible to win.
    You have to try to gain control of unit-producing nodes.
    It has a nice retro feel to it.
  </dd>
  <dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=15656">Metal Universe</a>
</dt>
<dd>
Minimal Flash game. Still, feels like a real-time strategy game.
Contains about 5 minutes of fun.
</dd>
  <dt><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=22017">Manifest Destiny</a> (Windows)</dt>
  <dd>Holy shit look at that title screen. This game has some intriguing mechanics
but in the end is a bit tedious to play because it requires you to hold down the right mouse
button for upwards of 2 minutes at a time, waiting for your armies to increase in number.
  </dd>
  <dt> <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24957">Schaak!</a>
</dt>
<dd>
This is fascinating - a huge chess board filled with pieces equipped with their own AI.
Unfortunately it's local multiplayer - no network, no computer opponent.
It's a fun simulation to play with.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=22213">Troll Defense</a>
</dt>
<dd>
This game is way too hard, but it'll give you a chuckle. Tower defense where meme faces try to
get into your website.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24628">Rock, Paper, Geometry!</a>
</dt>
<dd>
Although very short, it does feel like a game. You control shapes by giving them "impulses" in the physics engine. You must cause the correct shape on your team to collide with the correct shape
on the enemy team.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=7159">Mini Wars</a>
(Windows)
</dt>
<dd>
Short game. Decently balanced. A bit lacking in gameplay depth. The graphics are very green.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=10029">Knights &amp; Arrows</a>
</dt>
<dd>
The graphics and visual effects are stunning. Unfortunately, the gameplay is
extremely frustrating due to your units not automatically defending themselves.
Still, it's worth it to ooh and aah at the custom shader.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24894">Newspaperts</a>
</dt>
<dd>
Playable. It has some RTS elements to it but they all involve clicking
the mouse as fast as you can.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24864">City Fall</a>
</dt>
<dd>
It's not quite game status yet but has some impressive 3D graphics. Supposedly when a building
is destroyed there is an epic explosion. Controls are tricky to use and AI seems glitchy.
</dd>
<dt>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=19347">Bunnies on Board</a>
</dt>
<dd>
The rules of the game are confusing and arbitrary but if you take the time to learn
them it can be fun. Plays in your browser.
</dd>
</dl>


<h2>Not quite playable yet</h2>
<ul>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24856">Interstellar</a>
  - works in your web browser. Looks like it has some promise but there isn't much to do.
  </li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24745">Love Squad</a>
  (Windows) - this game is hilarious right from the title screen. Unfortunately it ends right
  before the gameplay is about to start.
  </li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20545">All Your Base!...</a>
  - training program seems to be working and then quits. Clicking stops working after a while.
  </li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20619">Monkeys vs Crocodiles</a>
  (Windows) - nothing to do besides watch the bad guys kill you.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24975">Booty Defense</a> - Plants vs Zombies clone; no challenge yet.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=18361">DEFEND THE CORN</a> -
  mouse clicking game; not much to do.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24878">Cherrys To Cherrys: CompEdition</a>
  (Windows/OSX/Unity) - glitchy and not really clear what to do.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=7085">Generic Space RTS (Prototype)</a>
  (Windows) - its own description says that it's not playable yet.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=5040">Life &amp; File</a>
  (Windows/Unity) - unlabeled buttons and impossible gameplay.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24974">Lightning Fox Kick V: Doom Vortex</a>
  - not clear what is happening. No strategy.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=17652">BotWars</a>
  - nothing to do. Tanks don't even move.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=21615">Scarab</a>
  - there is some gameplay here but it's unfinished and unfun.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24589">SyncMind</a>
  - web/text based. You get game over if a 3 sided dice gets a 1. No fun.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=17764">Sting Wars</a>
  - author says on the description page it's not playable yet.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20723">The Unfinished</a>
  - author says it's not ready yet.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=17385">CommaWar</a> - trouble installing, based on description does not look ready.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=3098">Nocturne</a>
  - no idea what to do. Controls don't seem to work. Something about multiplayer?</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=14843">Tower Defense</a> - rudimentary. Some installation issues.</li>
</ul>

<h2>I did not get these to work</h2>

<ul>
  <li>
  <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=7091">7dRTS - iforce2d</a> - error while loading shared libraries: libGLEW.so.1.6: cannot open shared object file: No such file or directory - I do have GLEW installed.
  </li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=14535">Eggz</a>
  - spammy download mirror. Source code lacks setup instructions.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=22834">Meanwhile, in Space...</a>
  - spammy download mirror.</li>
  <li>
  <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24834">Nebula Prime Ep1</a>
  - Linux version contains Windows exe file. Windows version depends on .NET framework.</li>
  <li>
  <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24741">Overseer</a>
  - Unity web player only.</li>
  <li>
  <a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=3935">FrogForce</a>
  - Windows only, depends on .NET framework.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=9963">Capture Effect</a> - Unity web player only.  </li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20961">Polarity</a>
  - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=5432">RTS</a>
  - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24827">Strat Souls</a>
  - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=14165">Android Arena</a>
  - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=12154">At the end</a>
  - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=1187">IKON_COLONIZE</a>
  - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=10065">Orchammer 1944</a> - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24754">Quebec</a> - Unity web player only.</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=4598">Ultipoly</a>
  - source distribution only; written in an arcane language; no installation instructions.
  </li>
</ul>
<dl>
  <dt>System</dt>
  <dd>Linux 3.8.0-27-generic #40-Ubuntu SMP Tue Jul 9 00:17:05 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
  </dd>
  <dt>User agent</dt>
  <dd>Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/28.0.1500.71 Chrome/28.0.1500.71 Safari/537.3
  </dd>
  <dt>wine</dt>
  <dd>Version 1.4.1. No mono or .NET framework.</dd>
  <dt>unity web player</dt>
  <dd>Nope.</dd>
</dl>
<h2>No Linux or web version</h2>
<ul>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24826">Marmoreal: Fleur de Lis</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=22269">Secrets of the divine -- WIP</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=18648">Ambush</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=20564">Defend the Famine</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=17717">Flaggers</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24700">Flash of the Titans: Kallipso Nikon</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=12485">Mobius</a> (Windows/OSX/Unity)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=18401">Combat Shock</a> (Windows/OSX/Unity)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=18552">Icarus</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24578">Lys kai</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=12517">Monster Defence</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24775">SupremeOverlord Work In Progress</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24838">Warblocks</a> (Windows/Unity)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24948">World's Aftermath</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=14284">Honk Konk</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=13747">infrontofaradar</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24682">Jumpstarter</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=19778">Ninja Squad Commander</a> (Windows/Unity)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24829">Pirate AaaaaRTS!</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=7793">Protectors of the World Tree</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=14711">Spawner Pawners</a> (Windows)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24965">Substruct</a> (Windows/Unity)</li>
  <li><a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=11500">TrashScape</a> (Windows)</li>
</ul>
]]></description>
      </item>
      <item>
         <title>Pillagers! 7dRTS Game Development Journal</title>
         <pubDate>Tue, 23 Jul 2013 10:09:53 GMT</pubDate>

         <link>https://andrewkelley.me/post/pillagers-7drts-game-dev-journal.html</link>
         <guid>https://andrewkelley.me/post/pillagers-7drts-game-dev-journal.html</guid>
         <description><![CDATA[<h1>Pillagers! 7dRTS Game Development Journal</h1>
<p>
I have decided to participate in the
<a href="http://www.ludumdare.com/compo/2013/07/05/minild-44-announcement/">7-Day Real Time Strategy Challenge, July 2013 edition</a>.
</p>
<p>
I am working with <a href="http://brokensounds.com/">Michael Weber</a> who will
create sound effects, music, and be in charge of the "atmosphere" of the game.
</p>
<h2 id="day-1"><a href="#day-1">Day 1</a></h2>
<p>
<a href="http://www.ludumdare.com/compo/2013/07/23/pillagers-7drts-game-development-journal-day-1/">Cross-posted on ludumdare.com</a>.
</p>
<p>
I am using <a href="https://github.com/andrewrk/chem/">chem</a>,
a canvas-based game engine I made for rapid development occasions 
such as this. It has been working out quite well, and I think it has
played a large role in my productivity today.
</p>
<p>
The game is codenamed "pillagers". The idea is to mix space physics with
real time strategy and see what comes out.
Instead of creating a carefully planned out base, you will be campaigning
through levels, pillaging for resources.
</p>
<p>That's the idea, anyway. We'll see how it pans out.</p>
<p>
Here's a list of the things that work right now:
</p>
<ul>
  <li>Selecting squads and telling them to move around.</li>
  <li>Ships shoot enemy ships, destroying them when their health reaches 0.</li>
  <li>Scrolling around the map.</li>
  <li>Auto generated parallax background with stars and a planet.</li>
</ul>
<p>
The code is
<a href="https://github.com/andrewrk/pillagers">open source, hosted on GitHub</a>.
It's 941 lines of JavaScript:
</p>
<pre>
   46 src/bullet.js
   33 src/explosion.js
  432 src/main.js
   19 src/militia_ship.js
  241 src/ship_ai.js
  165 src/ship.js
    5 src/uuid.js
  941 total
</pre>
<p>Screenshot of a squad of ships under attack:</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/screenshot-day-1.png" alt=""/>
<p>Short video of me playing around with some elements of the game:</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/QQ1jGk5e7o0" frameborder="0" allowfullscreen></iframe>
<p>
It was a fun challenge to write the AI to get the ships to stop at the intended
destinations. As is, it's not optimal, but I think that's okay. Maybe later classes
of ships will have better AI.
</p>
<p>
You can actually <a href="http://s3.amazonaws.com/superjoe/temp/pillagers/index.html">play this game right now</a>. Since it's web based, you don't
even have to download anything.
</p>
<p>
Here's what the TODO list looks like currently:
</p>
<ul>
  <li>Edit the main militia ship to have short range.</li>
  <li>Auto targeting enemies should only work when close enough.</li>
  <li>Ability to right click an enemy ship to tell squad to target it.</li>
  <li>Add enemy turrets and ships to level 1.</li>
  <li>Add enemy flag which grants victory when destroyed.</li>
  <li>Add 2nd class of ship which you get some of at beginning of level 1.</li>
  <li>
  Instead of giving the player ships at the beginning, give them cash
  and buildings which they can use to create the ships they want.
  The user will thus be able to choose how many of class 1 and class 2 ships
  they want.
  </li>
  <li>Put instruction label text in level 1 to explain the controls.</li>
  <li>Start planning level 2.</li>
</ul>
<p>
Well, I'm off to bed to get some rest. Looking forward to Day 2!
</p>
<h2 id="day-2"><a href="#day-2">Day 2</a></h2>
<p>
<a href="http://www.ludumdare.com/compo/2013/07/24/pillagers-7drts-game-development-journal-day-2/">Cross-posted on ludumdare.com</a>.
</p>
<p>
<a href="http://opengameart.org/">opengameart.org</a> has been very good to
me. I think this will be my new go-to game dev art supply.
So far I've found everything I've wanted on that site, except for a
meteor sprite. Maybe I'll make one and contribute back.
</p>
<p>
Michael made a lazer sound and a ship thrusting sound, and I
added the sound effects to the game.
In addition to that, I made a ton of progress today:
</p>
<ul>
  <li>3pm - draw flags only if selected</li>
  <li>5pm - add another class of ship</li>
  <li>8pm - found out I was using <code>t * a ^ 2 + t * v</code> for
  acceleration instead of <code>a * t ^ 2 + v * t</code>. I fixed
  all the math and made ships arriving at gather points nice and smooth.
  </li>
  <li>9pm - added team colors overlaid on ships</li>
  <li>10pm - gave the Militia ship an electric melee attack</li>
  <li>11pm - gave Militia ships smart attacking AI</li>
  <li>12am - made Ranger ships pursue targets</li>
  <li>2am - added TurretShip and FlagShip</li>
  <li>3am - added support for ships with backwards thrusters</li>
  <li>3am - added move &amp; engage command</li>
  <li>3am - improved ship selection capabilities</li>
  <li>5am - added title screen, credits screen, and game over screen.
    I made it possible to win and lose the game.
  </li>
</ul>
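<p>
That 8pm fix boils down to squaring the elapsed time instead of the
acceleration. A minimal sketch of the difference (illustrative function
names, not the game's actual code):
</p>
```javascript
// Illustrative only -- not the actual pillagers code.
// The bug: the squared factor was the acceleration.
function displacementBuggy(a, v, t) {
  return t * a * a + t * v; // t * a ^ 2 + t * v
}

// The fix: square the elapsed time, not the acceleration.
function displacementFixed(a, v, t) {
  return a * t * t + v * t; // a * t ^ 2 + v * t
}
```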
<p>
Here's the title screen:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-2-titlescreen.png" alt="">
<p>
Here's a screenshot of a battle underway:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-2-screenshot.png" alt="">
<p>
Battles can get hectic and chaotic, but it is fun to watch it play out.
Here's a video of me demoing a battle:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/pnEsHqmNU2c" frameborder="0" allowfullscreen></iframe>
<p>
Again, you can playtest this game in your browser right now using the
<a href="http://s3.amazonaws.com/superjoe/temp/pillagers/index.html">same url</a> as Day 1.
Feel free to give me feedback or advice.
</p>
<p>
I came up with some ideas on what bigger picture gameplay will look like
which I am pretty pleased with.
There will be a bunch of different classes of ships, for example:
</p>
<dl>
  <dt>Militia</dt>
  <dd>Basic cheap unit. Has short range. Agile.
  Actually doesn't shoot lazers, shoots a short range but powerful
  lightning attack out the front. It must resort to charging other
  ships to attack them.
  </dd>
  <dt>Ranger</dt>
  <dd>Short range weak lazers. Less agile. Less defense. In a level, a
   Militia should be able to overtake a Ranger and kill it.
  </dd>
  <dt>Artillery</dt>
  <dd>Slower, shoots big lazers. Slow moving, weak defense.
  </dd>
  <dt>FlagShip</dt>
  <dd>Does not attack. Has high defense. Moves very slowly.
  </dd>
  <dt>Turret</dt>
  <dd>Stationary lazer shooter.
  </dd>
  <dt>Medi</dt>
  <dd>Heals other ships.</dd>
</dl>
<p>
There will be some obvious deficiencies with the ships' targeting systems
and AI. These can be helped with upgrades, such as:
</p>
<ul>
  <li>Targeting
  <ul>
    <li>Target the first enemy found (default)</li>
    <li>Target the closest enemy.</li>
    <li>Communicate with others to divide up the targets evenly.</li>
 </ul>
  </li>
  <li>Ranger
  <ul>
    <li>Upgrade range.</li>
    <li>Upgrade bullet count to 3.</li>
    <li>Evasive maneuvers 1.</li>
    <li>Evasive maneuvers 2.</li>
    <li>Smarter aiming.</li>
    <li>Backwards thrusters.</li>
  </ul>
  </li>
  <li>Militia
  <ul>
    <li>Evade lazers.</li>
    <li>Backwards thrusters.</li>
    <li>Smarter aiming.</li>
   </ul>
   </li>
</ul>
<p>
You will start with only a flagship, and as you progress throughout
the campaign, you will start to build up a convoy that gets ever bigger
and more powerful. You'll need it to be bigger and more powerful
to get through later levels, in fact.
</p>
<p>
Here's what's next on the TODO list:
</p>
<ul>
  <li>Build simpler level 1 where you only have to navigate your
  flag ship around meteors.</li>
  <li>Give the player some cash and let them choose to build
  Ranger ships or Militia ships using the Flagship.
  </li>
  <li>Create level 2 where you have to destroy the enemy flagship
  and then fly your flagship through the created portal.</li>
</ul>
<p>
I'm at a pretty good checkpoint right now. I'm calling it a night.
</p>
<h2 id="day-3"><a href="#day-3">Day 3</a></h2>
<p>
<a href="http://www.ludumdare.com/compo/2013/07/25/pillagers-7drts-game-development-journal-day-3/">Cross-posted on ludumdare.com</a>.
</p>
<p>
Today Michael gave me the first music track he composed for the game
and I am impressed. 
I am excited to have professional sounding music for a change.
</p>
<p>
That being said, the first thing I did after inserting the music
into the game was program a mute button so that I could keep
listening to techno while I coded.
</p>
<p>
Michael also delivered an electric explosion sound effect that works
perfectly.
</p>
<p>
I mentioned yesterday that I could not find a meteor graphic to use
on <a href="http://opengameart.org/">opengameart.org</a>. I actually
did end up finding
<a href="http://opengameart.org/content/rocks">art that works really well</a>
by searching for "rock".
</p>
<p>
You can see the three rock types in the current spritesheet for the game
which is autogenerated by
<a href="https://github.com/andrewrk/chem/">chem</a>:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-3-spritesheet.png" alt="" style="background-color: #000000">
<p>
The first thing I did today was create an electrical disintegration
animation, and I'm pretty pleased with the result:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-3-disintegrate.gif" alt="" style="background-color: #000000">
<p>
I wonder if this is considered good enough to submit to
<a href="http://opengameart.org/">opengameart.org</a>.
</p>
<p>
After I finished that animation, I worked hard and had a productive day:
</p>
<ul>
  <li>3pm - Finished electrical disintegration animation and added sound effect to game.</li>
  <li>3pm - Added in Michael's background music and made it so you can toggle it with M key.</li>
  <li>5pm - Added meteors with collision detection.</li>
  <li>5pm - Fixed "ships rotating for no reason" bug.</li>
  <li>6pm - Inserted a new Level 1 - meteor field that you have to navigate
  through.</li>
  <li>7pm - Created a spastic portal graphic and added it to the level.</li>
  <li>7pm - Fixed navigation bug for ships equipped with backwards thrusters.</li>
  <li>9pm - Fixed scrolling when manually overriding.</li>
  <li>11pm - Added a UI pane when ships are selected.</li>
  <li>12am - Added a mini-map.</li>
  <li>1am - Made it so you can send ships into a portal.</li>
  <li>3am - Made it so selected portal shows what's inside it.</li>
  <li>4am - Added ability to send ships out of a portal.</li>
  <li>5am - Added announcement support.</li>
  <li>6am - Made your convoy show up on the level complete screen.</li>
  <li>6am - Added stats to the level complete screen.</li>
  <li>8am - Finished level complete screen so that you can get to the next level.</li>
</ul>
<p>
The source tree has grown quite a bit since last time I checked, up to
2,704 lines:
</p>
<pre>
   49 src/bullet.js
   77 src/credits_screen.js
   43 src/flag_ship.js
   30 src/fx.js
   73 src/game.js
   34 src/game_over_screen.js
  178 src/level_complete_screen.js
   20 src/main.js
   81 src/meteor.js
   58 src/militia_ship.js
   91 src/physics_object.js
   82 src/portal.js
   60 src/ranger_ship.js
   37 src/sfx.js
  449 src/ship_ai.js
  177 src/ship.js
    7 src/ship_types.js
   57 src/squad.js
  933 src/state.js
   23 src/team.js
  105 src/title_screen.js
   35 src/turret_ship.js
    5 src/uuid.js
 2704 total
</pre>
<p>
I actually feel pretty good about the code organization right now.
I have only had to pause progress and refactor 2-3 times and each time
it was mostly painless.
</p>
<p>
Enough of the technical stuff. Let's see some screenshots.
</p>
<p>
Here's one of the meteor field level you start out in:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-3-screenshot.png" alt="">
<p>
Here's what it looks like when you finish the level:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-3-level-complete.png" alt="">
<p>
And here's me giving a walkthrough of the progress made today:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/yL-C7-h34qM" frameborder="0" allowfullscreen></iframe>
<p>
My TODO list is getting a bit unruly. I've divided it into "next steps" and
"nice-to-have"s:
</p>
<p>Next steps:</p>
<ul>
  <li>Add text instructions in Level 1 to explain the controls and
  how to beat the level.</li>
  <li>Make it so that you start out the next level with the same fleet that
  you exited with.</li>
  <li>Show your cash in the UI</li>
  <li>Allow you to create ships that you unlock by spending cash.</li>
  <li>Insert a level between 1 and 2 with some attackers. You'll have to build
      some Ranger ships to defend your Flagship.</li>
</ul>
<p>Nice-to-haves:</p>
<ul>
  <li>Figure out why the game slows to a crawl when many ships are
  added and then deleted.</li>
  <li>Experiment with speed cap on ships. (I'm reluctant to do this one -
      I like the idea of high-velocity dogfighting.)</li>
  <li>If Ranger is in range, don't accelerate toward target.</li>
  <li>Fix the thruster sound glitchiness</li>
  <li>Each command should draw something on the screen to indicate
      that something is commanded. (Some commands do this already.)</li>
  <li>When forming a squad, don't assume all ships have the same radius.</li>
</ul>
<p>
And now I must rest. I am exhausted.
</p>
<h2 id="day-4"><a href="#day-4">Day 4</a></h2>
<p>
<a href="http://www.ludumdare.com/compo/2013/07/26/pillagers-7drts-game-development-journal-day-4/">Cross-posted on ludumdare.com</a>.
</p>
<p>
It has been wonderful relying only upon circles in this physics engine.
The math is simple and beautiful, and it's easy to write fast code.
Who needs polygons anyway?
</p>
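<p>
For the curious, the core collision test really is just a distance check:
two circles overlap exactly when the distance between their centers is less
than the sum of their radii. A minimal sketch (illustrative names, not
chem's actual API):
</p>
```javascript
// Compare squared distances to avoid a sqrt per check.
function circlesCollide(a, b) {
  var dx = b.x - a.x;
  var dy = b.y - a.y;
  var rSum = a.radius + b.radius;
  return dx * dx + dy * dy < rSum * rSum;
}
```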

<p>
Today was a good day. Pillagers is now
<a href="http://s3.amazonaws.com/superjoe/temp/pillagers/index.html">actually a game</a>.
What I got done:
</p>
<ul>
  <li>5pm - Added more of Michael's sound effects into the game.</li>
  <li>5pm - Tweaked the physics.</li>
  <li>5pm - Added some cheats to help speed up testing.</li>
  <li>6pm - Added text in Level 1 to explain controls.</li>
  <li>7pm - The game shows how much cash you have.</li>
  <li>10pm - You keep your same fleet when progressing to the next level.</li>
  <li>12am - Added a new Level 2.</li>
  <li>12am - Made unlocking ships work.</li>
  <li>12am - Made it so you get cash from killing enemy ships.</li>
  <li>12am - Tweak the money system.</li>
  <li>1am - Made it so you can scroll and give orders while paused.</li>
  <li>1am - Fixed some game crashes and bugs.</li>
  <li>2am - Modified attacking AI to stay within certain speed limits.
  Makes the game less chaotic.
  </li>
  <li>3am - Added ability to skip tutorial levels.</li>
  <li>4am - Better squad formation.</li>
  <li>4am - Updated the Move &amp; Engage command to be more effective.</li>
  <li>6am - Added Civilian ships and plan out Level 4.</li>
  <li>7am - Add Artillery ship and finish implementing Level 4.</li>
  <li>8am - Fix the crash which happened at the end of the demo below.</li>
</ul>
<p>
Screenshot:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-4-screenshot.png" alt="">
<p>
Screencast demo:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/9nJp5Zr4nQw" frameborder="0" allowfullscreen></iframe>
<p>
Bed.
</p>
<h2 id="day-5"><a href="#day-5">Day 5</a></h2>
<p>
<a href="http://www.ludumdare.com/compo/2013/07/27/pillagers-7drts-game-development-journal-day-5/">Cross-posted on ludumdare.com</a>.
</p>
<p>
I spent a good chunk of the day trying to solve a performance problem.
After approximately 100,000 bullets were fired, the framerate would
drop to an excruciating 16 FPS.
</p>
<p>
I was able to figure it out by using Google Chrome's
<a href="https://developers.google.com/chrome-developer-tools/docs/heap-profiling">heap profiling tool</a>.
Here's what the heap snapshot looked like:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-5-memory-leak.png">
<p>
I found out that there was a memory leak in the game engine - every time
a sprite was created it called <code>setInterval</code> but it did not
call <code>clearInterval</code> when the sprite was deleted.
Ever since I <a href="https://github.com/andrewrk/chem/commit/900ba7c6f5c27617d9917653ee4d63e4f043374d">fixed the issue</a>, FPS has stayed at a nice
and smooth 60, no matter how many objects are created and destroyed.
</p>
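<p>
The leak followed a classic pattern. A simplified sketch of the idea (not
chem's actual code):
</p>
```javascript
// Simplified illustration of the leak -- chem's real code differs.
function Sprite(frameDuration) {
  var self = this;
  self.frameIndex = 0;
  // The timer callback closes over the sprite, so neither the timer
  // nor the sprite can ever be garbage collected...
  self.interval = setInterval(function() {
    self.frameIndex += 1;
  }, frameDuration);
}

// ...unless the timer is cleared when the sprite is deleted.
Sprite.prototype.delete = function() {
  clearInterval(this.interval);
  this.interval = null;
};
```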
<p>
Here are some other things I got done today:
</p>
<ul>
  <li>9pm - Miscellaneous small enhancements.</li>
  <li>11pm - Fixed the performance issue.</li>
  <li>12am - Added the new thruster sound that Michael gave me.</li>
  <li>3am - Added Sandbox Mode.</li>
  <li>4am - Added more features to Sandbox Mode.</li>
  <li>6am - Added Ctrl+A to select all your ships. Also made WASD and J
      work the same as arrow keys and space.</li>
  <li>7am - Added dogfighting mode.</li>
  <li>8am - Added graphics so that ships display their targets when selected.</li>
  <li>8am - Added 2 more dogfighting levels. Enhanced manual piloting of Militia
      ship so that you can use your melee attack.</li>
</ul>
<p>
Here's a screenshot of Sandbox mode:
</p>
<img src="https://s3.amazonaws.com/superjoe/blog-files/pillagers-7drts-game-dev-journal/day-5-sandbox-mode.png">
<p>
Here's me playing through Dogfighting Mode and messing around in
Sandbox Mode a bit:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/7bFireOA96Q" frameborder="0" allowfullscreen></iframe>
<p>
I have only 2 days left to work.
The last day will be primarily reserved for gameplay tweaking, bugfixes,
touchups, and testing.
That leaves only 1 more day to make actual progress.
</p>
<p>
These are my goals for tomorrow:
</p>
<ul>
  <li>Add a rotating turret to the flagship.</li>
  <li>Add a civilian to Level 1 and use that to explain the Move command
  vs the Engage command.</li>
  <li>Add a dogfighting level where there is an ongoing battle and you
      have to rack up a certain number of kills to win.</li>
  <li>Add more campaign levels. I reserve the right to ramp up the
      difficulty in the later levels!</li>
  <li>Pan sound effects and volume depending on scroll location</li>
  <li>Add some more ship classes and implement upgrades.</li>
</ul>
<p>
Thanks to the folks in the #7dRTS IRC channel for helping me playtest today.
Specifically Zapa and Orava.
</p>
<h2 id="day-6"><a href="#day-6">Day 6</a></h2>
<p>
<a href="http://www.ludumdare.com/compo/2013/07/28/pillagers-7drts-game-development-journal-day-6/">Cross-posted on ludumdare.com</a>.
</p>
<p>
Today I worked on making the level format and core engine more robust. I added a bunch
more dogfighting levels. There are 11 now, and they get pretty crazy at the end.
</p>
<p>
The cool thing about them is that the core game engine does not even know that
you are in "Dogfighting Mode". There's no <code>dogfightingMode</code> variable
that tells the engine that's what you're doing. Instead, the
<a href="https://github.com/andrewrk/pillagers/blob/7d09b2f0842e8962e059ebaa7c05fbf47cff890a/public/text/dogfight0.json">level format</a>
supports trigger conditions, events, and groups, and the logic of dogfighting
is done in the level JSON file.
</p>
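<p>
To give a flavor of the idea (a made-up illustration; see the linked
dogfight0.json for the real format, which differs): the engine only
evaluates generic triggers, and the level data wires them to outcomes.
</p>
```javascript
// Purely hypothetical sketch of data-driven level logic; names invented.
var level = {
  events: [
    { trigger: { type: "groupDestroyed", group: "enemies" },
      action: { type: "victory" } }
  ]
};

// The engine knows nothing about dogfighting; each frame it walks the
// level's events and fires any whose trigger condition holds.
function processEvents(level, isGroupDestroyed, onVictory) {
  level.events.forEach(function(ev) {
    if (ev.trigger.type === "groupDestroyed" &&
        isGroupDestroyed(ev.trigger.group) &&
        ev.action.type === "victory") {
      onVictory();
    }
  });
}
```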
<p>
Michael gave me the battle music today as well. It sounds amazing. I made it
so that the game automatically detects when you are entering or exiting a battle and
transitions the music appropriately. I'm pleased with the effect.
</p>
<p>
Here's a list of what I got done today:
</p>
<ul>
  <li>
  12am - Added battle music. The game automatically detects when a battle is
  happening and fades in the battle music, then fades back to normal music
  once the battle is over.
  </li>
  <li>
  1am - Added more dogfighting levels.
  </li>
  <li>
  3am - Worked on making dogfighting levels more fun. If you die, you have to try again rather than respawn.
  </li>
  <li>
  4am - Made the Militia ship show the area of attack when you are manually piloting it.
  </li>
  <li>
  6am - Sandbox Mode: Ability to load and save. I was able to use this to build some dogfighting levels.
  </li>
  <li>
  7am - Added more dogfighting levels.
  </li>
</ul>
<p>
I did not accomplish most of what I set out to do today. Regardless, I am really happy with Dogfighting Mode
right now. It's almost certainly more fun than Campaign Mode currently.
</p>
<p>
Here's a video of me playing through it:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/vye8Sfs3rFM" frameborder="0" allowfullscreen></iframe>
<p>
This version of the game is going to be pretty close to the final one that I submit for 7dRTS.
I will probably only spend about 6-8 more hours on it, testing on Firefox and fixing bugs.
I will not be supporting Internet Explorer.
</p>
<h2 id="day-7"><a href="#day-7">Day 7</a></h2>
<p>
Today was short. I added more sound effects that Michael provided,
and lowered the difficulty on several levels. The game is in a stable
state and I don't want to screw it up before submitting.
</p>
<p>
I experimented with adding aiming hints when manually piloting a ship,
but I found that it actually made the game more difficult because
it tricks the player into aiming directly at the enemy ships
instead of compensating for relative velocity.
</p>
<p>
I tried to make my girlfriend play the game, but she said it was too hard.
It probably is too hard. I like hard games.
</p>
<p>
Either way, I'm submitting.
</p>
<p>
<a href="http://www.ludumdare.com/compo/minild-44/?action=preview&uid=24697">View my entry on ludumdare.com</a>.
</p>
<p>
<a href="http://s3.amazonaws.com/superjoe/temp/pillagers/index.html">Play the game in your web browser</a>.
</p>
]]></description>
      </item>
      <item>
         <title>Private Methods in JavaScript</title>
         <pubDate>Wed, 17 Jul 2013 02:20:54 GMT</pubDate>

         <link>https://andrewkelley.me/post/js-private-methods.html</link>
         <guid>https://andrewkelley.me/post/js-private-methods.html</guid>
         <description><![CDATA[<h1>Private Methods in JavaScript</h1>
<p>
In JavaScript, we don't have private methods, right?
We must resort to using <code class="language-javascript">this._somePrivateThing()</code>, right?
</p>
<p>
Wrong.
</p>
<h2>Before</h2>
<pre>
<code class="language-javascript">function Cell(x, y) {
  this.x = x;
  this.y = y;
  this.things = [1, 2, 3];

  // I guess I'll have to use an underscore to indicate that this
  // method is private.
  this._initializeSomethingElse();
}

Cell.prototype._initializeSomethingElse = function() {
  this.dir = Math.atan2(this.y, this.x);
  this.total = 0;
  // hmm, I need a reference to this in that callback. I'll have to
  // save a copy or use bind
  var self = this;
  self.things.forEach(function(thing) {
    self.total += thing;
  });
};</code>
</pre>
<h2>After</h2>
<pre>
<code class="language-javascript">function Cell(x, y) {
  this.x = x;
  this.y = y;
  this.things = [1, 2, 3];

  // boom. private method.
  initializeSomethingElse(this);
}

function initializeSomethingElse(self) {
  self.dir = Math.atan2(self.y, self.x);
  self.total = 0;
  self.things.forEach(function(thing) {
    self.total += thing;
  });
}</code>
</pre>
<p>
Notice that as an added benefit, the new private method is given an explicit
reference to the instance, so if you need to use a callback you don't
have to shuffle around the <code>this</code> pointer.
</p>
<h2>"But it will fool the optimizer!"</h2>
<p>Wrong again.</p>
<p>Let's benchmark the above examples:</p>
<pre>
<code class="language-javascript">var PrivCell = require('./priv');
var NoPrivCell = require('./no-priv');

console.log("Test 1 - no priv method:", Math.round(test(NoPrivCell)) + "ms");
console.log("Test 1 - private method:", Math.round(test(PrivCell)) + "ms");

console.log("Test 2 - no priv method:", Math.round(test(NoPrivCell)) + "ms");
console.log("Test 2 - private method:", Math.round(test(PrivCell)) + "ms");

console.log("Test 3 - no priv method:", Math.round(test(NoPrivCell)) + "ms");
console.log("Test 3 - private method:", Math.round(test(PrivCell)) + "ms");

function test(Cell) {
  var start = new Date();
  var total = 0;
  for (var i = 0; i < 20000000; i += 1) {
    var c = new Cell();
    total += c.total;
  }
  return new Date() - start;
}</code>
</pre>
<p>Running the benchmark on my machine:</p>
<pre>
Test 1 - no priv method: 5525ms
Test 1 - private method: 5537ms
Test 2 - no priv method: 5537ms
Test 2 - private method: 5571ms
Test 3 - no priv method: 5572ms
Test 3 - private method: 5595ms
</pre>
<p>
Makes no difference.
It's clean, it solves the problem, and it has no performance implications.
</p>
<p>
Here's a <a href="http://jsperf.com/public-vs-private-methods">jsperf</a> for further evidence.
</p>
<h2>Browser Scoping</h2>
<p>
Note that if you're writing the code in an environment which does not
provide scoping, such as a &lt;script&gt; tag in the browser, you'll
want to wrap the entire thing in an anonymous function call:
</p>
<pre>
<code class="language-javascript">var Cell = (function() {
  function Cell(x, y) {
    this.x = x;
    this.y = y;
    this.things = [1, 2, 3];

    // boom. private method.
    initializeSomethingElse(this);
  }

  function initializeSomethingElse(self) {
    self.dir = Math.atan2(self.y, self.x);
    self.total = 0;
    self.things.forEach(function(thing) {
      self.total += thing;
    });
  }

  return Cell;
})();</code>
</pre>
]]></description>
      </item>
      <item>
         <title>Spot the Fail</title>
         <pubDate>Wed, 10 Jul 2013 21:47:20 GMT</pubDate>

         <link>https://andrewkelley.me/post/spot-the-fail.html</link>
         <guid>https://andrewkelley.me/post/spot-the-fail.html</guid>
         <description><![CDATA[<style>
  .stf {
    width: 1000px;
  }
  .spoiler {
    padding: 14px 0;
    color: transparent;
    font-size: 1.2em;
  }
  .spoiler:before {
    content: "Hover for spoiler -->";
    color: black;
  }
  .spoiler:after {
    content: "<--";
    color: black;
  }
  .spoiler:hover {
    color: inherit;
  }
</style>
<h1>Spot the Fail</h1>
<p>
It's time to have a little fun.
</p>
<p>
Sometimes when I'm programming, I do something so comically stupid that
I feel the need to screenshot the code and share my facepalm moment with
someone else.
</p>
<p>
There are a couple guidelines for my flavor of Spot the Fail:
</p>
<h3>Guidelines</h3>
<ul>
  <li>
  The fail should be spottable <em>even if you do not
    know the context of the code.</em>
  </li>
  <li>
  For trickier ones, the mouse or text cursor can optionally be nearby the
  fail to give a hint.
  </li>
</ul>
<p>
I have collected a few of these screenshots and explanations for your
enjoyment.
</p>
<div class="stf">
  <h2>1. <a href="https://github.com/andrewrk/labyrinth">Labyrinth</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/labyrinth.png"/>
  <div class="spoiler">
    South, South. Should be North, South.
  </div>
</div>

<div class="stf">
  <h2>2. <a href="https://github.com/thejoshwolfe/jax">jax</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/jax.png"/>
  <div class="spoiler">
    Classic. Missing break on one of the cases.
  </div>
</div>

<div class="stf">
  <h2>3. <a href="https://github.com/thejoshwolfe/jax">jax</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/jax-2.png"/>
  <div class="spoiler">
    Bytes are not 4 bits long.
  </div>
</div>

<div class="stf">
  <h2>4. "Code from work"</h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/work.png"/>
  <div class="spoiler">
    Case 8 should be [0-7] not [0-8].
  </div>
</div>

<div class="stf">
  <h2>5. <a href="https://github.com/andrewrk/motrs">motrs</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/motrs.png"/>
  <div class="spoiler">
    It's the cliché break statement again. Seriously, gotta watch out for that.
  </div>
</div>

<div class="stf">
  <h2>6. <a href="https://github.com/andrewrk/motrs">motrs</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/motrs-2.png"/>
  <div class="spoiler">
    new T[newSizeX + m_sizeY * m_sizeZ] - the addition should be multiplication.
  </div>
</div>

<div class="stf">
  <h2>7. <a href="https://github.com/andrewrk/stinkomanlevels">stinkomanlevels.com</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/stinkoman.png"/>
  <div class="spoiler">
    D'oh! I asked for 1 more character than I should have.
  </div>
</div>

<div class="stf">
  <h2>8. <a href="https://github.com/andrewrk/motrs">motrs</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/motrs-3.png"/>
  <div class="spoiler">
    m_tileCountY = value; it should be tileCount instead of value.
  </div>
</div>

<div class="stf">
  <h2>9. menu item</h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/shortcut.png"/>
  <div class="spoiler">
    Tried to press the "OK" button with Alt+O; instead messed up the shortcut.
  </div>
</div>

<div class="stf">
  <h2>10. <a href="https://github.com/andrewrk/solidcomposer">solidcomposer.com</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/mimetime.png"/>
  <div class="spoiler">
    It's MIME TIME!!
  </div>
</div>

<div class="stf">
  <h2>11. <a href="https://github.com/andrewrk/solidcomposer">solidcomposer.com</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/td.png"/>
  <div class="spoiler">
    vim spotted the fail for me. Good job vim.
  </div>
</div>

<div class="stf">
  <h2>12. <a href="https://github.com/andrewrk/solidcomposer">solidcomposer.com</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/js.png"/>
  <div class="spoiler">
    state.user !============== null
  </div>
</div>

<div class="stf">
  <h2>13. <a href="https://github.com/thejoshwolfe/repatriator">repatriator</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/repatriator.png"/>
  <div class="spoiler">
    It's highlighted in red. Pretty silly.
  </div>
</div>

<div class="stf">
  <h2>14. <a href="https://github.com/thejoshwolfe/repatriator">repatriator</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/repatriator-2.png"/>
  <div class="spoiler">
    X and Y are swapped
  </div>
</div>

<div class="stf">
  <h2>15. Some Rails Code</h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/ruby.png"/>
  <div class="spoiler">
    syntax error: enr
  </div>
</div>

<div class="stf">
  <h2>16. <a href="https://github.com/andrewrk/motrs">motrs</a></h2>
  <img alt="" src="https://s3.amazonaws.com/superjoe/blog-files/spot-the-fail/motrs-4.png"/>
  <div class="spoiler">
    "C:\out.bin" - the 'o' is accidentally escaped, but more importantly
    it's not even running on Windows.
  </div>
</div>

<h3>Fin.</h3>
<p>
Well, hope that was fun.
</p>
<p>
On a relevant note, remember when Quixey in an act of hiring PR
<a href="http://blog.quixey.com/2011/10/03/quixey-challenge/">offered $100 
if you could spot the fail in 1 minute</a>? Good times.
</p>
]]></description>
      </item>
      <item>
         <title>Statically Recompiling NES Games into Native Executables with LLVM and Go</title>
         <pubDate>Fri, 07 Jun 2013 08:48:00 GMT</pubDate>

         <link>https://andrewkelley.me/post/jamulator.html</link>
         <guid>https://andrewkelley.me/post/jamulator.html</guid>
         <description><![CDATA[<script>
  Prism.languages['6502'] = Prism.languages.extend('clike', {
    'keyword': /\.(org|db|dw)/,
    'number': /(#\$|\$|\b\d)[0-9a-fA-F]*\b/,
    'comment': /;.*/,
    'operator': /\w[\w\d_]*:/,
    'regex': /\b[A-Z]{3}\b/,
  });
  Prism.languages['llvm'] = Prism.languages.extend('clike', {
    'keyword': /\b(private|constant|declare|define|c|noreturn|nounwind|alloca|br|store|load|getelementptr|and|icmp|eq|zext|call|unreachable|add|target|datalayout|triple|unnamed_addr|align|inbounds|uwtable|sext|ret|ne|phi|global|to|zeroinitializer|shl|or|ult|switch)\b/,
    'number': null,
    'comment': /;.*/,
    'operator': /\w[\w\d_]*:/,
    'regex': /[@%]\.?\w+/,
    'property': /\b(i8|i1|i32|i64|i16|void|label)\*?\b/,
    'punctuation': null,
    'function': null,
  });
  Prism.languages['go'] = Prism.languages.extend('clike', {
    'keyword': /\b(func|interface|struct|type|const|import|for|if|case|switch|select|defer|new|make|nil|return|default|panic|package|var)\b/,
    'property': /\b(string|map|chan|int|rune|int8|int16|int32|int64|uint8|uint16|uint32|uint64)\b/,
  });
  Prism.languages['make'] = Prism.languages.extend('clike', {
    'number': null,
    'punctuation': null,
    'function': null,
    'keyword': /\.PHONY:/,
    'property': /\t(.+)/,
    'operator': /[^.].*:/,
  });
  Prism.languages['ini'] = Prism.languages.extend('clike', {
    'comment': /#.*/,
    'number': null,
    'punctuation': null,
    'function': null,
    'keyword': null,
    'property': null,
    'operator': null,
  });
</script>
<h1>Statically Recompiling NES Games into Native Executables with LLVM and Go</h1>
<p>
  I have always wanted to write an emulator. I made a half-hearted attempt in college
  but it never made it to a demo-able state. I also didn't want to write Yet Another Emulator.
  It should at least bring something new to the table. When I discovered
  <a href="https://llvm.org/">LLVM</a>, I felt like I finally had a worthwhile idea for a project.
</p>
<p>
  This article presents original research regarding the possibility of
  statically disassembling and recompiling
  <a href="https://en.wikipedia.org/wiki/Nintendo_Entertainment_System">Nintendo Entertainment System</a>
  games into native executables.
  I attempt to bring the reader along from beginning to end in a sort
  of "let's build it together" fashion.
  I assume as little as possible about what the reader knows
  about the subjects at hand, without compromising the depth of the article.
</p>
<p>
<a href="https://github.com/andrewrk/jamulator">See the project on GitHub</a>
</p>
<h2 id="toc">Table of Contents</h2>
<ol>
  <li><a href="#toc">Table of Contents</a></li>
  <li><a href="#discovering-llvm">Discovering LLVM</a></li>
  <li><a href="#nes-crash-course">NES Crash Course</a></li>
  <li><a href="#custom-abi">Creating a Custom ABI for Testing Purposes</a></li>
  <li><a href="#assembly">Assembly</a></li>
  <li><a href="#disassembly">Disassembly</a></li>
  <li><a href="#code-generation">Code Generation</a></li>
  <li><a href="#optimization">Optimization</a></li>
  <li><a href="#rom-layout">Layout of an NES ROM File</a></li>
  <li><a href="#runtime-address-checks">Challenge: Runtime Address Checks</a></li>
  <li><a href="#parallel-systems">Challenge: Parallel Systems Running Simultaneously</a></li>
  <li><a href="#interrupts">Challenge: Handling Interrupts</a></li>
  <li><a href="#detect-jump-table">Challenge: Detecting a Jump Table</a></li>
  <li><a href="#indirect-jumps">Challenge: Indirect Jumps</a></li>
  <li><a href="#dirty-assembly-tricks">Challenge: Dirty Assembly Tricks</a></li>
  <li><a href="#video">Video: Playing Super Mario Brothers 1 on Native Machine Code</a></li>
  <li><a href="#mappers">Unsolved Challenge: Mappers</a></li>
  <li><a href="#community-support">Community Support</a></li>
  <li><a href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="discovering-llvm">Discovering LLVM</h2>
<p>
  I became giddy with excitement when I discovered <a href="https://llvm.org/">LLVM</a>.
  Generate code in <a href="https://llvm.org/docs/LangRef.html">LLVM IR format</a>,
  and LLVM can optimize and generate native code for any of its supported backends.
  Here's an incomplete list:
</p>
<ul style="columns: 2">
  <li>ARM</li>
  <li>STI CBEA Cell SPU [experimental]</li>
  <li>C++ backend</li>
  <li>Hexagon</li>
  <li>MBlaze</li>
  <li>Mips</li>
  <li>Mips64 [experimental]</li>
  <li>Mips64el [experimental]</li>
  <li>Mipsel</li>
  <li>MSP430 [experimental]</li>
  <li>NVIDIA PTX 32-bit</li>
  <li>NVIDIA PTX 64-bit</li>
  <li>PowerPC 32</li>
  <li>PowerPC 64</li>
  <li>Sparc</li>
  <li>Sparc V9</li>
  <li>Thumb</li>
  <li>32-bit X86: Pentium-Pro and above</li>
  <li>64-bit X86: EM64T and AMD64</li>
  <li>XCore</li>
</ul>
<p>
  Furthermore, you can write additional backends to support additional targets. For example,
  <a href="http://emscripten.org">emscripten</a> provides a JavaScript backend.
</p>
<p>
To seal the deal, the LLVM project offers <a href="http://clang.llvm.org/">clang</a>, a
C, C++, Objective C, and Objective C++ <em>front-end</em> for LLVM.
This means that because of the way LLVM is designed, emscripten allows you to compile C, C++,
Objective C, and Objective C++ code for the browser. Wow!
</p>
<p>
Just to show you how refreshing this technology is, have a look at the following C code:
</p>
<pre>
<code class="language-c">#include &lt;stdio.h&gt;
#include &lt;stdint.h&gt;

int main() {
    uint8_t foo = 0xf0;
    if (foo &amp; 0x80 == 0x80) {
        for (int i = 1; i &lt;= 10; ++i) {
            printf("%d\n", i);
        }
    }
}
</code></pre>
<p>
Simple enough. Assign <code>11110000</code> to <code>foo</code>.
If the highest bit is 1 (which it is), print the integers from 1 to 10.
</p>
<p>
Compiling with gcc with default settings, we get:
</p>
<pre>
$ gcc test.c
test.c: In function ‘main’:
test.c:7:5: error: ‘for’ loop initial declarations are only allowed in C99 mode
test.c:7:5: note: use option -std=c99 or -std=gnu99 to compile your code
$
</pre>
<p>
Right, so by default gcc doesn't have the nice things that <a href="https://en.wikipedia.org/wiki/C99">C99</a> brings.
Let's try that again in C99 mode:
</p>
<pre>
$ c99 test.c
$ ./a.out
$
</pre>
<p>
Okay, so now it compiled and ran, but why didn't it print the integers from 1 to 10? Let's try compiling with <code>clang</code> and see what happens:
</p>
<pre>
$ clang test.c
test.c:6:13: warning: &amp; has lower precedence than ==; == will be evaluated
      first [-Wparentheses]
    if (foo &amp; 0x80 == 0x80) {
            ^~~~~~~~~~~~~~
test.c:6:13: note: place parentheses around the '==' expression to silence
      this warning
    if (foo &amp; 0x80 == 0x80) {
            ^
              (           )
test.c:6:13: note: place parentheses around the &amp; expression to evaluate it
      first
    if (foo &amp; 0x80 == 0x80) {
            ^
        (         )
1 warning generated.
$
</pre>
<p>
Aha! In C, <code>&amp;</code> has lower precedence than <code>==</code>. In addition to
outputting with fancy terminal colors, clang found the bug in our program and issued a warning.
Once we fix that up and run the code again, we get the expected behavior.
</p>
<p>
<small>
  Note: It has been pointed out to me that <code>gcc -Wall</code> produces a similar
  warning for this same code example. I apologize for the bogus example. However,
  my point still stands.
  LLVM has <a href="http://clang.llvm.org/diagnostics.html">a much better explanation</a>
  of the philosophy and differences in error messages between clang and gcc.
</small>
</p>
<p>
There are many instances where cryptic gcc errors in your code are made clear by clang.
In the words of my friend
<a href="https://wolfesoftware.com/">Josh Wolfe</a>,
</p>
<blockquote>
  <p>
  Clang makes gcc look like a rusty old grandpa.
  </p>
</blockquote>
<p>
This is exciting because clang is merely a <em>front-end</em> to LLVM. What this means is that
if you generate LLVM IR code, you share the same code generation backend as clang.
With this I felt that powerful optimization and wide target platform support were within my
grasp.
</p>
<p>
What if we could translate NES assembly code into LLVM IR code? <strong>We could completely,
  <em>statically</em> recompile NES games into native binaries!</strong>
</p>
<h2 id="nes-crash-course">NES Crash Course</h2>
<p>
  Take a break from LLVM for a moment and consider the Nintendo Entertainment System.
</p>
<p>
  The NES consists of 4 systems working together in parallel:
</p>
<ul>
  <li>
    CPU - 8-bit processor. Basically a
    <a href="https://en.wikipedia.org/wiki/MOS_Technology_6502">MOS 6502</a>.
  </li>
  <li>Picture Processing Unit</li>
  <li>Audio Processing Unit</li>
  <li>Usually a mapper: custom hardware used to provide extra ROM (and sometimes other stuff)</li>
</ul>
<p>
The NES uses memory-mapped I/O, which means that, to interact with hardware, you read from and
write to special memory addresses. Take a look at a simplified version of the memory layout
of the NES at runtime:
</p>
<table>
  <tr>
    <th>Start</th>
    <th>End</th>
    <th>Description</th>
  </tr>
  <tr>
    <td>$0000</td>
    <td>$07ff</td>
    <td>RAM. Unless they use a mapper, NES games only get 2KB of RAM!</td>
  </tr>
  <tr>
    <td>$2000</td>
    <td>$2007</td>
    <td>
      These memory addresses are hooked up to the PPU and hence are used to control
      what is being displayed on the screen.
    </td>
  </tr>
  <tr>
    <td>$4000</td>
    <td>$4015</td>
    <td>
      These are hooked up to the APU and are used to control the sound that plays.
    </td>
  </tr>
  <tr>
    <td>$4016</td>
    <td>$4017</td>
    <td>
      These are used to read the state of the game controllers.
    </td>
  </tr>
  <tr>
    <td>$8000</td>
    <td>$ffff</td>
    <td>
      This is Read Only Memory. The game code itself is loaded here, so games can embed data
      in their program code and read it in this address range.
    </td>
  </tr>
</table>
<p>
Let's break this down. First, note that the entire addressable range of the NES is
64KB. Thus all addresses can be represented by 2 bytes, or 4 hex digits. Next,
note that in 6502 assembly, <code>$</code> is standard for hexadecimal notation.
So the address range is <code>$0000</code> to <code>$ffff</code>.
Of the 64KB addressable range, 29KB is (usually) completely unused!
</p>
<p>
Note the <code>$8000</code> - <code>$ffff</code> range. When you create an NES game, you
write 6502 assembly code, which is assembled into a 32KB binary, which is loaded into the <code>$8000</code> - <code>$ffff</code> range during bootup.
</p>
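<p>
To make the layout concrete, here is a small Go helper that classifies a 16-bit
address into the regions from the table above. This is just my own illustration,
not code from the project; the names are hypothetical.
</p>

```go
package main

import "fmt"

// classify maps a 16-bit NES address to the region it falls in,
// following the simplified layout in the table above. Addresses
// not covered by the table are reported as "unused".
func classify(addr uint16) string {
	switch {
	case addr <= 0x07ff:
		return "RAM"
	case addr >= 0x2000 && addr <= 0x2007:
		return "PPU registers"
	case addr >= 0x4000 && addr <= 0x4015:
		return "APU registers"
	case addr == 0x4016 || addr == 0x4017:
		return "controller ports"
	case addr >= 0x8000:
		return "PRG ROM"
	default:
		return "unused"
	}
}

func main() {
	for _, addr := range []uint16{0x0200, 0x2002, 0x8000, 0x5000} {
		fmt.Printf("$%04x: %s\n", addr, classify(addr))
	}
}
```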
<p>
Let's write some example 6502 code to illustrate.
</p>
<h2 id="custom-abi">Creating a Custom ABI for Testing Purposes</h2>
<p>
One of the challenges of NES development is that a basic "Hello, World" program is
<a href="http://www.dwedit.org/files/hello.asm">pretty complicated</a>.
This is why I created a custom Application Binary Interface to make it a little bit easier.
Refer back to the memory layout above, and recall that many addresses are unused.
I decided to add a couple things:
</p>
<table>
  <tr>
    <th>Address</th>
    <th>Description</th>
  </tr>
  <tr>
    <td>$2008</td>
    <td>
      Write a byte to this address and the byte will go to standard out.
    </td>
  </tr>
  <tr>
    <td>$2009</td>
    <td>
      Write a byte to this address and the program will exit with that byte as the return code.
    </td>
  </tr>
</table>
<p>
With this add-on ABI, I was able to create a simple 6502 application that prints
<code>"Hello, World!\n"</code> to stdout and then exits.
In 6502 land, that looks like:
</p>
<pre>
<code class="language-6502">; Remember that this code gets loaded at memory address $8000.
; This instruction tells the assembler that the code and data
; that follow are starting at address $8000.
; For example, the 'W' character in the "Hello, World!\n\0" text
; below is located at memory address $8007.
.org $8000

; This is a label. The assembler will replace references to
; this label with the actual address that it represents, which
; in this case is $8000.
msg:
; 'db' stands for "Data Bytes". This line puts the string
; "Hello, World!\n\0" in bytes $8000 - $800e when this
; code is assembled.
  .db "Hello, World!", 10, 0

Reset_Routine:
  LDX #$00       ; put the starting index, 0, in register X

loop:
; This instruction does 3 things:
; 1. Take the address of msg and add the value of register X to get a pointer.
; 2. Read the byte at the pointer and put it into register A
; 3. Set the zero flag to 1 if A is zero; or 0 if A is nonzero.
  LDA msg, X     ; read 1 char
; If the zero flag is 1, goto loopend. Otherwise, go to the next instruction.
  BEQ loopend    ; end loop if we hit the \0
; Put the value of register A into memory address $2008. But remember,
; with our custom ABI, this means write the value of register A to stdout.
  STA $2008      ; putchar
; Register X = Register X + 1
  INX
  JMP loop       ; goto loop

loopend:

  LDA #$00       ; put the return code, 0, in register A
  STA $2009      ; exit with a return code of register A (0 in this case)

; These are interrupt handlers. Let's not think about interrupts yet.
IRQ_Routine: RTI ; do nothing
NMI_Routine: RTI ; do nothing

; Another org statement. Remember that NES programs are 32KB. This tells the assembler
; to fill the unused space up until $fffa with zeroes and that we are now
; specifying the data at $fffa.
.org   $fffa

; 'dw' stands for "Data Words". These lines put the address of the NMI_Routine
; label at $fffa, the address of Reset_Routine at $fffc, and the address
; of IRQ_Routine at $fffe.
.dw  NMI_Routine
.dw  Reset_Routine
.dw  IRQ_Routine</code>
</pre>
<p>
There are some things to point out here. First, note the entry points into the program.
Whereas in most programming environments, there is a <code>main</code> function,
in NES games, the data at <code>$fffa</code> - <code>$ffff</code> has hardcoded special
meaning. When a game starts, the NES reads the 2 bytes at <code>$fffc</code> and
<code>$fffd</code> and uses that as the memory address of the first instruction to run.
</p>
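<p>
Like everything on the 6502, that 2-byte vector is stored little-endian: low byte
first. Reading it out of a 32KB PRG image can be sketched like this (the function
name and the example vector value are my own, purely for illustration):
</p>

```go
package main

import "fmt"

// resetVector extracts the 16-bit reset vector from a 32KB PRG
// image that is mapped at $8000. The bytes at $fffc/$fffd sit at
// offsets $7ffc/$7ffd in the image, stored little-endian.
func resetVector(prg []byte) uint16 {
	return uint16(prg[0x7ffc]) | uint16(prg[0x7ffd])<<8
}

func main() {
	prg := make([]byte, 0x8000)
	prg[0x7ffc] = 0x0f // low byte
	prg[0x7ffd] = 0x80 // high byte: hypothetical entry point $800f
	fmt.Printf("$%04x\n", resetVector(prg))
}
```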
<p>
  Next let's talk about the CPU. In the code snippet above I used register A and register X.
  There are 6 registers total:
</p>
<table>
  <tr>
    <th>
      Name
    </th>
    <th>
      Bits
    </th>
    <th>
      Description
    </th>
  </tr>
  <tr>
    <td>
      A
    </td>
    <td>
      <nobr>8-bit</nobr>
    </td>
    <td>
      The "main" register. Most of the arithmetic operations are performed on this register.
    </td>
  </tr>
  <tr>
    <td>
      X
    </td>
    <td>
      8-bit
    </td>
    <td>
      Another general purpose register.
      Fewer instructions support X than A.
      Often used as an index, as in the above code.
    </td>
  </tr>
  <tr>
    <td>
      Y
    </td>
    <td>
      8-bit
    </td>
    <td>
      Pretty much the same as X.
      Fewer instructions support Y than X.
      Also often used as an index.
    </td>
  </tr>
  <tr>
    <td>
      P
    </td>
    <td>
      8-bit
    </td>
    <td>
      The "status" register.
      When the assembly program mentions the "zero flag" it is
      actually referring to bit 1 (the second smallest bit) of the status
      register. The other bits mean other things, which we will get into later.
    </td>
  </tr>
  <tr>
    <td>
      SP
    </td>
    <td>
      8-bit
    </td>
    <td>
      The "stack pointer". The stack is located at <code>$100</code> - <code>$1ff</code>.
      The stack pointer is initialized to <code>$1ff</code>.
      When you push a byte on the stack, the stack pointer
      is decremented, and when you pull a byte from the stack,
      the stack pointer is incremented.
    </td>
  </tr>
  <tr>
    <td>
      PC
    </td>
    <td>
      <nobr>16-bit</nobr>
    </td>
    <td>
      The program counter. You can't directly modify this register
      but you can indirectly modify it using the stack.
    </td>
  </tr>
</table>
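<p>
The stack pointer's behavior is worth pinning down, since we'll need it later.
Here's a minimal Go model of just the push/pull mechanics - my own illustration,
not the project's CPU representation:
</p>

```go
package main

import "fmt"

// A minimal model of the 6502 stack: SP is an 8-bit offset into
// the fixed page $100 - $1ff. Pushing stores at $100+SP and then
// decrements SP; pulling increments SP and then reads $100+SP.
type cpu struct {
	ram [0x800]byte
	sp  byte
}

func (c *cpu) push(v byte) {
	c.ram[0x100+uint16(c.sp)] = v
	c.sp-- // wraps from $00 to $ff, like the real hardware
}

func (c *cpu) pull() byte {
	c.sp++
	return c.ram[0x100+uint16(c.sp)]
}

func main() {
	c := &cpu{sp: 0xff} // stack starts out pointing at $1ff
	c.push(0x42)
	fmt.Printf("pulled $%02x, SP=$%02x\n", c.pull(), c.sp)
}
```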
<p>
After coming up with this well-defined task -
get "Hello World" running natively - it is time to write some code.
First things first - this project should be able to <em>assemble</em> our
source code into binary machine code format.
</p>
<h2 id="assembly">Assembly</h2>
<p>
One tried and true method to parsing source code is to use a
<a href="https://en.wikipedia.org/wiki/Lexical_analysis">lexer</a>
to <em>tokenize</em> the source, and then a
<a href="https://en.wikipedia.org/wiki/Parser#Programming_languages">parser</a>
to turn the tokens into an
<a href="https://en.wikipedia.org/wiki/Abstract_syntax_tree">Abstract Syntax Tree</a>.
From there we can process the AST and turn it into the 32KB binary payload.
</p>
<p>
In Go land, this is straightforward thanks to <a href="https://github.com/blynn/nex">nex</a>,
a lexer generator, and <a href="https://golang.org/cmd/yacc/">go tool yacc</a>, a parser
generator that is bundled with Go.
Here is the parser code that can build an AST for our "Hello, World" example:
</p>
<pre>
<code class="language-go">%{
package jamulator

import (
	"fmt"
	"strconv"
	"container/list"
)

// Define the structures that make up the abstract syntax tree.

// Example:
//     Label1234:
type LabelStatement struct {
	LabelName string
	Line int
}

// Example:
//     Label1234: BRK
type LabeledStatement struct {
	Label *LabelStatement
	Stmt interface{}
}

// Example:
//     .org $8000
type OrgPseudoOp struct {
	Value int
	Fill byte
	Line int
}

type InstructionType int
const (
	// Example:
	//     ADC #$44
	ImmediateInstruction InstructionType = iota

	// Example:
	//     INX
	ImpliedInstruction

	// Example:
	//     ADC Label1234, X
	DirectWithLabelIndexedInstruction

	// Example:
	//     ADC $44, X
	DirectIndexedInstruction

	// Example:
	//     ADC Label1234
	DirectWithLabelInstruction

	// Example:
	//     ADC $44
	DirectInstruction

	// Example:
	//     ADC ($44, X)
	IndirectXInstruction

	// Example:
	//     ADC ($44), Y
	IndirectYInstruction

	// Example:
	//     JMP ($81cc)
	IndirectInstruction
)

type Instruction struct {
	Type InstructionType
	OpName string
	Line int

	// not all fields are used by all instruction types.
	Value int
	LabelName string
	RegisterName string

	// filled in later
	OpCode byte
	Offset int
	Payload []byte
}

type DataStmtType int
const (
	// Example:
	//     .db $44
	ByteDataStmt DataStmtType = iota

	// Example:
	//     .dw Label1234
	WordDataStmt
)

type DataStatement struct {
	Type DataStmtType
	dataList *list.List
	Line int

	// filled in later
	Offset int
	Payload []byte
}


type IntegerDataItem int
type StringDataItem string
type LabelCall struct {
	LabelName string
}
type ProgramAst struct {
	List *list.List
}

// This is the root node of the abstract syntax tree.
var programAst ProgramAst
%}

// This instructs yacc how to generate a structure that will fit
// any data type that we need to use for a node.
%union {
	integer int
	str string
	list *list.List
	orgPsuedoOp *OrgPseudoOp
	node interface{}
}

// These are the nodes that we build based on other nodes and tokens.
%type &lt;list&gt; statementList
%type &lt;node&gt; statement
%type &lt;node&gt; instructionStatement
%type &lt;node&gt; dataStatement
%type &lt;list&gt; dataList
%type &lt;list&gt; wordList
%type &lt;node&gt; dataItem
%type &lt;orgPsuedoOp&gt; orgPsuedoOp
%type &lt;node&gt; numberExpr
%type &lt;node&gt; numberExprOptionalPound

// These are tokens that we will encounter when reading the output from
// the lexer.
%token &lt;str&gt; tokIdentifier
%token &lt;integer&gt; tokInteger
%token &lt;str&gt; tokQuotedString
%token tokEqual
%token tokPound
%token tokDot
%token tokComma
%token tokNewline
%token tokDataByte
%token tokDataWord
%token tokProcessor
%token tokLParen
%token tokRParen
%token tokColon
%token tokOrg

%%

// An assembly program is defined as a statementList.

programAst : statementList {
	programAst = ProgramAst{$1}
}

// A statementList is defined as another statementList, plus a single
// statement, or simply as a statement. This creates a linked list
// of statements.
statementList : statementList tokNewline statement {
	if $3 == nil {
		$$ = $1
	} else {
		$$ = $1
		$$.PushBack($3)
	}
} | statement {
	if $1 == nil {
		$$ = list.New()
	} else {
		$$ = list.New()
		$$.PushBack($1)
	}
}

// This defines a statement, which can be many things.

statement : tokDot tokIdentifier instructionStatement {
	$$ = &amp;LabeledStatement{
		&amp;LabelStatement{"." + $2, parseLineNumber},
		$3,
	}
} | tokIdentifier tokColon instructionStatement {
	$$ = &amp;LabeledStatement{
		&amp;LabelStatement{$1, parseLineNumber},
		$3,
	}
} | orgPsuedoOp {
	$$ = $1
} | instructionStatement {
	$$ = $1
} | tokDot tokIdentifier dataStatement {
	$$ = &amp;LabeledStatement{
		&amp;LabelStatement{"." + $2, parseLineNumber},
		 $3,
	 }
} | tokIdentifier tokColon dataStatement {
	$$ = &amp;LabeledStatement{
		&amp;LabelStatement{$1, parseLineNumber},
		$3,
	}
} | dataStatement {
	$$ = $1
} | tokIdentifier tokColon {
	$$ = &amp;LabelStatement{$1, parseLineNumber}
} | {
	// empty statement
	$$ = nil
}

dataStatement : tokDataByte dataList {
	$$ = &amp;DataStatement{
		Type: ByteDataStmt,
		dataList: $2,
		Line: parseLineNumber,
	}
} | tokDataWord wordList {
	$$ = &amp;DataStatement{
		Type: WordDataStmt,
		dataList: $2,
		Line: parseLineNumber,
	}
}

wordList : wordList tokComma numberExprOptionalPound {
	$$ = $1
	$$.PushBack($3)
} | numberExprOptionalPound {
	$$ = list.New()
	$$.PushBack($1)
}

dataList : dataList tokComma dataItem {
	$$ = $1
	$$.PushBack($3)
} | dataItem {
	$$ = list.New()
	$$.PushBack($1)
}

numberExpr : tokPound tokInteger {
	tmp := IntegerDataItem($2)
	$$ = &amp;tmp
} | tokIdentifier {
	$$ = &amp;LabelCall{$1}
}

numberExprOptionalPound : numberExpr {
	$$ = $1
} | tokInteger {
	tmp := IntegerDataItem($1)
	$$ = &amp;tmp
}

dataItem : tokQuotedString {
	tmp := StringDataItem($1)
	$$ = &amp;tmp
} | numberExprOptionalPound {
	$$ = $1
}

orgPsuedoOp : tokOrg tokInteger {
	$$ = &amp;OrgPseudoOp{$2, 0xff, parseLineNumber}
} | tokOrg tokInteger tokComma tokInteger {
	if $4 &gt; 0xff {
		yylex.Error("ORG directive fill parameter must be a single byte.")
	}
	$$ = &amp;OrgPseudoOp{$2, byte($4), parseLineNumber}
}

instructionStatement : tokIdentifier tokPound tokInteger {
	$$ = &amp;Instruction{
		Type: ImmediateInstruction,
		OpName: $1,
		Value: $3,
		Line: parseLineNumber,
	}
} | tokIdentifier {
	$$ = &amp;Instruction{
		Type: ImpliedInstruction,
		OpName: $1,
		Line: parseLineNumber,
	}
} | tokIdentifier tokIdentifier tokComma tokIdentifier {
	$$ = &amp;Instruction{
		Type: DirectWithLabelIndexedInstruction,
		OpName: $1,
		LabelName: $2,
		RegisterName: $4,
		Line: parseLineNumber,
	}
} | tokIdentifier tokInteger tokComma tokIdentifier {
	$$ = &amp;Instruction{
		Type: DirectIndexedInstruction,
		OpName: $1,
		Value: $2,
		RegisterName: $4,
		Line: parseLineNumber,
	}
} | tokIdentifier tokIdentifier {
	$$ = &amp;Instruction{
		Type: DirectWithLabelInstruction,
		OpName: $1,
		LabelName: $2,
		Line: parseLineNumber,
	}
} | tokIdentifier tokInteger {
	$$ = &amp;Instruction{
		Type: DirectInstruction,
		OpName: $1,
		Value: $2,
		Line: parseLineNumber,
	}
} | tokIdentifier tokLParen tokInteger tokComma tokIdentifier tokRParen {
	if $5 != "x" &amp;&amp; $5 != "X" {
		yylex.Error("Register argument must be X.")
	}
	$$ = &amp;Instruction{
		Type: IndirectXInstruction,
		OpName: $1,
		Value: $3,
		Line: parseLineNumber,
	}
} | tokIdentifier tokLParen tokInteger tokRParen tokComma tokIdentifier {
	if $6 != "y" &amp;&amp; $6 != "Y" {
		yylex.Error("Register argument must be Y.")
	}
	$$ = &amp;Instruction{
		Type: IndirectYInstruction,
		OpName: $1,
		Value: $3,
		Line: parseLineNumber,
	}
} | tokIdentifier tokLParen tokInteger tokRParen {
	$$ = &amp;Instruction{
		Type: IndirectInstruction,
		OpName: $1,
		Value: $3,
		Line: parseLineNumber,
	}
}

%%</code>
</pre>
<p>
And here is the lexer code that generates the tokens listed above:
</p>
<pre>
<code class="language-go">/\.[dD][bB]/ {
	return tokDataByte
}
/\.[dD][wW]/ {
	return tokDataWord
}
/\.[oO][rR][gG]/ {
	return tokOrg
}
/"[^"\n]*"/ {
	t := yylex.Text()
	lval.str = t[1:len(t)-1]
	return tokQuotedString
}
/[a-zA-Z][a-zA-Z_.0-9]*/ {
	lval.str = yylex.Text()
	return tokIdentifier
}
/%[01]+/ {
	binPart := yylex.Text()[1:]
	n, err := strconv.ParseUint(binPart, 2, 16)
	if err != nil {
		yylex.Error("Invalid binary integer: " + binPart)
	}
	lval.integer = int(n)
	return tokInteger
}
/\$[0-9a-fA-F]+/ {
	hexPart := yylex.Text()[1:]
	n, err := strconv.ParseUint(hexPart, 16, 16)
	if err != nil {
		yylex.Error("Invalid hexadecimal integer: " + hexPart)
	}
	lval.integer = int(n)
	return tokInteger
}
/[0-9]+/ {
	n, err := strconv.ParseUint(yylex.Text(), 10, 16)
	if err != nil {
		yylex.Error("Invalid decimal integer: " + yylex.Text())
	}
	lval.integer = int(n)
	return tokInteger
}
/=/ {
	return tokEqual
}
/:/ {
	return tokColon
}
/#/ {
	return tokPound
}
/\./ {
	return tokDot
}
/,/ {
	return tokComma
}
/\(/ {
	return tokLParen
}
/\)/ {
	return tokRParen
}
/[ \t\r]/ {
	// ignore whitespace
}
/;[^\n]*\n/ {
	// ignore comments
	parseLineNumber += 1
	return tokNewline
}
/\n+/ {
	parseLineNumber += len(yylex.Text())
	return tokNewline
}
/./ {
	yylex.Error(fmt.Sprintf("Unexpected character: %q", yylex.Text()))
}

//

package jamulator

import (
	"fmt"
	"io"
	"os"
	"strconv"
	"strings"
)

var parseLineNumber int
var parseFilename string
var parseErrors ParseErrors

type ParseErrors []string

func (errs ParseErrors) Error() string {
	return strings.Join(errs, "\n")
}

func Parse(reader io.Reader) (ProgramAst, error) {
	parseLineNumber = 1

	lexer := NewLexer(reader)
	yyParse(lexer)
	if len(parseErrors) &gt; 0 {
		return ProgramAst{}, parseErrors
	}
	return programAst, nil
}

func ParseFile(filename string) (ProgramAst, error) {
	parseFilename = filename

	fd, err := os.Open(filename)
	if err != nil { return ProgramAst{}, err }
	programAst, err := Parse(fd)
	err2 := fd.Close()
	if err != nil { return ProgramAst{}, err }
	if err2 != nil { return ProgramAst{}, err2 }
	return programAst, nil
}

func (yylex Lexer) Error(e string) {
	s := fmt.Sprintf("%s line %d %s", parseFilename, parseLineNumber, e)
	parseErrors = append(parseErrors, s)
}</code></pre>
<p>
Let's demonstrate our ability to parse the source code by printing
and inspecting the abstract syntax tree.
</p>
<pre>
<code class="language-go">import (
	"fmt"
	"reflect"
	"os"
)

func astPrint(indent int, n interface{}) {
	for i := 0; i &lt; indent; i++ {
		fmt.Print(" ")
	}
	fmt.Println(reflect.TypeOf(n))
}

func (ast ProgramAst) Print() {
	for e := ast.List.Front(); e != nil; e = e.Next() {
		astPrint(0, e.Value)
		switch t := e.Value.(type) {
		case *LabeledStatement:
			astPrint(2, t.Label)
			astPrint(2, t.Stmt)
		case *DataStatement:
			for de := t.dataList.Front(); de != nil; de = de.Next() {
				astPrint(2, de.Value)
			}
		}
	}
}
func main() {
	programAst, err := jamulator.ParseFile("hello.asm")
	if err != nil {
		fmt.Fprintf(os.Stderr, "%s\n", err.Error())
		os.Exit(1)
	}
	programAst.Print()
}</code></pre>
<p>
Now we need to compile this code. At this point, we can no longer rely on
<code>go build</code> alone, because we now have 2 generated Go files:
</p>
<table>
  <tr>
    <th>
      Source File
    </th>
    <th>
      Generated File
    </th>
  </tr>
  <tr>
    <td>
      asm6502.nex
    </td>
    <td>
      asm6502.nn.go
    </td>
  </tr>
  <tr>
    <td>
      asm6502.y
    </td>
    <td>
      y.go
    </td>
  </tr>
</table>
<p>
The most straightforward way to build the project at this point is to use a Makefile.
Simple enough:
</p>
<pre>
<code class="language-make">build: jamulator/y.go jamulator/asm6502.nn.go
	go build -o jamulate main.go

jamulator/y.go: jamulator/asm6502.y
	go tool yacc -o jamulator/y.go -v /dev/null jamulator/asm6502.y

jamulator/asm6502.nn.go: jamulator/asm6502.nex
	${GOPATH}/bin/nex -e jamulator/asm6502.nex

clean:
	rm -f jamulator/asm6502.nn.go
	rm -f jamulator/y.go
	rm -f jamulate

.PHONY: build clean</code></pre>
<p>
Now we can build the project with one command:
</p>
<pre>
$ make
go tool yacc -o jamulator/y.go -v /dev/null jamulator/asm6502.y
/home/andy/golang/bin/nex -e jamulator/asm6502.nex
go build -o jamulate main.go
</pre>
<p>
And, sure enough, when we run this code, we get a nice breakdown of the source:
</p>
<pre>
$ ./jamulate
*jamulator.OrgPseudoOp
*jamulator.LabelStatement
*jamulator.DataStatement
  *jamulator.StringDataItem
  *jamulator.IntegerDataItem
  *jamulator.IntegerDataItem
*jamulator.LabelStatement
*jamulator.Instruction
*jamulator.LabelStatement
*jamulator.Instruction
*jamulator.Instruction
*jamulator.Instruction
*jamulator.Instruction
*jamulator.Instruction
*jamulator.LabelStatement
*jamulator.Instruction
*jamulator.Instruction
*jamulator.LabeledStatement
  *jamulator.LabelStatement
  *jamulator.Instruction
*jamulator.LabeledStatement
  *jamulator.LabelStatement
  *jamulator.Instruction
*jamulator.OrgPseudoOp
*jamulator.DataStatement
  *jamulator.LabelCall
*jamulator.DataStatement
  *jamulator.LabelCall
*jamulator.DataStatement
  *jamulator.LabelCall
</pre>
<p>
Once we have this AST, building a binary is a cinch:
</p>
<ol>
  <li>Loop over the AST and compute the byte offset for each instruction.</li>
  <li>Use the computed byte offsets to resolve the instructions with labels.</li>
  <li>Final pass to write the payload to the disk.</li>
</ol>
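<p>
The three passes can be sketched like this, with drastically simplified
statement types of my own invention (the real assembler decodes op codes and
addressing modes to find each instruction's size):
</p>

```go
package main

import "fmt"

// A drastically simplified statement: each one knows its encoded
// size, and may reference a label that pass 2 resolves.
type stmt struct {
	label   string // non-empty if this statement defines a label
	size    int    // encoded size in bytes
	ref     string // non-empty if the operand references a label
	payload []byte
}

// assemble runs the passes: compute offsets, resolve labels, emit.
func assemble(org int, stmts []*stmt) []byte {
	// Pass 1: assign a byte offset to every statement.
	offsets := map[string]int{}
	addr := org
	for _, s := range stmts {
		if s.label != "" {
			offsets[s.label] = addr
		}
		addr += s.size
	}
	// Pass 2: resolve label references into little-endian addresses.
	for _, s := range stmts {
		if s.ref != "" {
			target := offsets[s.ref]
			s.payload = append(s.payload, byte(target), byte(target>>8))
		}
	}
	// Pass 3: concatenate payloads into the final binary.
	var out []byte
	for _, s := range stmts {
		out = append(out, s.payload...)
	}
	return out
}

func main() {
	prog := []*stmt{
		{label: "msg", size: 3, payload: []byte{'h', 'i', 0}},
		{size: 3, ref: "msg", payload: []byte{0xad}}, // e.g. LDA msg
	}
	fmt.Printf("% x\n", assemble(0x8000, prog))
}
```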
<h2 id="disassembly">Disassembly</h2>
<p>
Now we need to work backwards - we have the binary payload and we want to get
the source.
There is a challenge here. Remember that in assembly programs, we can insert
arbitrary data with <code>.db</code> or <code>.dw</code> statements alongside instructions.
In order to disassemble effectively, we have to be able to figure out what
is "data" and what is "code".
</p>
<p>
One possible technique is to emulate the assembly program, and then record
the ways in which memory addresses are accessed. After playing a game for a while,
you would have a pretty good record of exactly which sections are data and which are code.
I decided not to use this technique, however, since the goal of this project is <em>static</em>
recompilation. I want to explore just how much is possible to do at compile-time.
</p>
<p>
So what can we do?
</p>
<p>
First, recall that the last 6 bytes in NES programs are 3 memory addresses which are the
3 entry points into the program:
</p>
<pre>
<code class="language-6502">.org $fffa
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
<p>
Given this, a workable strategy becomes clear:
</p>
<ol>
  <li>Create an AST where every single byte is a single <code>.db</code> statement.</li>
  <li>
  Replace the <code>.db</code> statements at <code>$fffa</code> and <code>$fffb</code>
  with a <code>.dw</code> statement which references an <code>NMI_Routine</code> label.
  </li>
  <li>
  Calculate the address that the <code>.db</code> statements at <code>$fffa</code> and <code>$fffb</code>
  were referring to, and insert a <code>LabelStatement</code> with the <code>NMI_Routine</code>
  label before the <code>.db</code> statement at that address.
  </li>
  <li>
  Mark the <code>.db</code> statement at that address as an instruction.
  </li>
</ol>
<p>
When I say "mark as an instruction", I mean that we should interpret the
<code>.db</code> statement at that location as an opcode, then fold the
following <code>.db</code> statements into the instruction as operand bytes,
as necessary. Then, based on the instruction, we recursively
mark other locations as instructions:
</p>
<table>
  <tr>
    <th>
      Instruction
    </th>
    <th>
      What to Do
    </th>
  </tr>
  <tr>
    <td>
      BPL, BMI, BVC, BVS, BCC, BCS, BNE, BEQ, JSR
    </td>
    <td>
      Mark the jump target address <em>and</em> the next address as an instruction.
    </td>
  </tr>
  <tr>
    <td>
      JMP absolute
    </td>
    <td>
      Mark the jump target address as an instruction.
    </td>
  </tr>
  <tr>
    <td>
      RTI, RTS, BRK, JMP indirect
    </td>
    <td>
      Do nothing.
    </td>
  </tr>
  <tr>
    <td>
      everything else
    </td>
    <td>
      Mark the next address as an instruction.
    </td>
  </tr>
</table>
<p>
The instructions that start with "B" are all <em>conditional branch</em> instructions.
This means that they test some condition, and then either transfer control flow to
the next instruction, or to a different label. This means that we can mark the
possible branch address <em>and</em> the next address as instructions.
</p>
<p>
<code>JSR</code> stands for "Jump to SubRoutine". This will transfer control to a
target address and then later when the <code>RTS</code> ("ReTurn from Subroutine")
instruction is reached, continue execution where the <code>JSR</code> instruction left off.
</p>
<p>
It is <em>possible</em> for assembly programmers to use <code>JSR</code> and then
inside the subroutine, do tricks with the stack to return to a different location.
This is a problem that will be tackled later. It's not an issue with our "Hello World" example.
</p>
<p>
<code>RTI</code>, <code>RTS</code>, and <code>BRK</code> modify control flow, but the destination
address is not constant, so these instructions do not help us know what else to mark as instructions.
</p>
<p>
As seen in the earlier table, there are two flavors of <code>JMP</code> - absolute and indirect:
</p>
<table>
  <tr>
    <th>
      JMP Type
    </th>
    <th>
      Example Assembly
    </th>
    <th>
      Description
    </th>
  </tr>
  <tr>
    <td>
      Absolute
    </td>
    <td>
      <code class="language-6502">JMP Label_80a2</code>
    </td>
    <td>
      This version is used in the "Hello World" program. It sends control
      flow to the operand address - which in this example is a label. This
      version is convenient for disassembly because the destination address
      is known statically - the address is <em>hard-coded</em>.
    </td>
  </tr>
  <tr>
    <td>
      Indirect
    </td>
    <td>
      <code class="language-6502">JMP ($0123)</code>
    </td>
    <td>
      Uses the operand address as a pointer, sending control flow
      to the address at the pointer. This will prove to be one of the big challenges
      of this project. More on this later.
    </td>
  </tr>
</table>
<p>
The instructions in the earlier table are the only ones that modify control flow;
all other instructions execute serially. Thus, whenever we encounter one of these
serially-executing instructions, we can reliably decode the next byte as an instruction.
</p>
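<p>
This marking algorithm can be expressed with an explicit worklist. The
sketch below is illustrative: <code>isBranch</code>, <code>instrSize</code>,
and <code>target</code> are simplified stand-ins for jamulator's full opcode
tables, covering only the cases needed here:
</p>

```go
package main

import (
	"fmt"
	"sort"
)

// Opcodes for the control-flow instructions in the table above.
const (
	opBRK    = 0x00
	opJSR    = 0x20
	opRTI    = 0x40
	opJMPAbs = 0x4c
	opRTS    = 0x60
	opJMPInd = 0x6c
)

// isBranch reports whether op is one of the eight conditional branches
// (BPL, BMI, BVC, BVS, BCC, BCS, BNE, BEQ); they all match xxx1 0000.
func isBranch(op byte) bool {
	return op&0x1f == 0x10
}

// instrSize stands in for a full opcode-length table; it covers only
// the addressing modes this sketch needs.
func instrSize(op byte) int {
	switch {
	case isBranch(op):
		return 2
	case op == opJMPAbs || op == opJMPInd || op == opJSR:
		return 3
	case op == opRTS || op == opRTI || op == opBRK:
		return 1
	default:
		return 2 // e.g. immediate-mode loads
	}
}

// target decodes the destination address of a branch or absolute jump.
func target(payload []byte, org, addr int) int {
	op := payload[addr-org]
	if isBranch(op) {
		// branches use a signed 8-bit offset from the next instruction
		return addr + 2 + int(int8(payload[addr-org+1]))
	}
	// JMP/JSR absolute: little-endian 16-bit operand
	return int(payload[addr-org+1]) | int(payload[addr-org+2])<<8
}

// markInstructions applies the table above, starting from the entry
// points and returning the set of addresses that must be decoded as
// instructions.
func markInstructions(payload []byte, org int, entries []int) map[int]bool {
	isCode := map[int]bool{}
	work := append([]int{}, entries...)
	for len(work) > 0 {
		addr := work[len(work)-1]
		work = work[:len(work)-1]
		if addr < org || addr >= org+len(payload) || isCode[addr] {
			continue
		}
		isCode[addr] = true
		op := payload[addr-org]
		switch {
		case isBranch(op) || op == opJSR:
			// mark the jump target and the next address
			work = append(work, target(payload, org, addr), addr+instrSize(op))
		case op == opJMPAbs:
			// mark only the jump target
			work = append(work, target(payload, org, addr))
		case op == opRTI || op == opRTS || op == opBRK || op == opJMPInd:
			// destination unknown or control flow ends: do nothing
		default:
			// everything else falls through to the next address
			work = append(work, addr+instrSize(op))
		}
	}
	return isCode
}

func main() {
	// LDX #$00; BEQ +3; JMP $8000; RTS
	payload := []byte{0xa2, 0x00, 0xf0, 0x03, 0x4c, 0x00, 0x80, 0x60}
	marked := markInstructions(payload, 0x8000, []int{0x8000})
	addrs := make([]int, 0, len(marked))
	for a := range marked {
		addrs = append(addrs, a)
	}
	sort.Ints(addrs)
	for _, a := range addrs {
		fmt.Printf("$%04x\n", a)
	}
}
```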
<p>
Here's what it looks like when we apply this algorithm to our Hello World binary code:
</p>
<pre>
<code class="language-6502">.org $8000
Label_8000:
    .db $48
    .db $65
    .db $6c
    .db $6c
    .db $6f
    .db $2c
    .db $20
    .db $57
    .db $6f
    .db $72
    .db $6c
    .db $64
    .db $21
    .db $0a
    .db $00
Reset_Routine:
    LDX #$00
Label_8011:
    LDA Label_8000, X
    BEQ Label_801d
    STA $2008
    INX
    JMP Label_8011
Label_801d:
    LDA #$00
    STA $2009
IRQ_Routine:
    RTI
NMI_Routine:
    RTI
    .db $ff

    ...
    (this is repeated about 30,000 times)
    ...

    .db $ff
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
<p>
This technique was able to decode all of the instructions, but we can't
read the text, and it's pretty annoying having <code>$ff</code> -
the filler byte value - repeated
so many times. Let's add a pass to detect ASCII strings. We can do this
by counting how many characters in a <code>.db</code> statement in a row
are considered "ASCII" and when a threshold of 4 is reached, replace
the <code>.db</code> statements with a quoted string:
</p>
<pre>
<code class="language-6502">.org $8000
Label_8000:
    .db "Hello, World!"
    .db $0a
    .db $00
Reset_Routine:
    LDX #$00
Label_8011:
    LDA Label_8000, X
    BEQ Label_801d
    STA $2008
    INX
    JMP Label_8011
Label_801d:
    LDA #$00
    STA $2009
IRQ_Routine:
    RTI
NMI_Routine:
    RTI
    .db $ff
    .db $ff

    ...
    (repeated about 30,000 times)
    ...

    .db $ff
    .db $ff
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
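<p>
The ASCII-detection pass described above can be sketched as follows;
<code>findASCIIRuns</code> is an illustrative helper, not jamulator's
actual code:
</p>

```go
package main

import "fmt"

// isPrintableASCII reports whether b falls in the printable ASCII range.
func isPrintableASCII(b byte) bool {
	return b >= 0x20 && b <= 0x7e
}

// findASCIIRuns returns [start, end) index pairs of runs of printable
// ASCII bytes at least threshold bytes long; each such run of .db
// statements can be replaced with a single quoted string.
func findASCIIRuns(data []byte, threshold int) [][2]int {
	var runs [][2]int
	start := -1
	for i := 0; i <= len(data); i++ {
		if i < len(data) && isPrintableASCII(data[i]) {
			if start < 0 {
				start = i
			}
			continue
		}
		// the run (if any) ended just before index i
		if start >= 0 && i-start >= threshold {
			runs = append(runs, [2]int{start, i})
		}
		start = -1
	}
	return runs
}

func main() {
	payload := []byte("Hello, World!\n\x00")
	for _, r := range findASCIIRuns(payload, 4) {
		fmt.Printf(".db %q\n", string(payload[r[0]:r[1]]))
	}
}
```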
<p>
Better! Let's solve the problem of the repeating <code>$ff</code>
by adding a pass to detect where it would make sense to place
<code>.org</code> statements. We can do this in much the same way
as ASCII detection, but in this case we will look out for repeating
bytes rather than bytes which fit in the ASCII range. 64 seems
like a good threshold. If a byte repeats 64 times, replace all the
repeated occurrences with a <code>.org</code> statement:
</p>
<pre>
<code class="language-6502">.org $8000
Label_8000:
    .db "Hello, World!"
    .db $0a
    .db $00
Reset_Routine:
    LDX #$00
Label_8011:
    LDA Label_8000, X
    BEQ Label_801d
    STA $2008
    INX
    JMP Label_8011
Label_801d:
    LDA #$00
    STA $2009
IRQ_Routine:
    RTI
NMI_Routine:
    RTI
.org $fffa
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
<p>
And finally, one minor detail. Let's do a final pass to collapse <code>.db</code>
statements together:
</p>
<pre>
<code class="language-6502">.org $8000
Label_8000:
    .db "Hello, World!", $0a, $00
Reset_Routine:
    LDX #$00
Label_8011:
    LDA Label_8000, X
    BEQ Label_801d
    STA $2008
    INX
    JMP Label_8011
Label_801d:
    LDA #$00
    STA $2009
IRQ_Routine:
    RTI
NMI_Routine:
    RTI
.org $fffa
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
<p>
Not bad. This is as close as we can get to the original source.
It's impossible to know what label names were used, but we can
give good names to the interrupt vectors. We just turned a
binary machine code program into human-readable assembly.
</p>
<p>
Now that we can figure out the assembly source code from 6502 machine code,
we can start the fun part - converting the assembly program into native
machine code.
</p>
<h2 id="code-generation">Code Generation</h2>
<p>
Our code generator will produce an LLVM <em>bitcode</em> module.
We can then use <code>llc</code> to compile the bitcode into an object file,
just as if we had run <code>gcc -c module.c</code> and looked at the resulting
module.o.
</p>
<p>
LLVM is written in C++, but it also exposes a C interface.
This means we can integrate cleanly using <a href="https://golang.org/cmd/cgo/">cgo</a>.
In fact, <a href="http://awilkins.id.au/">Andrew Wilkins</a> maintains a convenient Go module called
<a href="https://github.com/axw/gollvm">gollvm</a> which gives us seamless integration.
</p>
<p>
At any time we can debug the LLVM module we are generating by calling <code>module.Dump()</code>
which prints the <a href="https://llvm.org/docs/LangRef.html">LLVM IR code</a>
for the module to stderr. Let's start by manually creating the
IR code that we want to generate for Hello World, so we know what to work toward.
We can get a head start by writing it in C and using <code>clang</code> to generate the IR
code for us:
</p>
<pre>
<code class="language-c">#include &lt;stdio.h&gt;

char * msg = "Hello, World!\n";

int main() {
    char * ptr = msg;
    while (*ptr) {
        putchar(*ptr);
        ++ptr;
    }
}</code>
</pre>
<pre>
$ clang -emit-llvm -S test.c
$
</pre>
<p>Now looking at test.s:</p>
<pre>
<code class="language-llvm">; ModuleID = 'test.c'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@.str = private unnamed_addr constant [15 x i8] c"Hello, World!\0A\00", align 1
@msg = global i8* getelementptr inbounds ([15 x i8]* @.str, i32 0, i32 0), align 8

define i32 @main() nounwind uwtable {
  %1 = alloca i32, align 4
  %ptr = alloca i8*, align 8
  store i32 0, i32* %1
  %2 = load i8** @msg, align 8
  store i8* %2, i8** %ptr, align 8
  br label %3

; &lt;label&gt;:3                                       ; preds = %7, %0
  %4 = load i8** %ptr, align 8
  %5 = load i8* %4, align 1
  %6 = icmp ne i8 %5, 0
  br i1 %6, label %7, label %14

; &lt;label&gt;:7                                       ; preds = %3
  %8 = load i8** %ptr, align 8
  %9 = load i8* %8, align 1
  %10 = sext i8 %9 to i32
  %11 = call i32 @putchar(i32 %10)
  %12 = load i8** %ptr, align 8
  %13 = getelementptr inbounds i8* %12, i32 1
  store i8* %13, i8** %ptr, align 8
  br label %3

; &lt;label&gt;:14                                      ; preds = %3
  %15 = load i32* %1
  ret i32 %15
}

declare i32 @putchar(i32)</code>
</pre>
<p>
Alright - that looks a bit different than the code we're going to generate, but
it's a good start. If that looks complicated, don't worry - we're going
to do 3 things to make it less so:
</p>
<ol>
  <li>Read up on the <a href="https://llvm.org/docs/LangRef.html">language reference</a>.</li>
  <li>Delete the stuff that seems unnecessary and see if it still works.</li>
  <li>Modify the code to look like what we want to generate.</li>
</ol>
<p>
Here's an updated version with inline comments breaking it down:
</p>
<pre>
<code class="language-llvm">; Here we declare the text that we will print.

; "private" means that only this module can see it - we do not export this
; symbol. Always declare private when possible. There are optimizations to be
; had when a symbol is not exported.

; "constant" means that this data is read-only. Again use constant when
; possible so that optimization passes can take advantage of this fact.

; [15 x i8] is the type of this data. i8 means an 8-bit integer.
@msg = private constant [15 x i8] c"Hello, World!\0a\00"


; Here we declare a dependency on the `putchar` symbol.

; When this module is linked, `putchar` must be defined somewhere, and with
; this signature.
declare i32 @putchar(i32)


; Same thing for `exit`.

; `noreturn` indicates that we do not expect this function to return. It will
; end the process, after all.

; `nounwind` has to do with LLVM's error handling model. We use `nounwind`
; because we know that `exit` will not throw an exception.
declare void @exit(i32) noreturn nounwind


; Note that we will be performing the final link step with gcc, which will
; automatically statically link against libc. This will provide the `putchar`
; and `exit` symbols, as well as set up the executable entry point to call `main`.
define i32 @main() {

; This label statement indicates the start of a basic block.
Entry:

; Here we allocate some variables on the stack. These are X, Y, and A,
; 3 of the 6502's 8-bit registers.
      %X = alloca i8
      %Y = alloca i8
      %A = alloca i8

; Note that here we are allocating variables which are single bits.
; These represent 2 of the bits from the status register.
; After this source listing there is a table explaining each bit
; of the status register.
      %S_neg = alloca i1
      %S_zero = alloca i1

; Send control flow to the Reset_Routine basic block.
      br label %Reset_Routine

Reset_Routine:

; This is the code to generate for
; LDX #$00

  ; Store 0 in the X register.
      store i8 0, i8* %X

  ; Clear the negative status bit, because we just stored 0 in X,
  ; and 0 is not negative.
      store i1 0, i1* %S_neg

  ; Set the zero status bit, because we just stored 0 in X.
      store i1 1, i1* %S_zero


      br label %Label_loop

Label_loop:

; This is the code to generate for
; LDA msg, X

  ; Load the value of X into %0.
      %0 = load i8* %X

  ; Get a pointer to the character in msg indexed by %0, which contains the
  ; value of X.
      %1 = getelementptr [15 x i8]* @msg, i64 0, i8 %0

  ; Read a byte of memory located at the pointer we just computed into %2.
      %2 = load i8* %1

  ; Store the byte we just loaded into %A, which is the variable we have
  ; allocated for A.
      store i8 %2, i8* %A

  ; Now we need to set the negative status bit correctly.

  ; The byte of memory we just loaded into %A is negative if
  ; and only if the highest bit is set.

  ; Perform a bitwise AND with 1000 0000.
      %3 = and i8 128, %2

  ; Test if the result is equal to 1000 0000.
      %4 = icmp eq i8 128, %3

  ; Save the answer to the negative status bit.
      store i1 %4, i1* %S_neg

  ; Now we need to set the zero status bit correctly.

  ; Test if the byte is equal to zero.
      %5 = icmp eq i8 0, %2

  ; Store the answer to the zero status bit.
      store i1 %5, i1* %S_zero


; This is the code to generate for
; BEQ loopend
      %6 = load i1* %S_zero

  ; If zero bit is set, go to Label_loopend. Otherwise, go to AutoLabel_0
      br i1 %6, label %Label_loopend, label %AutoLabel_0

AutoLabel_0:

; This is the code to generate for
; STA $2008
      %7 = load i8* %A

  ; Convert the 8-bit integer that we just loaded from A into
  ; a 32-bit integer to match the signature of `putchar`.
      %8 = zext i8 %7 to i32

      %9 = call i32 @putchar(i32 %8)
      br label %AutoLabel_1
AutoLabel_1:
; This is the code to generate for
; INX
      %10 = load i8* %X
      %11 = add i8 %10, 1
      store i8 %11, i8* %X

  ; Set the negative status bit correctly.
      %12 = and i8 128, %11
      %13 = icmp eq i8 128, %12
      store i1 %13, i1* %S_neg

  ; Set the zero status bit correctly.
      %14 = icmp eq i8 0, %11
      store i1 %14, i1* %S_zero

; This is the code to generate for
; JMP loop
      br label %Label_loop

Label_loopend:
; This is the code to generate for
; LDA #$00

      store i8 0, i8* %A

      ; Clear the negative status bit.
      store i1 0, i1* %S_neg

      ; Set the zero status bit.
      store i1 1, i1* %S_zero

; This is the code to generate for
; STA $2009
      %15 = load i8* %A
      %16 = zext i8 %15 to i32
      call void @exit(i32 %16) noreturn nounwind

  ; Terminate this basic block with `unreachable` because
  ; exit never returns.
      unreachable

; Generate dummy basic blocks for the
; interrupt vectors, because we don't support them yet.
IRQ_Routine:
      unreachable

NMI_Routine:
      unreachable
}</code>
</pre>
<p>
In this code we use <code>S_neg</code> and <code>S_zero</code>, 2 of the status register bits.
These bits, along with the other status bits that we did not mention yet,
are updated after certain instructions and used for things such as branching.
Here is a full description of all the status bits:
</p>
<table>
  <tr>
    <th>
      Bit Mask
    </th>
    <th>
      Variable Name
    </th>
    <th>
      Description
    </th>
  </tr>
  <tr>
    <td>
      <code>0000 0001</code>
    </td>
    <td>
      S_carry
    </td>
    <td>
      Used for arithmetic and bitwise instructions, typically to make it
      easier to deal with integers that are larger than 8 bits.
      We don't need to deal with this yet.
      <code>BCC</code> and <code>BCS</code> use this status bit to decide
      whether to branch.
    </td>
  </tr>
  <tr>
    <td>
      <code>0000 0010</code>
    </td>
    <td>
      S_zero
    </td>
    <td>
      When a computation results in a value that is equal to zero, this
      status bit is set. Otherwise, it is cleared. <code>BEQ</code> and
      <code>BNE</code> use this status bit to decide whether to branch.
    </td>
  </tr>
  <tr>
    <td>
      <code>0000 0100</code>
    </td>
    <td>
      S_int
    </td>
    <td>
      This bit indicates whether interrupts are disabled. You can use <code>SEI</code>
      to set this bit, and <code>CLI</code> to clear this bit. More on interrupts later.
    </td>
  </tr>
  <tr>
    <td>
      <code>0000 1000</code>
    </td>
    <td>
      S_dec
    </td>
    <td>
      Normally, this bit would toggle <em>decimal mode</em> on and off.
      However, the NES disables this feature of the CPU, so it effectively
      does nothing. You can use <code>SED</code> to set this bit, and
      <code>CLD</code> to clear this bit.
    </td>
  </tr>
  <tr>
    <td>
      <code>0001 0000</code>
    </td>
    <td>
      S_brk
    </td>
    <td>
      This bit is set when a <code>BRK</code> instruction has been executed and an
      interrupt has been generated to process it.
      We'll ignore this bit for now.
    </td>
  </tr>
  <tr>
    <td>
      <code>0010 0000</code>
    </td>
    <td>
      -
    </td>
    <td>
      This bit is unused. It remains 0 at all times.
    </td>
  </tr>
  <tr>
    <td>
      <code>0100 0000</code>
    </td>
    <td>
      S_over
    </td>
    <td>
      When a computation results in an invalid
      <a href="https://en.wikipedia.org/wiki/Two's_complement">two's complement</a>
      value, this bit is set.
      Otherwise, it is cleared.
      <code>BVC</code> and <code>BVS</code> use this to decide whether to branch.
    </td>
  </tr>
  <tr>
    <td>
      <code>1000 0000</code>
    </td>
    <td>
      S_neg
    </td>
    <td>
      When a computation results in a negative two's complement value, this bit is set.
      Otherwise, it is cleared. <code>BPL</code> and <code>BMI</code> use this status
      bit to decide whether to branch.
    </td>
  </tr>
</table>
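<p>
For reference, here are the same status bits written as Go bit-mask
constants. The names follow the table's variable names; jamulator itself
models each bit as a separate one-bit stack variable rather than a packed
byte:
</p>

```go
package main

import "fmt"

// 6502 status register bit masks, matching the table above.
const (
	SCarry byte = 1 << 0 // 0000 0001
	SZero  byte = 1 << 1 // 0000 0010
	SInt   byte = 1 << 2 // 0000 0100
	SDec   byte = 1 << 3 // 0000 1000
	SBrk   byte = 1 << 4 // 0001 0000
	// bit 5 (0010 0000) is unused
	SOver byte = 1 << 6 // 0100 0000
	SNeg  byte = 1 << 7 // 1000 0000
)

func main() {
	// Example: after LDX #$00, the zero bit is set and negative is clear.
	var status byte = SZero
	fmt.Printf("zero=%t neg=%t\n", status&SZero != 0, status&SNeg != 0)
}
```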
<p>
Let's make sure our goal LLVM code does what we expect:
</p>
<pre>
$ llc -filetype=obj hello.llvm
$ gcc hello.llvm.o
$ ./a.out
Hello, World!
$
</pre>
<p>
<code>llc</code> converts the LLVM IR code into a native object file, and then
<code>gcc</code> does the final link step, statically linking against libc
to hook up <code>main</code>, <code>putchar</code>, and <code>exit</code>.
By default, <code>gcc</code> creates an executable named <code>a.out</code>,
which we run, and voilà!
</p>
<p>
The next step is to generate this code from our disassembly.
With the help of <a href="https://github.com/axw/gollvm">gollvm</a>
we will:
</p>
<ol>
  <li>Create a LLVM module.</li>
  <li>Declare our dependency on <code>putchar</code> and <code>exit</code>.</li>
  <li>
  Create the <code>main</code> function and allocate stack variables for the
  registers.
  </li>
  <li>
  Perform one pass over the AST to identify labeled data
  and create global variables. We save the index of label name
  to global variable in a map.
  </li>
  <li>
  Perform a second pass over the AST to generate the basic blocks
  and save an index of label name to basic block in a map.
  </li>
  <li>
  Final pass over the AST to generate code inside of the basic blocks.
  </li>
</ol>
<p>
Here is a structure that contains the state we need while
compiling:
</p>
<pre>
<code class="language-go">type Compilation struct {
	// Keep track of the warnings and errors
	// that occur while compiling.
	Warnings []string
	Errors   []string

	// program is our abstract syntax tree -
	// we will make several passes over it during
	// the compilation process.
	program   *Program

	// LLVM variable which represents the module we
	// are creating.
	mod       llvm.Module

	// This is the object that we will use to do all the
	// code generation. It's used to create every kind
	// of statement.
	builder   llvm.Builder

	// Reference to our main function.
	mainFn    llvm.Value

	// References to the functions we declared, so we
	// can call them.
	putCharFn llvm.Value
	exitFn    llvm.Value

	// References to the variables we allocate on the stack
	// so we can use them in instructions.
	rX        llvm.Value
	rY        llvm.Value
	rA        llvm.Value
	rSNeg     llvm.Value
	rSZero    llvm.Value

	// Maps label name to the global variables that we add,
	// for when the code loads data from a label.
	labeledData   map[string]llvm.Value

	// Maps label name to basic block, so that when
	// code branches to another label, we can branch to
	// the relevant basic block.
	labeledBlocks map[string]llvm.BasicBlock

	// Keeps track of the basic block we are currently
	// generating code for, if any.
	currentBlock *llvm.BasicBlock

	// Keeps a reference to the reset routine basic block
	// so we know where to first jump to from the entry point.
	resetBlock     *llvm.BasicBlock
}</code>
</pre>
<p>
  With this structure, we can execute our plan:
</p>
<pre>
<code class="language-go">func (p *Program) Compile(filename string) (c *Compilation) {
	llvm.InitializeNativeTarget()

	c = new(Compilation)
	c.program = p
	c.Warnings = []string{}
	c.Errors = []string{}
	c.mod = llvm.NewModule("asm_module")
	c.builder = llvm.NewBuilder()
	defer c.builder.Dispose()
	c.labeledData = map[string]llvm.Value{}
	c.labeledBlocks = map[string]llvm.BasicBlock{}

	// First pass to identify labeled data and create
	// global variables, saving the indexes in `c.labeledData`.
	c.identifyLabeledDataPass()

	// declare i32 @putchar(i32)
	i32Type := llvm.Int32Type()
	putCharType := llvm.FunctionType(i32Type, []llvm.Type{i32Type}, false)
	c.putCharFn = llvm.AddFunction(c.mod, "putchar", putCharType)
	c.putCharFn.SetLinkage(llvm.ExternalLinkage)

	// declare void @exit(i32) noreturn nounwind
	exitType := llvm.FunctionType(llvm.VoidType(), []llvm.Type{i32Type}, false)
	c.exitFn = llvm.AddFunction(c.mod, "exit", exitType)
	c.exitFn.AddFunctionAttr(llvm.NoReturnAttribute | llvm.NoUnwindAttribute)
	c.exitFn.SetLinkage(llvm.ExternalLinkage)

	// main function / entry point
	mainType := llvm.FunctionType(i32Type, []llvm.Type{}, false)
	c.mainFn = llvm.AddFunction(c.mod, "main", mainType)
	c.mainFn.SetFunctionCallConv(llvm.CCallConv)
	entry := llvm.AddBasicBlock(c.mainFn, "Entry")
	c.builder.SetInsertPointAtEnd(entry)
	c.rX = c.builder.CreateAlloca(llvm.Int8Type(), "X")
	c.rY = c.builder.CreateAlloca(llvm.Int8Type(), "Y")
	c.rA = c.builder.CreateAlloca(llvm.Int8Type(), "A")
	c.rSNeg = c.builder.CreateAlloca(llvm.Int1Type(), "S_neg")
	c.rSZero = c.builder.CreateAlloca(llvm.Int1Type(), "S_zero")

	// Second pass to build basic blocks.
	c.buildBasicBlocksPass()

	// Finally, one last pass for codegen.
	c.codeGenPass()

	// hook up the first entry block to the reset block
	c.builder.SetInsertPointAtEnd(entry)
	c.builder.CreateBr(*c.resetBlock)

	err := llvm.VerifyModule(c.mod, llvm.ReturnStatusAction)
	if err != nil {
		c.Errors = append(c.Errors, err.Error())
		return
	}

	// Uncomment this to print the LLVM IR code we just generated
	// to stderr.
	//c.mod.Dump()

	fd, err := os.Create(filename)
	if err != nil {
		c.Errors = append(c.Errors, err.Error())
		return
	}

	err = llvm.WriteBitcodeToFile(c.mod, fd)
	if err != nil {
		c.Errors = append(c.Errors, err.Error())
		return
	}

	err = fd.Close()
	if err != nil {
		c.Errors = append(c.Errors, err.Error())
		return
	}
}</code>
</pre>
<p>
Here's what the module looks like after only the first pass,
which identifies labeled data:
</p>
<pre>
<code class="language-llvm">; ModuleID = 'asm_module'

@Label_c000 = private global [15 x i8] c"Hello, World!\0A\00"</code>
</pre>
<p>
That's it. All we've done is create a global variable for <code>msg</code>
(which is known as <code>Label_c000</code> in our disassembly) and saved it in
the <code class="language-go">c.labeledData</code> map.
</p>
<p>
Now add in the declares and main function definition, and we get:
</p>
<pre>
<code class="language-llvm">; ModuleID = 'asm_module'

@Label_c000 = private global [15 x i8] c"Hello, World!\0A\00"

declare i32 @putchar(i32)

declare void @exit(i32) noreturn nounwind

define i32 @main() {
Entry:
  %X = alloca i8
  %Y = alloca i8
  %A = alloca i8
  %S_neg = alloca i1
  %S_zero = alloca i1
}</code>
</pre>
<p>
Nothing fancy here. Next let's add in the second pass to identify basic blocks:
</p>
<pre>
<code class="language-llvm">; ModuleID = 'asm_module'

@Label_c000 = private global [15 x i8] c"Hello, World!\0A\00"

declare i32 @putchar(i32)

declare void @exit(i32) noreturn nounwind

define i32 @main() {
Entry:
  %X = alloca i8
  %Y = alloca i8
  %A = alloca i8
  %S_neg = alloca i1
  %S_zero = alloca i1

Reset_Routine:                                    ; No predecessors!

Label_c011:                                       ; No predecessors!

Label_c01d:                                       ; No predecessors!

IRQ_Routine:                                      ; No predecessors!

NMI_Routine:                                      ; No predecessors!
}</code>
</pre>
<p>
At this point we've created basic blocks for each of our labels and saved them
in our <code class="language-go">c.labeledBlocks</code> map.
</p>
<p>
And finally, we do the actual code generation pass and add the jump
from the entry block to the reset block.
</p>
<p>
The actual code generation is surprisingly simple.
It consists of a switch statement and a few calls into the LLVM builder
object. For example, here's the code generation for LDX #$00:
</p>
<pre>
<code class="language-go">func (i *Instruction) Compile(c *Compilation) {
	v := llvm.ConstInt(llvm.Int8Type(), uint64(i.Value), false)
	switch i.OpCode {
	case 0xa2: // ldx
		c.builder.CreateStore(v, c.rX)
		c.testAndSetZero(i.Value)
		c.testAndSetNeg(i.Value)
	// more cases for other immediate instructions
	}
}

func (c *Compilation) testAndSetZero(v int) {
	if v == 0 {
		c.setZero()
		return
	}
	c.clearZero()
}

func (c *Compilation) setZero() {
	c.builder.CreateStore(llvm.ConstInt(llvm.Int1Type(), 1, false), c.rSZero)
}

func (c *Compilation) clearZero() {
	c.builder.CreateStore(llvm.ConstInt(llvm.Int1Type(), 0, false), c.rSZero)
}

func (c *Compilation) testAndSetNeg(v int) {
	if v&amp;0x80 == 0x80 {
		c.setNeg()
		return
	}
	c.clearNeg()
}

func (c *Compilation) setNeg() {
	c.builder.CreateStore(llvm.ConstInt(llvm.Int1Type(), 1, false), c.rSNeg)
}

func (c *Compilation) clearNeg() {
	c.builder.CreateStore(llvm.ConstInt(llvm.Int1Type(), 0, false), c.rSNeg)
}</code>
</pre>
<p>
For completeness's sake, let's look at one more code gen example. Here's
the code generation code for LDA msg, X:
</p>
<pre>
<code class="language-go">func (i *Instruction) Compile(c *Compilation) {
	switch i.OpCode {
	case 0xbd: // LDA label, X
		// Look up the global module variable based on the label name.
		dataPtr := c.labeledData[i.LabelName]
		index := c.builder.CreateLoad(c.rX, "")
		// This is what we index into the global variable with.
		indexes := []llvm.Value{
			llvm.ConstInt(llvm.Int8Type(), 0, false),
			index,
		}
		// Obtain a pointer to the element that we want to load.
		ptr := c.builder.CreateGEP(dataPtr, indexes, "")
		v := c.builder.CreateLoad(ptr, "")
		c.builder.CreateStore(v, c.rA)
		c.dynTestAndSetNeg(v)
		c.dynTestAndSetZero(v)
	// more cases for other direct-with-label-indexed instructions
	}
}

func (c *Compilation) dynTestAndSetNeg(v llvm.Value) {
	x80 := llvm.ConstInt(llvm.Int8Type(), uint64(0x80), false)
	masked := c.builder.CreateAnd(v, x80, "")
	isNeg := c.builder.CreateICmp(llvm.IntEQ, masked, x80, "")
	c.builder.CreateStore(isNeg, c.rSNeg)
}

func (c *Compilation) dynTestAndSetZero(v llvm.Value) {
	zeroConst := llvm.ConstInt(llvm.Int8Type(), uint64(0), false)
	isZero := c.builder.CreateICmp(llvm.IntEQ, v, zeroConst, "")
	c.builder.CreateStore(isZero, c.rSZero)
}</code>
</pre>
<p>
After implementing codegen for the rest of the instructions,
here is the module dump after the code generation pass:
</p>
<pre>
<code class="language-llvm">; ModuleID = 'asm_module'

@Label_c000 = private global [15 x i8] c"Hello, World!\0A\00"

declare i32 @putchar(i32)

declare void @exit(i32) noreturn nounwind

define i32 @main() {
Entry:
  %X = alloca i8
  %Y = alloca i8
  %A = alloca i8
  %S_neg = alloca i1
  %S_zero = alloca i1
  br label %Reset_Routine

Reset_Routine:                                    ; preds = %Entry
  store i8 0, i8* %X
  store i1 true, i1* %S_zero
  store i1 false, i1* %S_neg
  br label %Label_c011

Label_c011:                                       ; preds = %else, %Reset_Routine
  %0 = load i8* %X
  %1 = getelementptr [15 x i8]* @Label_c000, i8 0, i8 %0
  %2 = load i8* %1
  store i8 %2, i8* %A
  %3 = and i8 %2, -128
  %4 = icmp eq i8 %3, -128
  store i1 %4, i1* %S_neg
  %5 = icmp eq i8 %2, 0
  store i1 %5, i1* %S_zero
  %6 = load i1* %S_zero
  br i1 %6, label %Label_c01d, label %else

else:                                             ; preds = %Label_c011
  %7 = load i8* %A
  %8 = zext i8 %7 to i32
  %9 = call i32 @putchar(i32 %8)
  %10 = load i8* %X
  %11 = add i8 %10, 1
  store i8 %11, i8* %X
  %12 = and i8 %11, -128
  %13 = icmp eq i8 %12, -128
  store i1 %13, i1* %S_neg
  %14 = icmp eq i8 %11, 0
  store i1 %14, i1* %S_zero
  br label %Label_c011

Label_c01d:                                       ; preds = %Label_c011
  store i8 0, i8* %A
  store i1 true, i1* %S_zero
  store i1 false, i1* %S_neg
  %15 = load i8* %A
  %16 = zext i8 %15 to i32
  call void @exit(i32 %16)
  unreachable

IRQ_Routine:                                      ; preds = %Label_c01d
  unreachable

NMI_Routine:                                      ; No predecessors!
  unreachable
}</code>
</pre>
<p>
Great! This looks just like the code we wanted to get, minus the
comments and with a few things renamed.
Let's make sure it runs as expected:
</p>
<pre>
$ llc -filetype=obj hello.bc
$ gcc hello.bc.o
$ ./a.out
Hello, World!
$
</pre>
<p>
At this point in the project we are able to recompile a simple
"Hello, World" 6502 program with a small custom ABI into native machine code
and then execute it.
</p>
<h2 id="optimization">Optimization</h2>
<p>
Did you notice that we never used the value of <code>S_neg</code>?
We only ever stored it.
This is a waste of CPU cycles. We can do better. However, we don't want to
completely remove the ability to compute <code>S_neg</code> -
although this "Hello World" example does
not use the value, other code might.
</p>
<p>
Optimization is an enormously complicated topic and a well-deserved field of its own.
Luckily, we won't have to wrap our heads around it in order to benefit. LLVM IR code is designed to be optimized.
LLVM comes with state-of-the-art optimization techniques, in the form of <em>passes</em>
that you run on a module.
</p>
<p>
Let's run several optimization passes on the module we generate before rendering
bitcode:
</p>
<table>
  <tr>
    <th>
      Optimization Name
    </th>
    <th>
      Description
    </th>
  </tr>
  <tr>
    <td>
      <a href="https://llvm.org/docs/Passes.html#constprop-simple-constant-propagation">Constant Propagation</a>
    </td>
    <td>
      Looks for instructions involving only constants and replaces them with a
      constant value instead of an instruction.
    </td>
  </tr>
  <tr>
    <td>
      <a href="https://llvm.org/docs/Passes.html#instcombine-combine-redundant-instructions">Combine Redundant Instructions</a>
    </td>
    <td>
      Combines instructions to form fewer, simpler instructions. For example, if you add 1 twice, it will instead
      add 2 once.
    </td>
  </tr>
  <tr>
    <td>
      <a href="https://llvm.org/docs/Passes.html#mem2reg-promote-memory-to-register">Promote Memory to Register</a>
    </td>
    <td>
      Our code generation loads every "register" variable (X, Y, A, etc.) before performing
      an instruction, and then stores it back afterward.
      This pass promotes those stack-allocated variables to SSA registers, eliminating
      all of that redundancy.
    </td>
  </tr>
  <tr>
    <td>
      <a href="https://llvm.org/docs/Passes.html#gvn-global-value-numbering">Global Value Numbering</a>
    </td>
    <td>
      Eliminates partially and fully redundant instructions, and deletes redundant load instructions.
    </td>
  </tr>
  <tr>
    <td>
      <a href="https://llvm.org/docs/Passes.html#simplifycfg-simplify-the-cfg">Control Flow Graph Simplification</a>
    </td>
    <td>
      Removes unnecessary code and merges basic blocks together when possible.
    </td>
  </tr>
</table>
<p>
Here's what it looks like to add these optimization passes to our
<code class="language-go">Compile</code> code:
</p>
<pre>
<code class="language-go">	// ...
	err := llvm.VerifyModule(c.mod, llvm.ReturnStatusAction)
	if err != nil {
		c.Errors = append(c.Errors, err.Error())
		return
	}

	// This creates an engine object which has useful information
	// about our machine as a target.
	engine, err := llvm.NewJITCompiler(c.mod, 3)
	if err != nil {
		c.Errors = append(c.Errors, err.Error())
		return
	}
	defer engine.Dispose()

	pass := llvm.NewPassManager()
	defer pass.Dispose()

	pass.Add(engine.TargetData())
	pass.AddConstantPropagationPass()
	pass.AddInstructionCombiningPass()
	pass.AddPromoteMemoryToRegisterPass()
	pass.AddGVNPass()
	pass.AddCFGSimplificationPass()
	pass.Run(c.mod)

	// Uncomment this to print the LLVM IR code we just generated
	// to stderr.
	//c.mod.Dump()

	// ...</code>
</pre>
<p>
LLVM offers <a href="https://llvm.org/docs/Passes.html">many more available passes</a>,
and it's up to the user to choose which ones, and in which order, to run.
Let's see how the ones chosen in the table above perform on our "Hello World" code:
</p>
<pre>
<code class="language-llvm">; ModuleID = 'asm_module'

@Label_c000 = private global [15 x i8] c"Hello, World!\0A\00"

declare i32 @putchar(i32)

declare void @exit(i32) noreturn nounwind

define i32 @main() {
Entry:
  br label %Label_c011

Label_c011:                                       ; preds = %else, %Entry
  %storemerge = phi i8 [ 0, %Entry ], [ %6, %else ]
  %0 = sext i8 %storemerge to i64
  %1 = getelementptr [15 x i8]* @Label_c000, i64 0, i64 %0
  %2 = load i8* %1, align 1
  %3 = icmp eq i8 %2, 0
  br i1 %3, label %Label_c01d, label %else

else:                                             ; preds = %Label_c011
  %4 = zext i8 %2 to i32
  %5 = call i32 @putchar(i32 %4)
  %6 = add i8 %storemerge, 1
  br label %Label_c011

Label_c01d:                                       ; preds = %Label_c011
  call void @exit(i32 0)
  unreachable
}</code>
</pre>
<p>
Wow! The code looks completely different, and much simpler. But does it still run?
</p>
<pre>
$ llc -filetype=obj hello.bc
$ gcc hello.bc.o
$ ./a.out
Hello, World!
$
</pre>
<p>Sure does!</p>
<p>
Now, not only are we recompiling a simple 6502 program for native machine code,
but we're actually generating highly <em>optimized</em> code.
But it's not time to congratulate ourselves yet. This is a contrived case.
Let's see if these techniques can work on an actual NES game.
</p>
<h2 id="rom-layout">Layout of an NES ROM File</h2>
<p>
The generally accepted standard for distributing NES games is the .nes
format, originally started by the <a href="http://fms.komkon.org/iNES/">iNES emulator</a>.
A .nes file looks like:
</p>
<ol>
  <li>16 byte header with metadata, such as:
    <ul>
      <li>What mapper, if any, the game uses.</li>
      <li>Whether the PPU uses vertical or horizontal mirroring.
      For now, don't worry about exactly what this means; it's enough to know that
      it is a setting the PPU needs to know about at bootup.
      </li>
    </ul>
  </li>
  <li>
    The 32 KB of assembled code which gets loaded into $8000 - $ffff on bootup.
    If there is a mapper, there might be more of this program data, but we'll talk
    about that later.
  </li>
  <li>8KB of graphics data, known as CHR-ROM.
  This gets loaded into the PPU on bootup.
  Again if there is a mapper,
  there might be more of this graphics data.
  </li>
</ol>
<p>
  If you're looking for a reference for the gory details of this format, see the
  <a href="http://wiki.nesdev.com/w/index.php/INES">Nesdev Wiki</a> or the
  <a href="http://fms.komkon.org/EMUL8/NES.html">official specification</a>.
</p>
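<p>
As a rough sketch of the header layout just described, here is how the 16-byte
iNES header could be decoded in Go. The struct and function names are my own
invention for illustration, not jamulator's actual parser:
</p>

```go
package main

import (
	"errors"
	"fmt"
)

// INesHeader holds the fields this article cares about.
type INesHeader struct {
	PrgBanks int // number of 16 KB PRG-ROM banks
	ChrBanks int // number of 8 KB CHR-ROM banks
	Mapper   int
	Vertical bool // true = vertical mirroring, false = horizontal
	Battery  bool
}

// parseINesHeader decodes the 16-byte header at the start of a .nes file.
func parseINesHeader(h []byte) (*INesHeader, error) {
	// The magic bytes are "NES" followed by 0x1a.
	if len(h) < 16 || h[0] != 'N' || h[1] != 'E' || h[2] != 'S' || h[3] != 0x1a {
		return nil, errors.New("not an iNES file")
	}
	return &INesHeader{
		PrgBanks: int(h[4]),
		ChrBanks: int(h[5]),
		// mapper number: low nibble in flags 6, high nibble in flags 7
		Mapper:   int(h[6]>>4) | int(h[7]&0xf0),
		Vertical: h[6]&0x01 != 0,
		Battery:  h[6]&0x02 != 0,
	}, nil
}

func main() {
	// A header for a mapper-0, 2x16 KB PRG, 1x8 KB CHR, horizontally
	// mirrored ROM - the shape of the demo ROM we look at below.
	hdr := []byte{'N', 'E', 'S', 0x1a, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
	info, err := parseINesHeader(hdr)
	if err != nil {
		panic(err)
	}
	fmt.Printf("mapper=%d prg=%d chr=%d vertical=%v\n",
		info.Mapper, info.PrgBanks, info.ChrBanks, info.Vertical)
}
```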
<p>
The first "real" ROM I tried to recompile is this demo ROM made by
<a href="http://www.chrismcovell.com/">Chris Covell</a> in 1998 called
<a href="https://www.castledragmire.com/hynes/reference/resource/index.html">Zelda Title Screen Simulator</a>.
If you run this ROM in an emulator, you get a title screen that looks nearly identical to Zelda 1:
</p>
<img alt="Zelda title screen screenshot" src="https://s3.amazonaws.com/superjoe/blog-files/jamulator/zelda-ref.png"/>
<p>
Now that we're recompiling a real ROM, we'll have to do several more
things to make it work:
</p>
<ul>
  <li>Include the graphics data in the binary.</li>
  <li>Include the mirroring setting (vertical or horizontal) in the binary.</li>
  <li>Support code gen for more instructions.</li>
  <li>
    Include a runtime to create a window and render
    the video to the screen.
  </li>
</ul>
<p>
Including the graphics data and mirroring setting in the binary is trivial.
We get all the information we need when we read the ROM file; all
we need to do is convert it to global variables in our LLVM module:
</p>
<pre>
<code class="language-go">// In this snippet, `program` is an object which contains
// the information we loaded from the ROM file.
// chrRom is [][]byte - n banks of 8KB. At this point,
// we only support 1 bank.
if len(program.chrRom) != 1 {
	panic("Only 1 bank of CHR ROM is supported.")
}
dataLen := 0x2000 * len(program.chrRom)
chrDataValues := make([]llvm.Value, 0, dataLen)
int8type := llvm.Int8Type()
for _, bank := range program.chrRom {
	for _, b := range bank {
		chrDataValues = append(chrDataValues, llvm.ConstInt(int8type, uint64(b), false))
	}
}
chrDataConst := llvm.ConstArray(llvm.ArrayType(llvm.Int8Type(), dataLen), chrDataValues)
// `c.mod` is our `llvm.Module` object. `c` is an instance of our
// `Compilation` struct from a previous code listing.
chrDataGlobal := llvm.AddGlobal(c.mod, chrDataConst.Type(), "rom_chr_data")
// Setting the linkage to private here. We'll figure out what to do
// with this information soon.
chrDataGlobal.SetLinkage(llvm.PrivateLinkage)
chrDataGlobal.SetInitializer(chrDataConst)
chrDataGlobal.SetGlobalConstant(true)

// Now we add the global variable specifying the mirroring setting.
mirroringValue := llvm.ConstInt(llvm.Int8Type(), uint64(program.Mirroring), false)
mirroringGlobal := llvm.AddGlobal(c.mod, mirroringValue.Type(), "rom_mirroring")
mirroringGlobal.SetLinkage(llvm.PrivateLinkage)
mirroringGlobal.SetInitializer(mirroringValue)
mirroringGlobal.SetGlobalConstant(true)</code>
</pre>
<p>
The next step toward getting our selected ROM to work is to support the additional
instructions that this program uses. Let's unpack that ROM
and see what we're up against:
</p>
<pre>
$ ./jamulate -unrom roms/Zelda.NES
loading roms/Zelda.NES
disassembling to roms/Zelda
$ ls -l roms/Zelda
-rw-rw-r-- 1 andy andy 8192 May 24 23:53 chr0.chr
-rw-rw-r-- 1 andy andy 7977 May 24 23:53 prg.asm
-rw-rw-r-- 1 andy andy  485 May 24 23:53 Zelda.jam
</pre>
<p>
The code to unpack a .nes file into CHR data, PRG data, and metadata
is left as an exercise to the reader.
</p>
<p>
<code>chr0.chr</code> is binary data; it is the 8KB of graphics data mentioned
before which is loaded into the PPU on bootup.
</p>
<p>
<code>Zelda.jam</code> is where I decided to put the program metadata
in a simple key-value text format:
</p>
<pre>
<code class="language-ini"># output file name when this rom is assembled
filename=Zelda.NES
# see http://wiki.nesdev.com/w/index.php/Mapper
mapper=0
# 'Horizontal', 'Vertical', or 'FourScreenVRAM'
# see http://wiki.nesdev.com/w/index.php/Mirroring
mirroring=Horizontal
# whether SRAM in CPU $6000-$7FFF is present
sram=true
# whether the SRAM in CPU $6000-$7FFF, if present, is battery backed
battery=false
# 'NTSC', 'PAL', or 'DualCompatible'
tvsystem=NTSC
# assembly code
prg=prg.asm
# video data
chr=chr0.chr</code>
</pre>
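<p>
Since the format is just comment lines and key=value pairs, a parser for it is
only a few lines of Go. This is an illustrative sketch, not jamulator's actual
parsing code:
</p>

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseJam reads the simple key=value metadata format shown above,
// skipping blank lines and '#' comments.
func parseJam(text string) map[string]string {
	meta := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// Split on the first '=' only, so values may contain '='.
		if k, v, ok := strings.Cut(line, "="); ok {
			meta[strings.TrimSpace(k)] = strings.TrimSpace(v)
		}
	}
	return meta
}

func main() {
	meta := parseJam("# a comment\nmapper=0\nmirroring=Horizontal\nprg=prg.asm\n")
	fmt.Println(meta["mapper"], meta["mirroring"], meta["prg"])
}
```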
<p>
<a href="https://github.com/andrewrk/jamulator/issues/1">In retrospect</a>,
I think it would have been simpler to have the assembler
support special declarations for this metadata rather than having a separate
file. But hey, it works.
</p>
<p>
You may notice there is some metadata there which is never addressed in
this article. For the purposes of this project, we don't need to think about
SRAM, battery, or tvsystem. None of the examples we look at use them.
</p>
<p>
Without further ado, let's see how the disassembler fared:
</p>
<pre>
<code class="language-6502">.org $c000
    .db "Zelda Simulator, ", $a9, "1998 Chris Covell (ccovell@direct.ca)"
Reset_Routine:
    CLD
    SEI
Label_c039:
    LDA $2002
    BPL Label_c039
    LDX #$00
    STX $2000
    STX $2001
    DEX
    TXS
    LDY #$06
    STY $01
    LDY #$00
    STY $00
    LDA #$00
Label_c052:
    STA ($00), Y
    DEY
    BNE Label_c052
    DEC $01
    BPL Label_c052
    LDX #$3f
    STX $2006
    LDX #$00
    STX $2006
    LDX #$27
    LDY #$20
Label_c069:
    STX $2007
    DEY
    BNE Label_c069
    LDX #$3f
    STX $2006
    LDX #$00
    STX $2006
    LDX #$00
    LDY #$20
Label_c07d:
    LDA Label_c101, X
    STA $2007
    INX
    DEY
    BNE Label_c07d
    LDX #$23
    STX $2006
    LDX #$c0
    STX $2006
    LDX #$00
    LDY #$40
Label_c095:
    LDA Label_c121, X
    STA $2007
    INX
    DEY
    BNE Label_c095
    LDX #$20
    STX $2006
    LDX #$00
    STX $2006
    LDX #$00
    LDY #$00
Label_c0ad:
    LDA Label_c261, X
    STA $2007
    INX
    DEY
    BNE Label_c0ad
    LDX #$00
    LDY #$00
Label_c0bb:
    LDA Label_c361, X
    STA $2007
    INX
    DEY
    BNE Label_c0bb
    LDX #$00
    LDY #$00
Label_c0c9:
    LDA Label_c461, X
    STA $2007
    INX
    DEY
    BNE Label_c0c9
    LDX #$00
    LDY #$c0
Label_c0d7:
    LDA Label_c561, X
    STA $2007
    INX
    DEY
    BNE Label_c0d7
    LDX #$00
    STX $2003
    LDX #$00
    LDY #$00
Label_c0ea:
    LDA Label_c161, X
    STA $2004
    INX
    DEY
    BNE Label_c0ea
    LDA #$90
    STA $2000
    LDA #$1e
    STA $2001
Label_c0fe:
    JMP Label_c0fe
Label_c101:
    .db $36, $0d, $00, $10, $36, $17, $27, $0d
    .db $36, $08, $1a, $28, $36, $30, $31, $22
    .db $36, $30, $31, $11, $36, $15, $15, $15
    .db $36, $08, $1a, $28, $36, $30, $31, $22
Label_c121:
    .db $05, $05, $05, $05, $05, $05, $05, $05
    .db $08, "jZZZZ", $9a, $22, $00, "fUUUU", $99, $00
    .db $00, $6e, $5f, $55, $5d, $df, $bb, $00
    .db $00, $0a, $0a, $0a, $0a, $0a, $0a, $00
    .db $00, $00, $c0, $30, $00, $00, $00, $00
    .db $00, $00, $cc, $33, $00, $00, $00, $00
    .db $00, $20, $fc, $f3, $00, $00, $f0, $f0
Label_c161:
    .db $27, $ca, $02, $28, $2f, $cb, $02, $28
    .db $27, $cc, $02, $30, $2f, $cd, $02, $30
    .db $27, $d6, $02, $50, $2c, $d6, $02, $a0
    .db $27, $cc, $42, $c8, $2f, $cd, $42, $c8
    .db $27, $ca, $42, $d0, $2f, $cb, $42, $d0
    .db $31, $d2, $02, $57, $31, $d4, $02, $5f
    .db $3f, $d4, $02, $24, $41, $d4, $02, $63
    .db $4f, $d6, $02, $2c, $4f, $d2, $02, $cb
    .db $57, $ce, $02, $73, $5f, $cf, $02, $73
    .db $57, $d0, $02, $7b, $5f, $d1, $02, $7b
    .db $67, $d2, $02, $7a, $7b, $d4, $02, $90
    .db $7b, $d6, $02, $bc, $82, $d2, $02, $50
    .db $77, $cb, $82, $28, $7f, $ca, $82, $28
    .db $77, $cd, $82, $30, $7f, $cc, $82, $30
    .db $77, $cd, $c2, $c8, $7f, $cc, $c2, $c8
    .db $77, $cb, $c2, $d0, $7f, $ca, $c2, $d0
    .db $af, $a3, $00, $50, $af, $a5, $00, $58
    .db $af, $a7, $00, $60, $af, $a9, $00, $68
    .db $b7, $b2, $00, $50, $b7, $b4, $00, $58
    .db $b7, $b6, $00, $60, $b7, $b8, $00, $68
    .db $c9, $c2, $00, $50, $d1, $c3, $00, $50
    .db $c9, $c4, $00, $58, $d1, $c5, $00, $58
    .db $c9, $c6, $00, $60, $d1, $c7, $00, $60
    .db $c9, $c8, $00, $68, $d1, $c9, $00, $68
    .db $d9, $c2, $00, $50, $e1, $c3, $00, $50
    .db $d9, $c4, $00, $58, $e1, $c5, $00, $58
    .db $d9, $c6, $00, $60, $e1, $c7, $00, $60
    .db $d9, $c8, $00, $68, $e1, $c9, $00, $68
    .db $67, $a0, $03, $58, $67, $a0, $03, $60
    .db $67, $a0, $03, $68, $67, $a0, $03, $70
    .db $67, $a0, $03, $78, $67, $a0, $03, $80
    .db $67, $a0, $03, $88, $00, $b0, $00, $00
Label_c261:
    .db "$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$", $e0, $d5, "$$$$$$$$$$$$$$$$$$$$$$$$$$$$", $d4, $e0, $dc, $d7
    .db "$$$$$$$$$$$$$$$$$$$$$$$$$$$$", $d6, $dd, $dc, $ee, "$$$$$$$$$$$$$$$$$$$$$$$$$$$$", $d6, $db
    .db $de, $d7, $24, $24, $24, $e6, $e4, $e5
    .db $e4, $e5, $e4, $e5, $e4, $e5, $e4, $e5
    .db $e4, $e5, $e4, $e5, $e4, $e5, $e4, $e5
    .db $e4, $e5, $e6, $24, $24, $24, $d6, $db
    .db $dc, $d7, $24, $24, $24, $e2, "$$$$$$$$$$$$$$$$$$$$", $e3
    .db $24, $24, $24, $d6, $df, $de, $ee, $24
    .db $24, $24, $e3, "$$qrstuvwxyyyz{$$$$$", $e2, $24, $24, $24
    .db $d6, $db
Label_c361:
    .db $de, $d8, $ef, $24, $24, $e2, $24, $7c
    .db $7d, $7e, $7f, $80, $81, $82, $83, $84
    .db $85, $86, $87, $88, $89, $8a, $8b, $24
    .db $24, $24, $e3, $24, $24, $24, $d6, $df
    .db $dc, $da, $d7, $24, $24, $e3, $24, $8c
    .db $8d, $8e, $8f, $90, $91, $92, $93, $94
    .db $95, $96, $97, $98, $99, $9a, $9b, $9c
    .db $24, $24, $e2, $24, $24, $d4, $d9, $db
    .db $dc, $d9, $ee, $24, $24, $e2, $24, $9d
    .db $9e, $9f, $a0, $a1, $a2, $a3, $a4, $a5
    .db $a6, $a7, $a8, $a9, $aa, $ab, $ac, $ad
    .db $ae, $24, $e3, $24, $24, $d6, $db, $df
    .db $de, $db, $d7, $24, $24, $e3, $70, $af
    .db $b0, $b1, $b2, $b3, $b4, $b5, $b6, $b7
    .db $b8, $b9, $ba, $bb, $bc, $bd, $be, $bf
    .db $c0, $24, $e2, $24, $24, $d6, $db, $db
    .db $dc, $dd, $d7, $24, $24, $e2, "$$$$$$$", $c1
    .db $c2, $c3, $c4, $c5, "$$$$$$$$", $e3, $24, $24
    .db $d6, $db, $df, $de, $db, $ee, $24, $24
    .db $e3, $24, $c6, $c7, $c8, $c8, $c8, $24
    .db $c9, $ca, $cb, $cc, $cd, $c8, $c8, $c8
    .db $c8, $e8, $e9, $d3, $24, $e2, $24, $24
    .db $d6, $db, $db, $dc, $db, $d7, $24, $24
    .db $e2, "$$$$$$$$$", $ce, $cf, "$$$$$", $ea, $eb, $ec
    .db $24, $e3, $24, $24, $d6, $db, $df, $dc
    .db $db, $d7, $24, $24, $e3, "$$$$$$$$$", $d1, $d2
    .db "$$$$$$$$$", $e2, $24, $24, $d6, $db, $db
Label_c461:
    .db $dc, $d8, $e1, $d5, $24, $e6, $e4, $e5
    .db $e4, $e5, $e4, $e5, $e4, $e5, $e4, $e5
    .db $e4, $e5, $e4, $e5, $e4, $e5, $e4, $e5
    .db $e4, $e5, $e6, $24, $d4, $e1, $d9, $dd
    .db $dc, $da, $dc, $d7, "$$$$$$$$$", $f0, $01, $09
    .db $08, $06, $24, $17, $12, $17, $1d, $0e
    .db $17, $0d, $18, $24, $d6, $dc, $db, $df
    .db $dc, $da, $dc, $ee, "$$$$$$$$$$$$$$$$$$$$$$$$", $d6, $de, $ed
    .db $dd, $dc, $da, $de, $d7, "$$$$$$$$$$$$$$$$$$$$$$$$", $d6, $de
    .db $db, $dd, $e1, $d9, $dc, $ed, $e0, $ef
    .db $24, $24, $19, $1e, $1c, $11, $24, $1c
    .db $1d, $0a, $1b, $1d, $24, $0b, $1e, $1d
    .db $1d, $18, $17, $24, $24, $24, $d6, $d8
    .db $e1, $d9, $dd, $ed, $de, $d8, $e1, $e1
    .db $d5, "$$$$$$$$$$$$$$$$$$$$$", $d6, $da, $dd, $ed, $dd, $db
    .db $de, $da, $dc, $dd, $d8, $e0, $e0, $ef
    .db $24, $24, $24, $24, $d4, $e0, $e0, $d5
    .db "$$$$$$$$", $d4, $ef, $da, $da, $df, $db, $df
    .db $db, $dc, $da, $de, $df, $da, $dc, $dd
    .db $db, $26, $26, $26, $26, $da, $dc, $dd
    .db $ed, $e0, $e0, $ef, $24, $d4, $e0, $e0
    .db $e0, $d9, $db, $da, $da, $df, $db
Label_c561:
    .db $ed, $d8, $e1, $d9, $de, $d8, $e1, $d9
    .db $df, $db, $26, $26, $26, $26, $da, $dc
    .db $d8, $e1, $d9, $dc, $d8, $e0, $d9, $dc
    .db $dd, $dd, $d8, $e1, $e1, $d9, $dd, $ed
    .db $ed, $da, $dd, $ed, $de, $da, $dc, $db
    .db $dd, $db, $26, $26, $26, $26, $da, $de
    .db $da, $dc, $db, $de, $da, $dd, $ed, $dc
    .db $dc, $dd, $da, $dc, $dc, $db, $dd, $ed
    .db $ed, $da, $dd, $d8, $e1, $da, $dc, $db
    .db $df, $db, $26, $26, $26, $26, $da, $d8
    .db $d9, $dc, $db, $dc, $da, $d8, $e1, $d9
    .db $de, $df, $da, $d8, $e1, $e1, $d9, $e1
    .db $ed, $d9, $df, $da, $dc, $da, $de, $db
    .db $dd, $db, $26, $26, $26, $26, $da, $da
    .db $ed, $de, $d8, $e1, $e1, $d9, $dc, $db
    .db $de, $d8, $e1, $d9, $dd, $dd, $db, $dc
    .db $df, $db, $df, $da, $dc, $db, $de, $db
    .db $dd, $db, $26, $26, $26, $26, $da, $da
    .db $db, $dd, $da, $de, $d8, $e1, $d9, $db
    .db $de, $da, $dd, $db, $de, $df, $d8, $e1
    .db $df, $db, $df, $da, $dc, $db, $de, $db
    .db $dd, $db, $26, $26, $26, $26, $da, $da
    .db $db, $dd, $da, $de, $da, $dd, $db, $db
    .db $de, $da, $dd, $db, $de, $df, $dc, $e1
NMI_Routine:
    LDX #$00
    STX $2005
    STX $2005
    RTI
IRQ_Routine:
    RTI
.org $fffa, $00
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
<p>
Looking at this disassembly, several things come to mind:
</p>
<ul>
  <li>
    Our ASCII detection did a good job with that string at the beginning. Nice.
  </li>
  <li>
  This code introduces only a few new instructions. It won't be too
  cumbersome to add code gen to support this.
  </li>
  <li>
  It looks like the disassembler was able to disassemble the entire
  program. You can see at the end there is an infinite loop and then
  data follows.
  </li>
  <li>
  ASCII detection found some false positives. Oh well. It makes no difference
  to the execution of the program.
  </li>
  <li>
  The IRQ interrupt is not used, but it looks like the NMI interrupt might be used.
  </li>
</ul>
<p>
Let's see if we can once again delay dealing with interrupts at all.
Modify the <code>NMI_Routine</code> block so that it looks like this:
</p>
<pre>
<code class="language-6502">NMI_Routine:
    RTI</code>
</pre>
<p>
Good thing we bothered to make an assembler. Now we can put the ROM
back together, run it in an emulator, and see if it still works:
</p>
<pre>
$ ./jamulate -rom roms/Zelda/Zelda.jam
building rom from roms/Zelda/Zelda.jam
saving rom Zelda.NES
</pre>
<p>
Again, writing the code to pack the ROM back together into a
.nes file is left as an exercise to the reader.
</p>
<p>
Now we run the modified ROM in an emulator and see if it still works:
</p>
<img alt="Screenshot of the Zelda title screen but the sword is bent" src="https://superjoe.s3.amazonaws.com/blog-files/jamulator/zelda-bent.png"/>
<p>
Ha! It works but the sword is bent. That's actually pretty convenient - now
when we want to make interrupts work, we know how to test them.
But like true software developers, we're going to ignore any and
all problems as long as possible.
</p>
<p>
The only thing left to do then, is to add code gen support for the new
instructions. This leads us to our first challenge.
</p>
<h2 id="runtime-address-checks">Challenge: Runtime Address Checks</h2>
<p>
Looking at the disassembled code for Zelda Title Screen Simulator,
there is one instruction that stands out as a bit more tricky to
recompile than the others:
</p>
<pre>
<code class="language-6502">    STA ($00), Y</code>
</pre>
<p>
This is an <em>indirect</em> instruction. It tells the CPU to:
</p>
<ol>
  <li>Interpret the values in $0000 and $0001 as a memory address.</li>
  <li>Add the value of register Y to the address to compute a new address.</li>
  <li>Store the value of register A to the new address.</li>
</ol>
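<p>
The three steps can be sketched in Go against a flat 64 KB array, ignoring
memory mapped I/O for the moment. <code>staIndirectY</code> here is a
hypothetical helper for illustration, not part of the compiler:
</p>

```go
package main

import "fmt"

// staIndirectY mimics STA ($zp),Y on a flat 64 KB memory array.
// Note that zp is a byte, so reading zp+1 wraps within the zero page,
// just like the real 6502.
func staIndirectY(mem []byte, zp byte, y byte, a byte) uint16 {
	// 1. Interpret the two zero-page bytes as a little-endian address.
	base := uint16(mem[zp]) | uint16(mem[zp+1])<<8
	// 2. Add the value of register Y to compute the effective address.
	addr := base + uint16(y)
	// 3. Store the value of register A there.
	mem[addr] = a
	return addr
}

func main() {
	mem := make([]byte, 0x10000)
	mem[0x00] = 0x00 // low byte of the base address
	mem[0x01] = 0x06 // high byte: base = $0600
	addr := staIndirectY(mem, 0x00, 0x05, 0x42)
	fmt.Printf("stored $%02x at $%04x\n", mem[addr], addr)
}
```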
<p>
Quite a useful instruction, but it poses a challenge for recompilation.
Because values in memory addresses $0000 and $0001 could be <em>anything</em>,
only at runtime will we learn which memory address is about to be updated.
</p>
<p>
Recall that the NES uses memory mapped I/O. This means that this instruction
could be saving a value in memory, but it could also be talking to the PPU,
the APU, the game pad controller, or even a mapper.
Since this one instruction could be doing any of these things
depending on runtime state, we will have to add a
<em>runtime address check</em>.
</p>
<p>
Here's what the code generation looks like for STA-indirect-Y:
</p>
<pre>
<code class="language-go">func (i *Instruction) Compile(c *Compilation) {
	switch i.OpCode {
	// ... more cases for other indirect instructions

	case 0x91: // sta indirect y
		// load the base address at the address indicated by the instruction
		baseAddr := c.loadWord(i.Value)
		// load the 8-bit value of register Y
		rY := c.builder.CreateLoad(c.rY, "")
		// zero extend the 8-bit value of Y to 16 bits
		// so that we can add it to baseAddr
		rYw := c.builder.CreateZExt(rY, llvm.Int16Type(), "")
		// compute the pointer to the address to store a byte
		addr := c.builder.CreateAdd(baseAddr, rYw, "")
		// load the value of register A
		rA := c.builder.CreateLoad(c.rA, "")
		// call dynStore with our computed address and value of A.
		// dynStore will figure out what to do with the value and address.
		c.dynStore(addr, rA)
	}
}

func (c *Compilation) loadWord(addr int) llvm.Value {
	// Load a little endian word.

	// Load the low byte.
	ptrByte1 := c.load(addr)
	// Load the high byte.
	ptrByte2 := c.load(addr + 1)

	// Zero extend the 8-bit values to 16-bit.
	ptrByte1w := c.builder.CreateZExt(ptrByte1, llvm.Int16Type(), "")
	ptrByte2w := c.builder.CreateZExt(ptrByte2, llvm.Int16Type(), "")

	// Shift the high byte left by 8.
	shiftAmt := llvm.ConstInt(llvm.Int16Type(), 8, false)
	word := c.builder.CreateShl(ptrByte2w, shiftAmt, "")

	// Bitwise OR the high and low bytes together and return that.
	return c.builder.CreateOr(word, ptrByte1w, "")
}

func (c *Compilation) load(addr int) llvm.Value {
	// This function is used to load a byte from an address that we know
	// at compile-time.
	switch {
	default:
		c.Errors = append(c.Errors, fmt.Sprintf("reading from $%04x not implemented", addr))
		return llvm.ConstNull(llvm.Int8Type())
	case 0x0000 &lt;= addr &amp;&amp; addr &lt; 0x2000:
		// 2KB working RAM. mask because mirrored
		maskedAddr := addr &amp; (0x800 - 1)
		indexes := []llvm.Value{
			llvm.ConstInt(llvm.Int16Type(), 0, false),
			llvm.ConstInt(llvm.Int16Type(), uint64(maskedAddr), false),
		}
		ptr := c.builder.CreateGEP(c.wram, indexes, "")
		v := c.builder.CreateLoad(ptr, "")
		return v
	case 0x2000 &lt;= addr &amp;&amp; addr &lt; 0x4000:
		// PPU registers. mask because mirrored
		switch addr &amp; (0x8 - 1) {
		case 2:
			return c.builder.CreateCall(c.ppuReadStatusFn, []llvm.Value{}, "")
		case 4:
			return c.builder.CreateCall(c.ppuReadOamDataFn, []llvm.Value{}, "")
		case 7:
			return c.builder.CreateCall(c.ppuReadDataFn, []llvm.Value{}, "")
		default:
			c.Errors = append(c.Errors, fmt.Sprintf("reading from $%04x not implemented", addr))
			return llvm.ConstNull(llvm.Int8Type())
		}

	// ... There are more cases to handle; here we only include WRAM and the PPU.

	}
	panic("unreachable")
}

func (c *Compilation) dynStore(addr llvm.Value, val llvm.Value) {
	// runtime memory check
	storeDoneBlock := c.createBlock("StoreDone")
	x2000 := llvm.ConstInt(llvm.Int16Type(), 0x2000, false)
	inWRam := c.builder.CreateICmp(llvm.IntULT, addr, x2000, "")
	notInWRamBlock := c.createIf(inWRam)
	// this generated code runs if the write is happening in the WRAM range
	maskedAddr := c.builder.CreateAnd(addr, llvm.ConstInt(llvm.Int16Type(), 0x800-1, false), "")
	indexes := []llvm.Value{
		llvm.ConstInt(llvm.Int16Type(), 0, false),
		maskedAddr,
	}
	ptr := c.builder.CreateGEP(c.wram, indexes, "")
	c.builder.CreateStore(val, ptr)
	c.builder.CreateBr(storeDoneBlock)
	// this generated code runs if the write is &gt; WRAM range
	c.selectBlock(notInWRamBlock)
	x4000 := llvm.ConstInt(llvm.Int16Type(), 0x4000, false)
	inPpuRam := c.builder.CreateICmp(llvm.IntULT, addr, x4000, "")
	notInPpuRamBlock := c.createIf(inPpuRam)
	// this generated code runs if the write is in the PPU RAM range
	maskedAddr = c.builder.CreateAnd(addr, llvm.ConstInt(llvm.Int16Type(), 0x8-1, false), "")
	badPpuAddrBlock := c.createBlock("BadPPUAddr")
	sw := c.builder.CreateSwitch(maskedAddr, badPpuAddrBlock, 7)
	// this generated code runs if the write is in a bad PPU RAM addr
	c.selectBlock(badPpuAddrBlock)
	c.createPanic()

	ppuCtrlBlock := c.createBlock("ppuctrl")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 0, false), ppuCtrlBlock)
	c.selectBlock(ppuCtrlBlock)
	c.builder.CreateCall(c.ppuCtrlFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	ppuMaskBlock := c.createBlock("ppumask")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 1, false), ppuMaskBlock)
	c.selectBlock(ppuMaskBlock)
	c.builder.CreateCall(c.ppuMaskFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	oamAddrBlock := c.createBlock("oamaddr")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 3, false), oamAddrBlock)
	c.selectBlock(oamAddrBlock)
	c.builder.CreateCall(c.oamAddrFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	oamDataBlock := c.createBlock("oamdata")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 4, false), oamDataBlock)
	c.selectBlock(oamDataBlock)
	c.builder.CreateCall(c.setOamDataFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	ppuScrollBlock := c.createBlock("ppuscroll")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 5, false), ppuScrollBlock)
	c.selectBlock(ppuScrollBlock)
	c.builder.CreateCall(c.setPpuScrollFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	ppuAddrBlock := c.createBlock("ppuaddr")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 6, false), ppuAddrBlock)
	c.selectBlock(ppuAddrBlock)
	c.builder.CreateCall(c.ppuAddrFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	ppuDataBlock := c.createBlock("ppudata")
	sw.AddCase(llvm.ConstInt(llvm.Int16Type(), 7, false), ppuDataBlock)
	c.selectBlock(ppuDataBlock)
	c.builder.CreateCall(c.setPpuDataFn, []llvm.Value{val}, "")
	c.builder.CreateBr(storeDoneBlock)

	// this generated code runs if the write is &gt;= 0x4000
	// There are more cases to handle; here we only show
	// handling WRAM and the PPU.
	c.selectBlock(notInPpuRamBlock)
	c.createPanic()

	// done. X_X
	c.selectBlock(storeDoneBlock)
}

func (c *Compilation) createIf(cond llvm.Value) llvm.BasicBlock {
	// Create a conditional branch along with the 2 target blocks.
	// Returns the else block and sets the current block to the then block.

	elseBlock := c.createBlock("else")
	thenBlock := c.createBlock("then")
	c.builder.CreateCondBr(cond, thenBlock, elseBlock)
	c.selectBlock(thenBlock)
	return elseBlock
}

func (c *Compilation) createBlock(name string) llvm.BasicBlock {
	bb := llvm.InsertBasicBlock(*c.currentBlock, name)
	bb.MoveAfter(*c.currentBlock)
	return bb
}

func (c *Compilation) selectBlock(bb llvm.BasicBlock) {
	c.builder.SetInsertPointAtEnd(bb)
	c.currentBlock = &amp;bb
}

func (c *Compilation) createPanic() {
	bytePointerType := llvm.PointerType(llvm.Int8Type(), 0)
	ptr := c.builder.CreatePointerCast(c.runtimePanicMsg, bytePointerType, "")
	c.builder.CreateCall(c.putsFn, []llvm.Value{ptr}, "")
	exitCode := llvm.ConstInt(llvm.Int32Type(), 1, false)
	c.builder.CreateCall(c.exitFn, []llvm.Value{exitCode}, "")
	c.builder.CreateUnreachable()
}</code>
</pre>
<p>
Let's take a look at what the IR code might look like for this
instruction. First, a simplified assembly source file:
</p>
<pre>
<code class="language-6502">.org $c000

Reset_Routine:
    STA ($00), Y
    STA $2009 ; exit

NMI_Routine:
    RTI

IRQ_Routine:
    RTI

.org $fffa
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
<p>
And then the LLVM module that would be generated, before optimizations:
</p>
<pre>
<code class="language-llvm">; ModuleID = 'asm_module'

@wram = private global [2048 x i8] zeroinitializer
@panicMsg = private global [51 x i8] c"panic: attempted to write to invalid memory address"

declare i32 @putchar(i32)

declare i32 @puts(i8*)

declare void @exit(i32) noreturn nounwind

declare i8 @rom_ppu_read_status()

declare void @rom_ppu_write_control(i8)

declare void @rom_ppu_write_mask(i8)

declare void @rom_ppu_write_address(i8)

declare void @rom_ppu_write_data(i8)

declare void @rom_ppu_write_oamaddress(i8)

declare void @rom_ppu_write_oamdata(i8)

declare void @rom_ppu_write_scroll(i8)

define i32 @main() {
Entry:
  %X = alloca i8
  %Y = alloca i8
  %A = alloca i8
  %SP = alloca i8
  %S_neg = alloca i1
  %S_zero = alloca i1
  %S_dec = alloca i1
  %S_int = alloca i1
  br label %Reset_Routine

Reset_Routine:                                    ; preds = %Entry
  %0 = load i8* getelementptr inbounds ([2048 x i8]* @wram, i8 0, i8 0)
  %1 = load i8* getelementptr inbounds ([2048 x i8]* @wram, i8 0, i8 1)
  %2 = zext i8 %0 to i16
  %3 = zext i8 %1 to i16
  %4 = shl i16 %3, 8
  %5 = or i16 %4, %2
  %6 = load i8* %Y
  %7 = zext i8 %6 to i16
  %8 = add i16 %5, %7
  %9 = load i8* %A
  %10 = icmp ult i16 %8, 8192
  br i1 %10, label %then, label %else

then:                                             ; preds = %Reset_Routine
  %11 = and i16 %8, 2047
  %12 = getelementptr [2048 x i8]* @wram, i16 0, i16 %11
  store i8 %9, i8* %12
  br label %STA_done

else:                                             ; preds = %Reset_Routine
  %13 = icmp ult i16 %8, 16384
  br i1 %13, label %then2, label %else1

then2:                                            ; preds = %else
  %14 = and i16 %8, 7
  switch i16 %14, label %BadPPUAddr [
    i16 0, label %ppuctrl
    i16 1, label %ppumask
    i16 3, label %oamaddr
    i16 4, label %oamdata
    i16 5, label %ppuscroll
    i16 6, label %ppuaddr
    i16 7, label %ppudata
  ]

BadPPUAddr:                                       ; preds = %then2
  %15 = call i32 @puts(i8* getelementptr inbounds ([51 x i8]* @panicMsg, i32 0, i32 0))
  call void @exit(i32 1)
  unreachable

ppuctrl:                                          ; preds = %then2
  call void @rom_ppu_write_control(i8 %9)
  br label %STA_done

ppumask:                                          ; preds = %then2
  call void @rom_ppu_write_mask(i8 %9)
  br label %STA_done

oamaddr:                                          ; preds = %then2
  call void @rom_ppu_write_oamaddress(i8 %9)
  br label %STA_done

oamdata:                                          ; preds = %then2
  call void @rom_ppu_write_oamdata(i8 %9)
  br label %STA_done

ppuscroll:                                        ; preds = %then2
  call void @rom_ppu_write_scroll(i8 %9)
  br label %STA_done

ppuaddr:                                          ; preds = %then2
  call void @rom_ppu_write_address(i8 %9)
  br label %STA_done

ppudata:                                          ; preds = %then2
  call void @rom_ppu_write_data(i8 %9)
  br label %STA_done

else1:                                            ; preds = %else
  %16 = call i32 @puts(i8* getelementptr inbounds ([51 x i8]* @panicMsg, i32 0, i32 0))
  call void @exit(i32 1)
  unreachable

STA_done:                                         ; preds = %ppudata, %ppuaddr, %ppuscroll, %oamdata, %oamaddr, %ppumask, %ppuctrl, %then
  %17 = load i8* %A
  %18 = zext i8 %17 to i32
  call void @exit(i32 %18)
  br label %NMI_Routine

NMI_Routine:                                      ; preds = %STA_done
  unreachable

IRQ_Routine:                                      ; No predecessors!
  unreachable
}</code>
</pre>
<p>
With this framework in place, we can recompile instructions that store to
memory addresses only known at runtime.
Code generation for the other instructions in this assembly program
is straightforward at this point.
</p>
<p>
So now we can generate a new, native program to run. But how do we
get the video to actually display on the screen?
</p>
<p>
This question brings us to the next challenge.
</p>
<h2 id="parallel-systems">Challenge: Parallel Systems Running Simultaneously</h2>
<p>
To find out how to get something rendering on the screen, I looked at
<a href="https://github.com/scottferg/Fergulator">Fergulator</a> - an
already-working NES emulator written in Go.
</p>
<p>
Fergulator correctly emulates Super Mario Brothers 1 as well as
Zelda Title Screen Simulator, so it will certainly work for our purpose -
understanding how the video gets onto the screen.
</p>
<p>
The answer is easy to find in this well-factored codebase.
Looking at the main loop in <code>machine.go</code>, the core
logic is revealed:
</p>
<pre>
<code class="language-go">cycles = cpu.Step()

for i := 0; i &lt; 3*cycles; i++ {
	ppu.Step()
}

for i := 0; i &lt; cycles; i++ {
	apu.Step()
}</code>
</pre>
<p>
The CPU runs one instruction and returns how many cycles it took.
The PPU is then stepped for 3 times that many cycles, and the APU is
stepped for the same number of cycles as the CPU.
</p>
<p>
A problem presents itself here.
Our recompiled code replaces the CPU stepping, but we still have the PPU
and the APU code to reckon with.
We can start by eliminating audio from the equation and dealing with sound
later.
But no matter how you spin it, the fact remains that the PPU and the CPU
run independently of one another, and at differing speeds.
</p>
<p>
You can choose to recompile program code and run that as the main loop, but
then after every instruction
you must emulate the PPU for 3 times as many cycles as the instruction took.
Or you can choose to run the PPU as the main loop, but then after the
appropriate number of cycles you must emulate the CPU for one instruction.
</p>
<p>
<em>One of the systems must be emulated.</em>
</p>
<p>
This is how I solved the problem:
</p>
<ul>
  <li>
    Have code generation call
    <code>rom_cycle</code>, an external function, after every instruction completes,
    passing the number of cycles the instruction took as a parameter.
  </li>
  <li>
    Bundle in a runtime with the generated executable which implements <code>rom_cycle</code>
    and emulates the PPU for the appropriate number of steps.
  </li>
</ul>
<p>
It is a shame that we have to do some amount of emulation in this project, where the
goal is to statically recompile as much as possible.
The solution to this challenge represents a slight compromise to the project's integrity.
Yet we press on.
</p>
<p>
Given this solution, I ported the PPU code as well as the
<a href="http://www.libsdl.org/">SDL</a>
and
<a href="http://www.opengl.org/">OpenGL</a>
front-end code from Fergulator to a
<a href="https://github.com/andrewrk/jamulator/tree/c9b4de0424d4dcc623f594be4a165685874713fa/runtime">small C runtime</a>,
which is compiled with clang.
Here is a snippet explaining the <code>rom_cycle</code> and
<code>main</code> functions:
</p>
<pre>
<code class="language-c">#include "rom.h"
#include "assert.h"
#include "ppu.h"
#include "SDL/SDL.h"
#include "GL/glew.h"

Ppu* p;

// This function is called by the generated module after every instruction.
void rom_cycle(uint8_t cycles) {
    // Check the SDL event loop and quit if the Close event occurs.
    flush_events();

    // Step the PPU for 3 times the number of cycles that just finished.
    for (int i = 0; i &lt; 3 * cycles; ++i) {
        Ppu_step(p);
    }
}

// This function is our new main entry point. We rename the
// main rom entry point to `rom_start` so that we can call it
// from this function.
int main(int argc, char* argv[]) {
    // Create a new instance of the PPU emulator core.
    p = Ppu_new();

    // The PPU code will call `render` when there is a frame ready to display
    // on the screen. The `render` function performs the SDL and OpenGL
    // calls to render the frame to the window.
    p-&gt;render = &amp;render;

    // Remember that in the generated rom module, we export the ROM mirroring
    // setting after reading it from the .nes file. Here we use it to configure
    // the nametable code.
    Nametable_setMirroring(&amp;p-&gt;nametables, rom_mirroring);

    // We currently only support ROMs with 1 CHR bank.
    assert(rom_chr_bank_count == 1);

    // In the generated rom module, we have the CHR ROM data in memory, as well
    // as `rom_read_chr`, an exported function which will copy the data to a
    // pointer. We use that to initialize the video RAM.
    rom_read_chr(p-&gt;vram);

    // This does the SDL and OpenGL setup such as creating the display window.
    init_video();

    // Here we call into the main entry point to the recompiled ROM code.
    rom_start();

    // Free up memory associated with the PPU emulator instance.
    Ppu_dispose(p);
}

uint8_t rom_ppu_read_status() {
    return Ppu_readStatus(p);
}

void rom_ppu_write_control(uint8_t b) {
    Ppu_writeControl(p, b);
}

void rom_ppu_write_mask(uint8_t b) {
    Ppu_writeMask(p, b);
}

void rom_ppu_write_oamaddress(uint8_t b) {
    Ppu_writeOamAddress(p, b);
}

void rom_ppu_write_address(uint8_t b) {
    Ppu_writeAddress(p, b);
}

void rom_ppu_write_data(uint8_t b) {
    Ppu_writeData(p, b);
}

void rom_ppu_write_oamdata(uint8_t b) {
    Ppu_writeOamData(p, b);
}

void rom_ppu_write_scroll(uint8_t b) {
    Ppu_writeScroll(p, b);
}</code>
</pre>
<p>
Notice that we include <code>rom.h</code>. This is a header file that defines the contract
that the generated rom module will fulfill:
</p>
<pre>
<code class="language-c">#include "stdint.h"

enum {
    ROM_MIRRORING_VERTICAL,
    ROM_MIRRORING_HORIZONTAL,
    ROM_MIRRORING_SINGLE_UPPER,
    ROM_MIRRORING_SINGLE_LOWER,
};

extern uint8_t rom_mirroring;
extern uint8_t rom_chr_bank_count;

// write the chr rom into dest
void rom_read_chr(uint8_t* dest);

// starts executing the PRG ROM.
// this function returns when the program exits.
void rom_start();

// called after every instruction with the number of
// cpu cycles that have passed.
void rom_cycle(uint8_t);

// PPU hooks
uint8_t rom_ppu_read_status();
void rom_ppu_write_control(uint8_t);
void rom_ppu_write_mask(uint8_t);
void rom_ppu_write_oamaddress(uint8_t);
void rom_ppu_write_oamdata(uint8_t);
void rom_ppu_write_scroll(uint8_t);
void rom_ppu_write_address(uint8_t);
void rom_ppu_write_data(uint8_t);</code>
</pre>
<p>
When we build the final executable file, we can link against
this runtime to get a fully operational executable with
a video display.
</p>
<p>
After adding runtime compilation instructions, <code>make</code> does this:
</p>
<pre>
$ make
go tool yacc -o jamulator/y.go -v /dev/null jamulator/asm6502.y
/home/andy/gocode/bin/nex -e jamulator/asm6502.nex
clang -o runtime/main.o -c runtime/main.c
clang -o runtime/ppu.o -c runtime/ppu.c
clang -o runtime/nametable.o -c runtime/nametable.c
ar rcs runtime/runtime.a runtime/main.o runtime/ppu.o runtime/nametable.o
go build -o jamulate main.go
$
</pre>
<p>
Next we can implement a command which will read a .nes ROM file and perform
the recompilation process:
</p>
<pre>
$ ./jamulate -recompile roms/Zelda.NES
loading roms/Zelda.NES
Disassembling...
Decompiling to /tmp/302858262/prg.bc...
llc -o /tmp/302858262/prg.o -filetype=obj /tmp/302858262/prg.bc
gcc /tmp/302858262/prg.o runtime/runtime.a -lGLEW -lGL -lSDL -lSDL_gfx -o roms/Zelda
Done: roms/Zelda
$ file ./roms/Zelda
./roms/Zelda: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0xa79ee93a78a4745b60ba9d10a58b209d756e8ffa, not stripped
$
</pre>
<p>
And so here we have it: an executable binary that depends on SDL and OpenGL, which
supposedly will execute the Zelda Title Screen Simulator when run. Does it work?
</p>
<img src="https://superjoe.s3.amazonaws.com/blog-files/jamulator/zelda-exe.png" alt="screenshot of Zelda title screen with bent sword and an ubuntu terminal showing some log output">
<p>
<small>
Note: There is an obvious discrepancy between the screenshot and the output listed
above. All output is real; the difference is that the article is organized, ordered,
and simplified for clarity and understanding. I did not necessarily write the code
in the same order that it is listed in this article.
</small>
</p>
<p>
So we made a small concession to solve this challenge, and we have a
simple demo NES ROM running natively.
Next let's see if we can fix the dent in that sword.
</p>
<h2 id="interrupts">Challenge: Handling Interrupts</h2>
<p>
In NES programming, after any instruction, it is possible that
the program counter is yanked away from the next expected instruction and instead
sent to a predefined location.
This is called an <strong>interrupt</strong>.
</p>
<p>
There are 3 kinds of interrupts in NES programming:
</p>
<dl>
  <dt>Reset</dt>
  <dd>
    Occurs when the user presses the reset button.
    This is also where the program counter starts when the NES
    powers on.
  </dd>
  <dt>IRQ</dt>
  <dd>
    Stands for Interrupt ReQuest. This interrupt fires only
    if the game uses a mapper or executes the <code>BRK</code> instruction.
    It can be enabled with the <code>CLI</code> instruction and
    disabled with <code>SEI</code>.
  </dd>
  <dt>NMI</dt>
  <dd>
  Stands for Non-Maskable Interrupt because there are no instructions
  to enable and disable this interrupt. It occurs when the
  <a href="https://en.wikipedia.org/wiki/Vertical_blanking_interval">vertical blank</a>
  begins.
  </dd>
</dl>
<p>
When an interrupt occurs, the program counter and status register are
pushed onto the stack, and the program counter is set to the location
defined by the <em>interrupt vector table</em>, as seen in code listings above:
</p>
<pre>
<code class="language-6502">.org $fffa
    .dw NMI_Routine
    .dw Reset_Routine
    .dw IRQ_Routine</code>
</pre>
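<p>
To make the table concrete: those three <code>.dw</code> entries occupy the last six
bytes of the address space, $FFFA-$FFFF, each stored as a little-endian word. Here is
a small Go sketch of decoding them from a memory image (an illustrative helper, not
code from jamulator):
</p>
<pre>
<code class="language-go">package main

import "fmt"

// readVectors decodes the three interrupt vectors from the last six
// bytes of the 6502 address space, stored little-endian (low byte first).
func readVectors(mem []byte) (nmi, reset, irq uint16) {
	word := func(addr int) uint16 {
		return uint16(mem[addr]) + uint16(mem[addr+1])*0x100
	}
	return word(0xfffa), word(0xfffc), word(0xfffe)
}

func main() {
	mem := make([]byte, 0x10000)
	// .dw NMI_Routine, Reset_Routine, IRQ_Routine - say, hypothetically,
	// NMI at $8000, Reset at $c000, IRQ at $9000.
	copy(mem[0xfffa:], []byte{0x00, 0x80, 0x00, 0xc0, 0x00, 0x90})
	nmi, reset, irq := readVectors(mem)
	fmt.Printf("NMI=$%04x Reset=$%04x IRQ=$%04x\n", nmi, reset, irq)
}</code>
</pre>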
<p>
When a program has finished processing an interrupt,
it typically executes the <code>RTI</code> instruction
to return control flow back to where it was before
the interrupt occurred.
</p>
<p>
We don't yet have to deal with handling the reset button
being pressed, mappers, or games which execute the
<code>BRK</code> instruction,
so let's work on solving this
problem for the NMI interrupt only.
</p>
<p>
The PPU performs the vertical blanking which signals the NMI interrupt,
and the PPU emulation code is in the runtime.
In order to handle NMI interrupts correctly, we have to be ready to jump
to the interrupt routine after every single instruction.
</p>
<p>
We already call into the runtime by calling <code>rom_cycle</code> after
every instruction. This is where the PPU emulation is performed, which
is where the NMI interrupt is generated.
Given this, a natural solution might be something like this:
</p>
<ul>
  <li>
  Have <code>rom_cycle</code> return a value indicating which interrupt,
  if any, has occurred.
  </li>
  <li>
  After every instruction, check if an interrupt has occurred, and if so,
  jump to the interrupt routine. Otherwise, continue.
  </li>
</ul>
<p>
This solution is rather ugly, however. Branching after every instruction
decreases execution speed and executable leanness. Further, it leaves a
critical problem unsolved: how to jump back to where the program counter
was before the interrupt occurred.
The solution I came up with instead feels better, but it comes with a caveat.
Here's the idea:
</p>
<ul>
  <li>
  Switch the register variables from being allocated on the stack in
  <code>rom_start</code>, to global variables in the generated module.
  </li>
  <li>
  Update <code>rom_start</code>, the main entry point, to take a parameter
  indicating which interrupt vector to execute.
  </li>
  <li>
  In the runtime, when <code>rom_start</code> is called for the first time,
  pass Reset as the interrupt vector to execute.
  </li>
  <li>
  In the generated <code>rom_start</code> code, insert code at the beginning
  of the NMI interrupt routine block to push the program counter and the
  processor status to the stack.
  </li>
  <li>
  In <code>rom_cycle</code>, if there is no interrupt, return as normal.
  However, if there is an interrupt, call <code>rom_start</code>, passing
  the correct interrupt vector to execute.
  </li>
  <li>
  Generate a return statement for the <code>RTI</code> instruction.
  </li>
</ul>
<p>
The beauty of this solution is that it uses the real native stack for interrupts,
in what is probably the most efficient and elegant way to get the desired behavior.
</p>
<p>
The caveat is that if the game uses <code>RTI</code> for its side-effects instead
of the usual "return from interrupt" behavior, the executable will unexpectedly exit.
Further, if the game simulates returning from an interrupt without using
<code>RTI</code>, the game will crash due to a stack overflow.
</p>
<p>
Acknowledging these weaknesses, let's plow ahead until we are forced to solve
this problem a different way.
</p>
<p>
Let's see what it looks like to implement the plan.
</p>
<p>
  Update <code>rom_start</code> to take a parameter indicating which interrupt
  vector to execute:
</p>
<pre>
<code class="language-go">func (p *Program) Compile(filename string) (c *Compilation) {
	// ...

	mainType := llvm.FunctionType(llvm.VoidType(), []llvm.Type{llvm.Int8Type()}, false)
	c.mainFn = llvm.AddFunction(c.mod, "rom_start", mainType)
	c.mainFn.SetFunctionCallConv(llvm.CCallConv)

	// ...
}</code>
</pre>
<p>
Add a reset vector enum to rom.h:
</p>
<pre>
<code class="language-c">enum {
    ROM_INTERRUPT_NONE,
    ROM_INTERRUPT_NMI,
    ROM_INTERRUPT_RESET,
    ROM_INTERRUPT_IRQ,
};</code>
</pre>
<p>
  In the runtime, when <code>rom_start</code> is called for the first time,
  pass Reset as the interrupt vector to execute.
  Also, in <code>rom_cycle</code>, if there is no interrupt, return as normal.
  However, if there is an interrupt, call <code>rom_start</code>, passing
  the correct interrupt vector to execute:
</p>
<pre>
<code class="language-c">int interruptRequested = ROM_INTERRUPT_NONE;

void rom_cycle(uint8_t cycles) {
    flush_events();

    for (int i = 0; i &lt; 3 * cycles; ++i) {
        Ppu_step(p);
    }

    int req = interruptRequested;
    if (req != ROM_INTERRUPT_NONE) {
        interruptRequested = ROM_INTERRUPT_NONE;
        rom_start(req);
    }
}

void vblankInterrupt() {
    interruptRequested = ROM_INTERRUPT_NMI;
}

int main(int argc, char* argv[]) {
    p = Ppu_new();
    p-&gt;render = &amp;render;

    // The PPU code will call this function when an NMI interrupt occurs.
    p-&gt;vblankInterrupt = &amp;vblankInterrupt;

    Nametable_setMirroring(&amp;p-&gt;nametables, rom_mirroring);
    assert(rom_chr_bank_count == 1);
    rom_read_chr(p-&gt;vram);
    init_video();

    // Start the rom executing at the Reset interrupt routine.
    rom_start(ROM_INTERRUPT_RESET);

    Ppu_dispose(p);
}</code>
</pre>
<p>
  In the generated <code>rom_start</code> code, insert code at the beginning
  of the NMI interrupt routine block to push the program counter and the
  processor status to the stack:
</p>
<pre>
<code class="language-go">func (c *Compilation) addNmiInterruptCode() {
	c.builder.SetInsertPointBefore(c.nmiBlock.FirstInstruction())
	// * push PC high onto stack
	// * push PC low onto stack
	c.pushWordToStack(c.builder.CreateLoad(c.rPC, ""))
	// * push processor status onto stack
	c.pushToStack(c.getStatusByte())
}

func (c *Compilation) pushWordToStack(word llvm.Value) {
	high16 := c.builder.CreateLShr(word, llvm.ConstInt(llvm.Int16Type(), 8, false), "")
	high := c.builder.CreateTrunc(high16, llvm.Int8Type(), "")
	c.pushToStack(high)
	low16 := c.builder.CreateAnd(word, llvm.ConstInt(llvm.Int16Type(), 0xff, false), "")
	low := c.builder.CreateTrunc(low16, llvm.Int8Type(), "")
	c.pushToStack(low)
}

func (c *Compilation) pushToStack(v llvm.Value) {
	// write the value to the address at current stack pointer
	sp := c.builder.CreateLoad(c.rSP, "")
	spZExt := c.builder.CreateZExt(sp, llvm.Int16Type(), "")
	addr := c.builder.CreateAdd(spZExt, llvm.ConstInt(llvm.Int16Type(), 0x100, false), "")
	c.dynStore(addr, v)
	// stack pointer = stack pointer - 1
	spMinusOne := c.builder.CreateSub(sp, llvm.ConstInt(llvm.Int8Type(), 1, false), "")
	c.builder.CreateStore(spMinusOne, c.rSP)
}

func (c *Compilation) getStatusByte() llvm.Value {
	// zextend
	s7z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSNeg, ""), llvm.Int8Type(), "")
	s6z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSOver, ""), llvm.Int8Type(), "")
	s4z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSBrk, ""), llvm.Int8Type(), "")
	s3z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSDec, ""), llvm.Int8Type(), "")
	s2z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSInt, ""), llvm.Int8Type(), "")
	s1z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSZero, ""), llvm.Int8Type(), "")
	s0z := c.builder.CreateZExt(c.builder.CreateLoad(c.rSCarry, ""), llvm.Int8Type(), "")
	// shift
	s7z = c.builder.CreateShl(s7z, llvm.ConstInt(llvm.Int8Type(), 7, false), "")
	s6z = c.builder.CreateShl(s6z, llvm.ConstInt(llvm.Int8Type(), 6, false), "")
	s4z = c.builder.CreateShl(s4z, llvm.ConstInt(llvm.Int8Type(), 4, false), "")
	s3z = c.builder.CreateShl(s3z, llvm.ConstInt(llvm.Int8Type(), 3, false), "")
	s2z = c.builder.CreateShl(s2z, llvm.ConstInt(llvm.Int8Type(), 2, false), "")
	s1z = c.builder.CreateShl(s1z, llvm.ConstInt(llvm.Int8Type(), 1, false), "")
	// or
	s0z = c.builder.CreateOr(s0z, s1z, "")
	s0z = c.builder.CreateOr(s0z, s2z, "")
	s0z = c.builder.CreateOr(s0z, s3z, "")
	s0z = c.builder.CreateOr(s0z, s4z, "")
	s0z = c.builder.CreateOr(s0z, s6z, "")
	s0z = c.builder.CreateOr(s0z, s7z, "")
	return s0z
}</code>
</pre>
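<p>
The LLVM builder calls are dense, so here is the same logic modeled in plain Go
(a sketch for illustration only, mirroring the functions above): the stack occupies
page $0100-$01FF, the stack pointer decrements on push, and the status byte packs
the seven flags into bits 7, 6, 4, 3, 2, 1, and 0 (bit 5 is unused on the 6502).
</p>
<pre>
<code class="language-go">package main

import "fmt"

type cpuState struct {
	mem [0x10000]byte
	sp  byte
	// status flags: negative, overflow, brk, decimal, interrupt, zero, carry
	n, v, b, d, i, z, c bool
}

// push mirrors pushToStack: store at $0100+SP, then SP = SP - 1.
func (cp *cpuState) push(val byte) {
	cp.mem[0x100+uint16(cp.sp)] = val
	cp.sp--
}

// pushWord mirrors pushWordToStack: high byte first, then low byte.
func (cp *cpuState) pushWord(w uint16) {
	cp.push(byte(w / 0x100))
	cp.push(byte(w))
}

// statusByte mirrors getStatusByte: pack the seven flags into one byte.
func (cp *cpuState) statusByte() byte {
	bit := func(flag bool, mask byte) byte {
		if flag {
			return mask
		}
		return 0
	}
	return bit(cp.n, 0x80) | bit(cp.v, 0x40) | bit(cp.b, 0x10) |
		bit(cp.d, 0x08) | bit(cp.i, 0x04) | bit(cp.z, 0x02) | bit(cp.c, 0x01)
}

func main() {
	cp := new(cpuState)
	cp.sp = 0xfd
	cp.pushWord(0x8212) // the interrupted PC
	cp.n, cp.z, cp.c = true, true, true
	cp.push(cp.statusByte())
	fmt.Printf("SP=$%02x status=$%02x\n", cp.sp, cp.mem[0x1fb])
}</code>
</pre>
<p>
Pushing the high byte first means the word sits on the stack in little-endian order,
which is exactly what <code>RTI</code> expects when it pops the PC back.
</p>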
<p>
  Generate a return statement for the <code>RTI</code> instruction:
</p>
<pre>
<code class="language-go">func (i *Instruction) Compile(c *Compilation) {
	switch i.OpCode {
	// ... other cases
	case 0x40: // RTI implied
		// Restore the old status register values from the stack.
		c.pullStatusReg()

		// We get the PC from the stack, but since we rely on the native
		// stack, we trash the value.
		_ = c.pullWordFromStack()

		// This will call the `rom_cycle` function. RTI always takes
		// 6 cycles.
		c.cycle(6)

		// Generate a return statement.
		c.builder.CreateRetVoid()

		// So that when another block is encountered, we do not
		// create an unconditional branch to it.
		c.currentBlock = nil

	// ... other cases
	}
}

func (c *Compilation) cycle(count int) {
	v := llvm.ConstInt(llvm.Int8Type(), uint64(count), false)
	c.builder.CreateCall(c.cycleFn, []llvm.Value{v}, "")
}</code>
</pre>
<p>
And so here we have it: an elegant but imperfect solution to the interrupt problem.
Let's see if it fixes the bent sword:
</p>
<img src="https://superjoe.s3.amazonaws.com/blog-files/jamulator/zelda-straight.png" alt="screenshot of Zelda title screen with a straight sword">
<p>
Looks like progress to me!
At this point the project can recompile a small, simple title screen demo
NES program. The real challenge awaits: can it be made to work for a real NES game?
</p>
<p>
There are many games to choose from. Ideally we would pick one that poses fewer
additional challenges. One filter we can apply is to eliminate games that use mappers,
since we have hitherto ignored mapper support entirely.
</p>
<p>
This limits the choices significantly. The only games worth noting that do not use
a mapper are:
</p>
<ul>
  <li><a href="http://bootgod.dyndns.org:7777/profile.php?id=1091">Donkey Kong</a></li>
  <li><a href="http://bootgod.dyndns.org:7777/profile.php?id=281">Ice Climber</a></li>
  <li><a href="http://bootgod.dyndns.org:7777/profile.php?id=18">Excitebike</a></li>
  <li><a href="http://bootgod.dyndns.org:7777/profile.php?id=1099">Mario Bros.</a></li>
  <li><a href="http://bootgod.dyndns.org:7777/profile.php?id=270">Super Mario Brothers 1</a></li>
  <li><a href="http://bootgod.dyndns.org:7777/profile.php?id=570">Pac-Man</a></li>
</ul>
<p>
Of these, there is an obvious answer to which game we should support first,
which, of course, is Super Mario Brothers 1.
</p>
<p>
Our next challenge is revealed when we crack open the disassembly for SMB.
</p>
<h2 id="detect-jump-table">Challenge: Detecting a Jump Table</h2>
<p>
Here's a small section of Super Mario Brothers 1 disassembly:
</p>
<pre>
<code class="language-6502">Label_8212:
    LDA $0770
    JSR Label_8e04
    AND ($82), Y
    .db $dc, $ae, $8b, $83, $18, $92</code>
</pre>
<p>
Notice anything peculiar?
</p>
<p>
After the value at <code>$0770</code> is loaded into register A, and we return
from the <code>Label_8e04</code> subroutine,
there is an uncommon indirect <code>AND</code> instruction,
followed by data.
What could possibly be happening here?
</p>
<p>
Super Mario Brothers 1 is using a common assembly programming technique
called a <em>dynamic jump table</em>.
Take a look at the <code>Label_8e04</code> subroutine:
</p>
<pre>
<code class="language-6502">Label_8e04:
    ASL
    TAY
    PLA
    STA $04
    PLA
    STA $05
    INY
    LDA ($04), Y
    STA $06
    INY
    LDA ($04), Y
    STA $07
    JMP ($0006)</code>
</pre>
<p>
Notice that although this label is jumped to with <code>JSR</code>,
it never uses the <code>RTS</code> instruction.
Let's break it down further into readable pseudocode:
</p>
<pre>
<code class="language-6502">; Dynamic Jump Table. Call this label with JSR
; so that the old PC is on the stack.
; Immediately following the JSR statement should be
; .dw statements indicating the labels to jump to
; depending on the value of register A.
Label_8e04:
    ; Register A holds the index of the label that we wish to jump to.
    ; Multiply A by 2 because each table entry is 2 bytes.
    ASL           ; A = A * 2

    ; The useful indirect instructions use Y as the index, and we need
    ; to repurpose A.
    TAY           ; Y = A

    ; Since this label was called with JSR, the old PC is on the top
    ; of the stack. Here we get the lower byte since this is a little
    ; endian system.
    PLA           ; A = Stack.Pop()

    ; Save the lower byte of the old PC into memory.
    STA $04       ; Memory[$04] = A

    ; Get the higher byte of the old PC off the stack.
    PLA           ; A = Stack.Pop()

    ; Save the higher byte of the old PC into memory.
    STA $05       ; Memory[$05] = A

    ; JSR pushes the address - 1 of the next instruction to the stack.
    ; So we add 1 to Y to get the index of the first byte of the jump
    ; destination.
    INY           ; Y = Y + 1

    ; Get the first byte of the jump destination.
    LDA ($04), Y  ; A = Memory[Memory[$04] + Y]

    ; Save the first byte of the jump destination.
    STA $06       ; Memory[$06] = A

    ; Increment Y to get the index of the 2nd byte of the jump destination.
    INY           ; Y = Y + 1

    ; Get the 2nd byte of the jump destination.
    LDA ($04), Y  ; A = Memory[Memory[$04] + Y]

    ; Save the 2nd byte of the jump destination.
    STA $07       ; Memory[$07] = A

    ; Jump to the location that was just constructed.
    JMP ($0006)     ; Jump to address at $0006 - $0007</code>
</pre>
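<p>
To double-check the mechanics, we can model what <code>Label_8e04</code> computes in
plain Go (a hypothetical helper for illustration; the memory layout matches the
<code>Label_8212</code> listing above): <code>JSR</code> pushed the address of its own
last byte, so the <code>.dw</code> table begins one byte past the pushed address, and
register A indexes a little-endian entry.
</p>
<pre>
<code class="language-go">package main

import "fmt"

// jumpTableTarget computes where Label_8e04 ends up jumping, given the
// return address JSR pushed and the value of register A.
func jumpTableTarget(mem []byte, retAddr uint16, a byte) uint16 {
	y := uint16(a) * 2             // ASL ; TAY - two bytes per table entry
	lo := uint16(mem[retAddr+y+1]) // INY ; LDA ($04), Y
	hi := uint16(mem[retAddr+y+2]) // INY ; LDA ($04), Y
	return lo + hi*0x100           // JMP ($0006)
}

func main() {
	mem := make([]byte, 0x10000)
	// The bytes following the JSR in Label_8212.
	copy(mem[0x8218:], []byte{0x31, 0x82, 0xdc, 0xae, 0x8b, 0x83, 0x18, 0x92})
	// The JSR at $8215 is 3 bytes long, so it pushes $8217.
	fmt.Printf("A=0 jumps to $%04x\n", jumpTableTarget(mem, 0x8217, 0))
	fmt.Printf("A=1 jumps to $%04x\n", jumpTableTarget(mem, 0x8217, 1))
}</code>
</pre>
<p>
Plugging in A = 0 through 3 reproduces <code>Label_8231</code>, <code>Label_aedc</code>,
<code>Label_838b</code>, and <code>Label_9218</code> from the listing above.
</p>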
<p>
If we know that <code>Label_8e04</code> is a jump table, we can
mark the bytes following the <code>JSR</code> as <code>.dw</code>
labels which enables us to further disassemble the program.
The disassembled snippet from earlier would look like this:
</p>
<pre>
<code class="language-6502">Label_8212:
    LDA $0770
    JSR Label_8e04
    .dw Label_8231
    .dw Label_aedc
    .dw Label_838b
    .dw Label_9218</code>
</pre>
<p>
Without this jump table detection, the bytes at each of those labels
remain <code>.db</code> statements, unable to be decoded.
This is problematic because our strategy currently
depends on all instructions being completely disassembled so that
they can be decompiled and recompiled.
</p>
<p>
This is not a mere corner-case either. This technique is repeated in many games,
including
<a href="http://sonicepoch.com/sm3mix/disassembly.html#DynJump">Super Mario Brothers 3</a> and Pac-Man.
</p>
<p>
Fortunately, it is straightforward to identify and process a jump table
like this without changing too much code.
I solved it with a state machine - essentially pattern-matching or
a regular expression for instructions:
</p>
<pre>
<code class="language-go">func (d *Disassembly) detectJumpTable(addr int) bool {
	const (
		expectAsl = iota
		expectTay
		expectPlaA
		expectStaA
		expectPlaB
		expectStaB
		expectInyC
		expectLdaC
		expectStaC
		expectInyD
		expectLdaD
		expectStaD
		expectJmp
	)
	state := expectAsl
	var memA, memC int
	for elem := d.prog.elemAtAddr(addr); elem != nil; elem = elem.Next() {
		// Every state expects an instruction; if we run into data,
		// this is not a jump table.
		i, ok := elem.Value.(*Instruction)
		if !ok {
			return false
		}
		switch state {
		case expectAsl:
			if i.OpCode != 0x0a {
				return false
			}
			state = expectTay
		case expectTay:
			if i.OpCode != 0xa8 {
				return false
			}
			state = expectPlaA
		case expectPlaA:
			if i.OpCode != 0x68 {
				return false
			}
			state = expectStaA
		case expectStaA:
			if i.OpCode != 0x85 &amp;&amp; i.OpCode != 0x8d {
				return false
			}
			memA = i.Value
			state = expectPlaB
		case expectPlaB:
			if i.OpCode != 0x68 {
				return false
			}
			state = expectStaB
		case expectStaB:
			if i.OpCode != 0x85 &amp;&amp; i.OpCode != 0x8d {
				return false
			}
			if i.Value != memA+1 {
				return false
			}
			state = expectInyC
		case expectInyC:
			if i.OpCode != 0xc8 {
				return false
			}
			state = expectLdaC
		case expectLdaC:
			if i.OpCode != 0xb1 || i.Value != memA {
				return false
			}
			state = expectStaC
		case expectStaC:
			if i.OpCode != 0x85 &amp;&amp; i.OpCode != 0x8d {
				return false
			}
			memC = i.Value
			state = expectInyD
		case expectInyD:
			if i.OpCode != 0xc8 {
				return false
			}
			state = expectLdaD
		case expectLdaD:
			if i.OpCode != 0xb1 || i.Value != memA {
				return false
			}
			state = expectStaD
		case expectStaD:
			if i.OpCode != 0x85 &amp;&amp; i.OpCode != 0x8d {
				return false
			}
			if i.Value != memC+1 {
				return false
			}
			state = expectJmp
		case expectJmp:
			return i.OpCode == 0x6c &amp;&amp; i.Value == memC
		}
	}
	return false
}</code>
</pre>
<p>
Given this detection function, it is a matter of adding logic to
the <code>JSR</code> disassembly. If a jump table is detected,
mark the following bytes as <code>.dw</code> label statements.
Otherwise, continue marking the next address as an instruction
as usual.
</p>
<p>
After adding jump table detection, it looks like all the program instructions
are disassembled successfully. But there are still some tricks up the
assembly programmers' sleeves.
</p>
<h2 id="indirect-jumps">Challenge: Indirect Jumps</h2>
<p>
We just looked at a dynamic jump table implementation which included this
instruction:
</p>
<pre>
<code class="language-6502">JMP ($0006)</code>
</pre>
<p>
This is problematic because we do not know what will be in the memory addresses
$0006 and $0007 until the instruction is actually executed.
</p>
<p>
One solution is to use our own jump table.
We can use our knowledge of the address of each label
to create a basic block to jump to when an indirect jump is encountered:
</p>
<pre>
<code class="language-go">func (c *Compilation) addDynJumpTable() {
	c.dynJumpBlock = llvm.AddBasicBlock(c.mainFn, "DynJumpTable")
	c.builder.SetInsertPointAtEnd(c.dynJumpBlock)
	pc := c.builder.CreateLoad(c.rPC, "")
	// Here, panic block causes a runtime error and crashes.
	sw := c.builder.CreateSwitch(pc, panicBlock, len(c.dynJumpAddrs))
	for addr, block := range c.dynJumpAddrs {
		addrVal := llvm.ConstInt(llvm.Int16Type(), uint64(addr), false)
		sw.AddCase(addrVal, block)
	}
}</code>
</pre>
<p>
And when a <code>JMP</code> indirect is encountered:
</p>
<pre>
<code class="language-go">func (i *Instruction) Compile(c *Compilation) {
	switch i.OpCode {
	// ... other cases

	case 0x6c: // jmp indirect
		newPc := c.loadWord(i.Value)
		c.builder.CreateStore(newPc, c.rPC)
		c.cycle(5)
		c.builder.CreateBr(c.dynJumpBlock)
		c.currentBlock = nil

	// ... other cases
	}
}</code>
</pre>
<p>
This ends up looking something like this in the generated module:
</p>
<pre>
<code class="language-llvm">DynJumpTable:
  %63674 = load i16* @PC
  switch i16 %63674, label %PanicBlock [
    i16 -11559, label %Label_d2d9
    i16 -3527, label %Label_f239
    i16 -27243, label %Label_9595
    i16 -28888, label %Label_8f28

    ; (about 2,000 more cases)

    i16 -3045, label %Label_f41b
    i16 -18147, label %Label_b91d
  ]</code>
</pre>
<p>
This solution will work as long as the indirect jump chooses to jump to one
of the labels.
However, it causes a runtime crash if the indirect jump
sets the PC to anything else.
As we will soon find out, assembly programmers not only make the processor
do exactly that - they commit even more heinous acts.
</p>
<h2 id="dirty-assembly-tricks">Challenge: Dirty Assembly Tricks</h2>
<p>
Super Mario Brothers 1 is an amazing technical feat.
Every last byte of the 32KB available program space is utilized.
In fact, some bytes are even dual-purposed to save space.
Have a look at this code from our Super Mario 1 disassembly:
</p>
<pre>
<code class="language-6502">Label_8220:
    LDY #$00
    .db $2c
Label_8223:
    LDY #$04
    LDA #$f8
Label_8227:
    STA $0200, Y
    INY
    INY
    INY
    INY
    BNE Label_8227
    RTS</code>
</pre>
<p>
Note the <code class="language-6502">.db $2c</code>.
</p>
<p>
$2c is the op code for <code>BIT</code> absolute, which is 3 bytes - 1 byte op code
and then 2 bytes for the absolute address.
</p>
<p>
<code class="language-6502">LDY #$04</code> is 2 bytes - $a0 for the op code and then
$04 for the immediate value.
</p>
<p>
Here is how it works: if you jump to <code>Label_8220</code>, Y is set to $00,
and then the stray $2c byte <em>sabotages</em> the next instruction, so Y = $04
never happens. Instead, the two bytes of <code class="language-6502">LDY #$04</code>
become the operand of a <code>BIT</code> absolute, which sets some status bits in a
way that does not matter. Execution then picks up at the next instruction,
<code class="language-6502">LDA #$f8</code>,
as if you had jumped to <code>Label_8223</code> but with a different Y.
</p>
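<p>
You can watch the two decodings diverge with a tiny instruction-length walker
(a sketch, with lengths hard-coded for just the op codes in this snippet):
</p>
<pre>
<code class="language-go">package main

import "fmt"

// instrLen hard-codes the lengths of the op codes in this snippet:
// LDY immediate ($a0) and LDA immediate ($a9) are 2 bytes,
// BIT absolute ($2c) is 3 bytes.
func instrLen(op byte) int {
	switch op {
	case 0xa0, 0xa9:
		return 2
	case 0x2c:
		return 3
	}
	return 1
}

// decodeFrom lists the op codes encountered decoding count instructions
// starting at offset. The caller keeps count within bounds.
func decodeFrom(code []byte, offset, count int) []byte {
	var ops []byte
	for n := 0; n != count; n++ {
		ops = append(ops, code[offset])
		offset += instrLen(code[offset])
	}
	return ops
}

func main() {
	// The bytes at Label_8220: LDY #$00 ; .db $2c ; LDY #$04 ; LDA #$f8
	code := []byte{0xa0, 0x00, 0x2c, 0xa0, 0x04, 0xa9, 0xf8}
	fmt.Printf("from Label_8220: % x\n", decodeFrom(code, 0, 3))
	fmt.Printf("from Label_8223: % x\n", decodeFrom(code, 3, 2))
}</code>
</pre>
<p>
Decoding from <code>Label_8220</code> yields LDY, BIT, LDA - the second LDY is
swallowed as the operand of BIT - while decoding from <code>Label_8223</code>
yields LDY, LDA.
</p>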
<p>
This occurs over a dozen times in Super Mario 1. Similarly, there are instances
where the program <em>jumps into the middle of an instruction</em>.
</p>
<p>
Yet another trick is pushing an address onto the stack and then using the
<code>RTS</code> instruction to jump there, kind of like a homebrew
indirect <code>JMP</code> instruction.
And even with indirect <code>JMP</code> instructions, the programmer
may choose to jump to RAM or somewhere other than a label.
</p>
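<p>
The <code>RTS</code> trick works because <code>RTS</code> adds 1 to the word it pops,
so the program must push the target address minus 1. A Go sketch of the arithmetic
(an illustrative helper, not emulator code):
</p>
<pre>
<code class="language-go">package main

import "fmt"

// rtsTarget models RTS-as-jump: the program pushes (target - 1),
// high byte first, and RTS pops the word and adds 1.
func rtsTarget(target uint16) uint16 {
	var stack []byte
	push := func(b byte) { stack = append(stack, b) }
	pop := func() byte {
		b := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		return b
	}
	t := target - 1
	push(byte(t / 0x100)) // PHA - high byte
	push(byte(t))         // PHA - low byte
	lo := uint16(pop())   // RTS pops the low byte first...
	hi := uint16(pop())   // ...then the high byte...
	return lo + hi*0x100 + 1 // ...and sets PC to the word plus 1
}

func main() {
	fmt.Printf("RTS jumps to $%04x\n", rtsTarget(0x8e04))
}</code>
</pre>
<p>
This off-by-one is also why <code>JSR</code> pushes the address of its own last byte
rather than the address of the next instruction.
</p>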
<p>
These issues <em>must</em> be resolved if we want a playable game.
Sadly, the solution marks the final nail in the coffin of the integrity
of this project.
</p>
<p>
The solution is to embed an interpreter runtime in the generated binary:
</p>
<ul>
  <li>
  Instead of identifying data that is read and only including that in the
  generated module, include all the PRG ROM, since we don't know which
  addresses may be accessed at runtime.
  </li>
  <li>
  After every instruction, update the program counter variable.
  </li>
  <li>
  Include a basic block in <code>rom_start</code> called <code>Interpret</code>
  which reads the program counter variable, reads the op code from the PRG ROM,
  performs the necessary operation, and then jumps to the <code>DynJumpTable</code> block.
  </li>
  <li>
  Update the <code>DynJumpTable</code> block so that the default case jumps to the
  <code>Interpret</code> block instead of panicking.
  </li>
  <li>
  When control flow runs into data, branch to the <code>Interpret</code> block.
  </li>
</ul>
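<p>
In Python pseudocode, the resulting dispatch looks something like this
(all names here are illustrative, not jamulator's actual identifiers):
</p>
<pre>
<code class="language-python"># Prefer a natively compiled block when one exists for the current
# program counter; otherwise interpret one instruction at a time until
# execution lands back on a known block.

def run(pc, native_blocks, prg_rom, interpret_one):
    trace = []
    while pc is not None:
        block = native_blocks.get(pc)
        if block is not None:
            trace.append(("native", pc))
            pc = block()                     # compiled block returns next pc
        else:
            trace.append(("interpret", pc))
            pc = interpret_one(prg_rom, pc)  # emulate a single opcode
    return trace</code>
</pre>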
<p>
Here's a diagram to help clarify:
</p>
<img src="https://superjoe.s3.amazonaws.com/blog-files/jamulator/interpret-block.svg" alt="state machine diagram" class="bright">
<p>
This strategy ensures that NES games will run <em>correctly</em>,
at the cost of efficiency.
Normally, doing something like updating the program counter variable after
every instruction would be optimized away, but this is thwarted by interrupts
in our case.
Because after every instruction we call <code>rom_cycle</code>, which in turn might
call <code>rom_start</code> with an interrupt, all the global variable state must be
correct before we call <code>rom_cycle</code>.
This defeats the entire point of this project.
At this point we might as well emulate.
In fact, with the new <code>Interpret</code> block, we are doing just that.
</p>
<p>
Although, to be fair, I only had to add emulation support for 6 op codes to get
Super Mario Brothers 1 working.
</p>
<h2 id="video">Video: Playing Super Mario Brothers 1 on Native Machine Code</h2>
<iframe src="https://player.vimeo.com/video/481428671" width="640" height="359" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen></iframe>
<p>
In this video I demonstrate my (poor) Super Mario 1 skills in a recompiled executable.
I also demonstrate the movie playback feature that was instrumental in debugging.
</p>
<h2 id="mappers">Unsolved Challenge: Mappers</h2>
<p>
At this point in the project we have Super Mario Brothers 1 running
mostly on native code, although not very highly optimized.
We've learned that static compilation, while possible, is rendered
pointless by some of the inherent challenges that emulating a system presents.
Thus there is no reason to solve the challenge that mappers provide.
</p>
<p>
The only thing I will say about mappers here is that they present
an additional layer of complexity for static disassembly, and they
make it blatantly obvious that Just-In-Time compilation is a better
technique than static recompilation. More on that in the conclusion.
</p>
<h2 id="community-support">Community Support</h2>
<p>
NES seems to be a common system for first-time emulator programmers.
As such, I was happy to find a large swath of documentation online
explaining in great detail how the NES works, emulator tutorials,
<a href="http://www.emulators.com/docs/nx25_nostradamus.htm">fascinating optimization articles</a>,
the invaluable
<a href="http://wiki.nesdev.com/w/index.php/Nesdev_Wiki">Nesdev wiki</a>,
and more.
Even so, there is nothing like asking a question and having a knowledgeable
person answer in real-time.
The folks in #nesdev on EFNet are fun, engaging, working on all kinds
of interesting projects, and helpful.
Thanks especially to Ulfalizer, sherlock_, Bisqwit, Movax12, and
<a href="https://github.com/scottferg/">scottferg</a> for answering
my questions, even when they seemed stupid.
The <a href="http://forums.nesdev.com/">Nesdev Forums</a> are nice as well.
</p>
<p>
There were also instances where I asked for some help with Go -
how to write good code, what is the best way to implement certain things, etc.
#go-nuts on Freenode is great for that. The channel is active - you'll
nearly always get an answer immediately.
</p>
<p>
As for my contributions back to the community, I did help scottferg
<a href="https://github.com/scottferg/Fergulator/commit/feb9b16d785f281f04f7295452d065f856bfc727">fix a bug</a>
in his emulator, which fixed support for
<a href="https://en.wikipedia.org/wiki/Maniac_Mansion">Maniac Mansion</a>.
</p>
<p>
I also filed an
<a href="https://llvm.org/bugs/show_bug.cgi?id=15873">llvm feature request</a>
asking for the ability to generate
comments in IR code. This would make it much, much easier to debug
generated IR code.
</p>
<h2 id="conclusion">Conclusion</h2>
<p>
After completing this project, I believe that static recompilation
does not have a practical application for video game emulation.
It is thwarted by the inability to completely disassemble a game without
executing it, as well as by the fact that multiple systems execute in parallel,
possibly causing interrupts in the game code.
There is a constant struggle between correctness and optimized code.
Nearly all optimizations must be tossed out the window in the interest of
correctness.
Even more compromises would have to be made to start supporting advanced emulator
features such as saving state or rewinding.
</p>
<p>
A comparison could be made between a console game and an interpreted language
such as <a href="https://en.wikipedia.org/wiki/JavaScript">JavaScript</a>.
There are amazingly fast JavaScript engines such as
<a href="https://code.google.com/p/v8/">V8</a>, but they do not
work by statically compiling the script. Instead they use
<a href="https://en.wikipedia.org/wiki/Just-in-time_compilation">Just-In-Time compilation</a>, along with some advanced techniques, to achieve great speeds.
These techniques could be applied to JIT console game compilation.
</p>
<p>
For example, one such technique is to identify a section of code,
make some assumptions based on heuristics which allow for highly
optimized native code generation, and then detect if those assumptions
are broken.
If the assumptions are broken, the generated native code is tossed,
and emulation takes over.
However, if the assumptions are upheld, the recompiled block of code
will execute with blazing fast native speed.
</p>
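<p>
A toy Python sketch of that guard-and-deoptimize idea (purely illustrative;
real JITs work from much richer profiles and recompile rather than
permanently giving up):
</p>
<pre>
<code class="language-python"># "Compile" a block under the assumption that its code is never modified.
# A guard checks the assumption on every run; on failure we discard the
# fast path and fall back to a slow, always-correct interpreter.

class Deopt(Exception):
    pass

def compile_block(memory, addr):
    expected = memory[addr]             # snapshot the opcode we compiled
    def fast(x):
        if memory[addr] != expected:    # guard: does the assumption hold?
            raise Deopt()
        return x * 2                    # the precompiled effect of the block
    return fast

def run_block(memory, addr, x, compiled, slow_interpret):
    try:
        return ("fast", compiled(x))
    except Deopt:
        return ("slow", slow_interpret(memory, addr, x))</code>
</pre>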
<p>
This technique is much more suited to an emulator with a JIT core,
rather than trying to do everything statically, especially since
the emulator can "notice" at runtime which memory addresses are used
as instructions and which memory addresses are used as data.
</p>
<p>
Furthermore, distributing static executables that function as games
would be problematic as far as copyright infringement is concerned.
By keeping ROMs separate from the emulator executable, the emulator
can be distributed freely and easily without risking trouble.
</p>
<p>
That being said, I do feel like this project was worthwhile in that it
was intellectually stimulating and highly effective at teaching me,
and hopefully now you, how the Nintendo Entertainment System works and
how to use LLVM effectively, while introducing a wide range of
problems that compilers and interpreters face.
</p>
]]></description>
      </item>
      <item>
         <title>Rapid Development Email Templates with Node.js</title>
         <pubDate>Wed, 30 Jan 2013 12:00:00 GMT</pubDate>

         <link>https://andrewkelley.me/post/swig-email-templates.html</link>
         <guid>https://andrewkelley.me/post/swig-email-templates.html</guid>
         <description><![CDATA[<h1>Rapid Development Email Templates with Node.js</h1>
<h2>Contents</h2>
<ol>
  <li><a href="#automated-emails-nodejs">Sending Automated Emails in Node.js</a></li>
  <li><a href="#nodeemailtemplates_gotchas">node-email-templates Gotchas</a></li>
  <li><a href="#fundamental_flaws">Fundamental Flaws</a>
    <ol>
      <li><a href="#includes_vs_template_inheritance">Includes VS Template Inheritance</a></li>
      <li><a href="#sharing_css">Sharing CSS</a></li>
      <li><a href="#dummy_context">Dummy Context</a></li>
    </ol>
  </li>
  <li><a href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="automated-emails-nodejs">Sending Automated Emails in Node.js</h2>
<p>When I was tasked with solving the age-old problem of sending automatic
email messages to our users at
<a href="http://indabamusic.com/">Indaba Music</a>, I surveyed the
<a href="http://nodejs.org/">Node.js</a> landscape to find out the state of affairs.
</p>
<p>I was pleased to immediately discover
<a href="https://github.com/niftylettuce/node-email-templates/">node-email-templates</a> and
<a href="https://github.com/andris9/Nodemailer">Nodemailer</a>, and within two days had a
proof of concept email notification server deployed to production.
</p>
<p><strong>node-email-templates</strong> helps organize your project and makes it
easy to render templates for sending via email, using
<a href="https://github.com/LearnBoost/juice">juice</a> to
<a href="http://www.campaignmonitor.com/css/">inline css</a>.
</p>

<p><strong>Nodemailer</strong> does the actual email dispatching -
given an email dispatch service, a subject, a recipient, and a body,
Nodemailer will get your mail to its destination.</p>

<p>Nodemailer is a wonderful piece of software. It worked, and continues to work,
exactly as advertised and without any hiccups. It even has a convenient API that
makes integration with common email dispatchers, such as
<a href="http://sendgrid.com">SendGrid</a> (which I also recommend), quite painless.</p>

<p>Unfortunately this journey was not without a few obstacles that node-email-templates
provided for me to solve.</p>

<p>What follows is an explanation of the bumps along the road that caused me to write
these modules to improve the state of email templates in node.js:</p>

<ul>
<li><a href="https://github.com/andrewrk/boost">boost</a></li>

<li><a href="https://github.com/andrewrk/swig-dummy-context">swig-dummy-context</a></li>

<li><a href="https://github.com/andrewrk/swig-email-templates">swig-email-templates</a></li>
</ul>

<p>More on these in a bit.</p>

<h2 id="nodeemailtemplates_gotchas">node-email-templates Gotchas</h2>

<p>Assume that I have 2 templates named <code>reminder</code> and <code>notice</code>.</p>
<p><strong>node-email-templates</strong> requires your project to have a folder structure that looks like this:</p>

<pre>./templates/reminder/html.ejs
./templates/reminder/text.ejs
./templates/reminder/style.css
./templates/notice/html.ejs
./templates/notice/text.ejs
./templates/notice/style.css
</pre>
<p>There are several problems here, starting with some module smells:</p>

<ul>
  <li>
  Regardless of whether or not you need a text version of a template, you must have the
  <code>text.ejs</code> file there or node-email-templates will throw an error.
  </li>
  <li>
  This project is poorly maintained. As of this writing,
  <a href="https://github.com/daeq">daeq</a> submitted a
  <a href="https://github.com/niftylettuce/node-email-templates/pull/17">pull request</a>
  to fix the above problem 3 months ago, and despite a promise to merge it,
  <a href="https://github.com/niftylettuce/">niftylettuce</a> still has not done so.
  </li>
</ul>

<p>But there are also some fundamental problems with the approach that the module takes.</p>

<h2 id="fundamental_flaws">Fundamental Flaws</h2>

<ul>
  <li>
  This flavor of ejs is limited. You can do includes, but not layouts or
  <a href="https://docs.djangoproject.com/en/dev/topics/templates/#template-inheritance">template inheritance</a>,
  which is where the true value of using templates comes in.
  </li>
  <li>
  The html templates have <em>no way</em> of sharing css between them.
  </li>
  <li>
  Because ejs depends on <code>eval</code>, it is impossible,
  given a template, to create a dummy context with which to generate a preview
  of the template. More on this later.
  </li>
</ul>

<h3 id="includes_vs_template_inheritance">Includes VS Template Inheritance</h3>

<p>To demonstrate the template inheritance problem, let me give you 2 versions
of a template, one using ejs with includes, and one using
<a href="https://github.com/paularmstrong/swig/">swig</a> with template inheritance:</p>

<h4 id="ejs_includes">ejs includes</h4>

<h5 id="noticeejs">notice.ejs</h5>
<pre>
<code class="language-markup">&lt;% include header %&gt;
&lt;div&gt;
  &lt;p&gt;Hey &lt;%= username %&gt;,&lt;/p&gt;
  &lt;p&gt;This is a notice that your offer is about to expire.&lt;/p&gt;
&lt;/div&gt;
&lt;% include footer %&gt;</code>
</pre>
<h5 id="reminderejs">reminder.ejs</h5>
<pre>
<code class="language-markup">&lt;% include header %&gt;
&lt;div&gt;
  &lt;p&gt;Hey &lt;%= username %&gt;,&lt;/p&gt;
  &lt;p&gt;Don't forget! You probably wanted to do that thing.&lt;/p&gt;
&lt;/div&gt;
&lt;% include footer %&gt;</code>
</pre>
<h5 id="headerejs">header.ejs</h5>
<pre>
<code class="language-markup">&lt;div&gt;
  &lt;img src="logo.png"&gt;
&lt;/div&gt;</code>
</pre>
<h5 id="footerejs">footer.ejs</h5>
<pre>
<code class="language-markup">&lt;div&gt;
  Super Cool &amp;amp; Co., LLC.
  &lt;a&gt;Privacy Policy&lt;/a&gt;
&lt;/div&gt;</code>
</pre>
<h4 id="swig_template_inheritance">swig template inheritance</h4>

<h5 id="reminderhtml">reminder.html</h5>
<pre>
<code class="language-markup">{% extends "base.html" %}

{% block content %}
  &lt;p&gt;Don't forget! You probably wanted to do that thing.&lt;/p&gt;
{% endblock %}</code>
</pre>
<h5 id="noticehtml">notice.html</h5>
<pre>
<code class="language-markup">{% extends "base.html" %}

{% block content %}
  &lt;p&gt;This is a notice that your offer is about to expire.&lt;/p&gt;
{% endblock %}</code>
</pre>
<h5 id="basehtml">base.html</h5>
<pre>
<code class="language-markup">&lt;div&gt;
  &lt;img src="logo.png"&gt;
&lt;/div&gt;
&lt;div&gt;
  &lt;p&gt;Hey {{ username }},&lt;/p&gt;
  {% block content %}
  {% endblock %}
&lt;/div&gt;
&lt;div&gt;
  Super Cool &amp;amp; Co., LLC.
  &lt;a&gt;Privacy Policy&lt;/a&gt;
&lt;/div&gt;</code>
</pre>
<h4 id="template_inheritance_wins">Template Inheritance Wins</h4>

<p>This is an oversimplified example, but even so it starts to become
obvious why template inheritance is superior to includes.</p>

<p>You can also have includes in swig, by the way.</p>

<h3 id="sharing_css">Sharing CSS</h3>

<p><strong>node-email-templates</strong> uses
<a href="https://github.com/LearnBoost/juice">juice</a> to inline css.
Give juice html and css, and it returns html with the css inlined on each element for
<a href="http://www.campaignmonitor.com/css/">maximum email client compatibility</a>.</p>

<p>This setup seems good at first, but it is crippled by the fact that templates
are completely unable to share css.
Each template has its own independent <code>style.css</code> file.</p>

<p>It is not <strong>node-email-templates</strong>'s fault.
Given the way that juice works, it isn't really possible to share css.</p>

<p>This is where
<a href="https://github.com/andrewrk/boost">boost</a> comes in.
<strong>boost</strong> depends on juice and adds 2 key features that make sharing CSS possible.</p>

<ul>
  <li>Ability for the html to have
  <code class="language-markup">&lt;link rel=&quot;stylesheet&quot;&gt;</code>
  tags which are resolved correctly and have the resulting CSS applied.
  </li>
  <li>Ability to have
  <code class="language-markup">&lt;style&gt;...&lt;/style&gt;</code>
  elements and have that CSS applied as well.
  </li>
</ul>

<p>When you combine this capability with template inheritance, sharing CSS becomes a solved problem.
For example:</p>

<h4 id="basehtml">base.html</h4>
<pre>
<code class="language-markup">&lt;html&gt;
&lt;head&gt;
  {% block css %}
    &lt;link rel="stylesheet" href="base.css"&gt;
  {% endblock %}
&lt;/head&gt;
&lt;body&gt;
  {% block content %}
  {% endblock %}
&lt;/body&gt;
&lt;/html&gt;</code>
</pre>
<h4 id="reminderhtml">reminder.html</h4>
<pre>
<code class="language-markup">{% extends "base.html" %}

{% block css %}
  {% parent %}
  &lt;link rel="stylesheet" href="reminder.css"&gt;
{% endblock %}

{% block content %}
  &lt;h1&gt;Reminder&lt;/h1&gt;
  &lt;p&gt;Don't forget!&lt;/p&gt;
{% endblock %}</code>
</pre>
<h4 id="noticehtml">notice.html</h4>
<pre>
<code class="language-markup">{% extends "base.html" %}

{% block css %}
  {% parent %}
  &lt;link rel="stylesheet" href="notice.css"&gt;
{% endblock %}

{% block content %}
  &lt;h1&gt;Notice&lt;/h1&gt;
  &lt;p&gt;This is a notice.&lt;/p&gt;
{% endblock %}</code>
</pre>
<h3 id="dummy_context">Dummy Context</h3>

<p>In order to rapidly build email templates, we need to see what we are building as we build it.</p>

<p>Because you need to supply a context in order to render a template and see it,
seeing what you are doing while you build templates becomes a two-step process.</p>

<p>We can remove this extra step by taking advantage of
<a href="https://github.com/andrewrk/swig-dummy-context">swig-dummy-context</a>,
a module I wrote which, given a swig template, gives you a "dummy" context -
a fill-in-the-blank structure you can use to immediately preview your template.</p>

<p>Given:</p>
<pre>
<code class="language-markup">&lt;div&gt;
  {{ description }}
&lt;/div&gt;
{% if articles %}
  &lt;ul&gt;
  {% for article in articles %}
    &lt;li&gt;{{ article.name }}&lt;/li&gt;
  {% endfor %}
  &lt;/ul&gt;
{% else %}
  &lt;p&gt;{{ defaultText }}&lt;/p&gt;
{% endif %}</code>
</pre>
<p>swig-dummy-context produces:</p>
<pre>
<code class="language-javascript">{
  "description": "description",
  "articles": {
    "name": "name"
  },
  "defaultText": "defaultText"
}</code>
</pre>
<p>And if you render the template with the generated dummy context, you get:</p>
<pre>
<code class="language-markup">&lt;div&gt;
  description
&lt;/div&gt;
&lt;ul&gt;
  &lt;li&gt;name&lt;/li&gt;
&lt;/ul&gt;</code>
</pre>
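<p>
The core idea can be sketched in a few lines of Python. The real module is
written in JavaScript and walks swig's parse tree; this regex version is a
rough approximation that only handles flat <code>{{ name }}</code> and
<code>{{ object.field }}</code> tags, not loops or branches:
</p>
<pre>
<code class="language-python">import re

# Rough sketch of the swig-dummy-context idea: build a fill-in-the-blank
# context where each variable previews as its own name.

def dummy_context(template):
    ctx = {}
    for name in re.findall(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}", template):
        if "." in name:
            top, sub = name.split(".", 1)
            ctx.setdefault(top, {})
            if isinstance(ctx[top], dict):
                ctx[top][sub] = sub    # leaf value is the field name
        else:
            ctx.setdefault(name, name)
    return ctx</code>
</pre>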
<h2 id="conclusion">Conclusion</h2>

<p><a href="https://github.com/andrewrk/swig-email-templates">swig-email-templates</a>
gives you all the ingredients you need to build well-organized templates and gives you the
tooling that you need to build a live preview tool.</p>

<p>At Indaba Music we have such a tool. It lets you select a template from the templates folder
and fill in the substitutions to preview how an email will look. To be extra sure of how an email
will render, you can use the tool to send a test email to your email address.</p>

<p>This tool is currently private as it is not decoupled from SendGrid or even polished up
for 3rd party use at all; however if there is sufficient interest I may open source it.</p>
]]></description>
      </item>
      <item>
         <title>How to be Successful at PyWeek</title>
         <pubDate>Sun, 07 Aug 2011 12:00:00 GMT</pubDate>

         <link>https://andrewkelley.me/post/pyweek-success.html</link>
         <guid>https://andrewkelley.me/post/pyweek-success.html</guid>
         <description><![CDATA[<h1>How to be Successful at PyWeek</h1>
<p>A short series to help you win. Covers these topics:</p>
<ol>
<li><a href="#intro">What PyWeek is, and Why You Should Participate</a></li>
<li><a href="#python">Video Games in Python</a></li>
<li><a href="#speed">Speed Developing</a></li>
<li><a href="#game">Game Theory - How to Create Fun</a></li>
<li><a href="#community">Getting the Most Out of the Community</a></li>
<li><a href="#judging">The Psychology of Judging</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="intro">What PyWeek is, and Why You Should Participate</h2>
<p><a href="http://pyweek.org/">PyWeek</a> is a challenge. It asks you to spend one week of your life developing a new video game, using <a href="http://python.org/">Python</a>.</p>

<p>PyWeek is difficult. It is not easy to create a new video game in just one week, especially one that is fun, innovative, and polished. But it <em>is</em> fun to try, especially with peers from all around the world working alongside you toward the same goal.</p>

<p>PyWeek will strengthen you as a game developer. If you go solo it will strengthen you as a generalist, and if you join a team it will strengthen you as a team member.</p>

<p>As of this writing, the <a href="http://www.pyweek.org/13/">next competition</a> is the week of <strong>September 11, 2011</strong> to <strong>September 18, 2011</strong>. Registration opens August 12, 2011.</p>

<p>You can check out the competition's <a href="http://pyweek.org/s/rules/">Rules</a> and <a href="http://pyweek.org/s/help/">FAQ</a>.
</p>
<h2 id="python">Video Games in Python</h2>
<p>Python is a great language for rapid video game development:</p>
<ul>
<li>Large selection of libraries to use, for example <a href="http://pygame.org/">PyGame</a> and <a href="http://www.pyglet.org/">Pyglet</a></li>
<li>Python is known for being a prototyping language due to its ease of use.</li>
<li>It is simple to create arbitrary one-off data structures that you often need in game programming.</li>
<li>You can use objects and inheritance if you need it.</li>
<li>Cross platform</li>
</ul>
<p>There are some drawbacks though:</p>
<ul>
<li>Deployment is cumbersome. For PyWeek this isn't too much of an issue since everyone will run from source, but for an official game, it would be an issue.</li>
<li>Code is about 10x less efficient than in C or C++. This will not matter for graphics on screen, since you will likely use a game library which uses the GPU for graphics, but it might make your main loop or physics slow.</li>
</ul>
<p>Just keep these things in mind when developing.</p>
<h2 id="speed">Speed Developing</h2>
<p>The biggest thing preventing you from success during PyWeek is lack of time. Therefore, the more time you conserve, and the more time-consuming things you can do <em>before</em> the development week, the better chances you have for success.</p>
<h3>Allocate as much time as possible.</h3>
<p><a href="http://pyweek.org/">PyWeek</a> is inspired by the <a href="http://www.ludumdare.com/compo/">Ludum Dare</a> competition<a href="http://pyweek.org/s/help/#what-s-pyweek-all-about">[1]</a>, which only lasts 48 hours. In that competition, you pretty much have to use all 48 hours if you want to be successful. PyWeek is one week long to make it easier on people who have commitments to things other than the Internet. Regardless, <b>your success will be directly proportional to the amount of time you spend</b> on PyWeek.</p>
<ul>
<li>Get ahead at work and/or school so that you can spend as little time on them as possible.</li>
<li>Make sure your significant other knows you don't want to be bothered this particular week. You'll make up for it later.</li>
<li>Remember not to agree to go to any social activities that will take place during PyWeek. If someone asks you to hang out, try to change the time to before or after.</li>
</ul>
<h3>Use version control.</h3>
<p>You may be tempted to forgo version control in the interest of time. One of two things is true. Either, </p>
<ol>
<li>You are new to version control, and you honestly would spend more time struggling with the version control software than developing your game, or </li>
<li>You are comfortable with version control, but think that you can save time by not using it since this is a speed-developing contest.</li>
</ol>
<p>In the first case, you should use version control anyway, because that skill should take priority over game development, and this is a great opportunity to learn. <a href="https://github.com/">GitHub</a> makes it dead simple if you are okay with making your project open source.</p>
<p>In the second case, let me assure you that version control will actually save you time:</p>
<ul>
<li>When you're tired and have forgotten where you left off, you can use the diff tool to quickly see what you were working on.</li>
<li>You can spend a few minutes working on a feature or experimenting with a different style of gameplay to see if it's fun. If not, it's one command to scrap it and try again.</li>
<li>If you're used to using version control and you forgo it for the competition, it will throw off your groove to not use it, messing with your psychology in a time-wasting manner.</li>
</ul>
<h3>Practice beforehand.</h3>
<p>Development is all about solving problems.</p>
<p>There are some problems you are bound to run into that are simply a function of Python, the game library you are relying on, and the fact that you are writing a game. You would run into these problems no matter what game you decided to write.</p>
<p>Therefore, it makes sense to solve these problems before the actual competition. Practice writing a simple game engine with the library of your choice before the competition. You will run into and solve some issues that you would have otherwise had to waste time solving during the competition.</p>
<p>Examples of things you may want to practice:</p>
<ul>
<li>Basic game engine / main loop</li>
<li>Displaying graphics</li>
<li>Animating things on the screen</li>
<li>Responding to user input</li>
<li>Sound playback</li>
<li>Transitioning state between a title screen or menu system and gameplay.</li>
<li>3D graphics</li>
<li>Networking</li>
<li>Deploying on Windows, Mac, and Linux</li>
</ul>
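<p>
As a concrete starting point for the first item, here is a minimal
fixed-timestep main loop in plain Python. PyGame or Pyglet would supply the
real clock, events, and drawing; the names here are mine:
</p>
<pre>
<code class="language-python">import time

# Fixed-timestep loop: update() runs at a steady simulation rate even when
# render() is slow. A real game would loop until quit instead of counting
# frames.

def run_game(update, render, fps, max_frames):
    dt = 1.0 / fps
    accumulator = 0.0
    previous = time.monotonic()
    for _ in range(max_frames):
        now = time.monotonic()
        accumulator += now - previous
        previous = now
        steps = int(accumulator / dt)   # how many updates we owe
        for _ in range(steps):
            update(dt)                  # fixed-size simulation step
        accumulator -= steps * dt
        render()                        # draw as often as we can
    return max_frames</code>
</pre>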
<p>Practice using the tools you are going to use as well. Before the competition starts, you should feel comfortable creating assets, such as:</p>
<ul>
<li>Art</li>
<li>Animations</li>
<li>Sound effects</li>
<li>Music</li>
</ul>
<p>Often you can make at least half of your sound effects using only a microphone, <a href="http://audacity.sourceforge.net/">Audacity</a>, and some creativity.</p>
<p>PyWeek allows you to use artwork, music, and sounds whose copyrights explicitly allow you to use them. If you are going to use other people's artwork, find the databases and websites where you are going to get it from and make sure the citation process is efficient and will work.</p>
<h3>Have your game idea ready before the theme is decided.</h3>
<p>
You have one week between finding out what the 5 possible themes are, and the start of development. Use this week to <strong>brainstorm several game ideas for each theme</strong>, before you even know which one will win.
</p>
<p>Brainstorming is a lengthy process. Sometimes it can help to give yourself fake limitations in order to get your brain to think of ideas. For each possible theme, try to think of a game that fits each category:</p>
<ul>
<li>Action</li>
<li>Adventure</li>
<li>Puzzle</li>
<li>Role-Playing</li>
<li>Simulation</li>
<li>Other - a crazy wacky idea that probably will be stupid</li>
</ul>
<p>Sometimes trying to think of an idea for one category will make you think of an idea for another category. That's great, write it down! And sometimes trying to think of an idea for one theme will make you think of an idea for a different theme. That's great, write it down! At this point you want as many ideas as possible.</p>

<p>A friend or roommate can be very useful when brainstorming ideas. Bouncing ideas off each other is a great way to come up with material.</p>

<p>Once the theme is decided, you can often pick your best idea, regardless of what theme you thought of it for, and adapt it to fit the real theme. My winning entry, <a href="http://www.pyweek.org/e/superjoe/">Lemming</a>, was originally an idea inspired by the "Fry Cook on Venus" theme, but I adapted it to the "Nine Times" theme when it was chosen. <a href="https://github.com/andrewrk/lemming/blob/master/ideas/theme-brainstorming">[2]</a></p>

<h3>Rely on other projects as much as possible.</h3>
<p>As a general rule, do not try to write a level editor during PyWeek. You need as much time as possible to spend on the game itself. Consider using <a href="http://www.mapeditor.org/">Tiled</a> and <a href="https://code.google.com/p/pytmxloader/">DR0ID's tiledtmxloader.</a></p>
<h3>Get enough sleep.</h3>
<p>Your body needs sleep. You cannot cheat around that fact. The longer you sleep-deprive yourself, the less clearly you will think, the more mistakes you will make, and the more slowly you will work.</p>
<p>When you rest, you give your unconscious mind a chance to sort things out and surface solutions and ideas that you wouldn't have thought of consciously.</p>
<p>Sleep is directly linked to creativity.<a href="http://en.wikipedia.org/wiki/Sleep_and_creativity">[3]</a></p>
<h2 id="game">Game Theory - How to Create Fun</h2>
<p>It is easy to start programming and create a simple platformer, without thinking about what makes platformers fun. It is easy, but not nearly as rewarding as it is to actually think about <a href="http://en.wikipedia.org/wiki/Game_theory">Game Theory</a> - a fascinating subject that philosophers have been thinking about for a very long time.</p>
<h3>Think about game theory.</h3>
<p>If you have never read that Wikipedia page, do. Be sure to check out the simple example, <a href="http://en.wikipedia.org/wiki/Prisoner's_dilemma">Prisoner's Dilemma</a>, and especially the <a href="http://en.wikipedia.org/wiki/Prisoner%27s_dilemma#See_also">See Also</a> section of that page for other great examples.
</p>
<p>In short, you should be constantly asking yourself these questions while designing your game:</p>
<ul>
<li>What do I want the player to try to do?</li>
<li>What is the player actually motivated to try to do?</li>
<li>Is what I want the player to try to do the same as what the player is actually motivated to try to do? Why or why not, exactly? What elements naturally drive the player to act in a certain way?</li>
<li>Is what I want the player to try to do fun? What exactly makes it so, or not so?</li>
<li>Are the player's short term goals conflicting with the player's long term goals? (This is one way to create good gameplay.)</li>
<li>What compels the player to keep playing?</li>
<li>Is the player compelled to keep playing compulsively, or because they are having fun?</li>
<li>Do I want the experience of playing to be the same every time, or have random elements that change up the gameplay?</li>
<li>Are there any parts of the gameplay that are definitely <em>not</em> fun?</li>
<li>Is it possible to make it fun to lose?</li>
<li>What forces is the player working against?</li>
<li>Does the player feel like the forces working against him are acting fairly?</li>
</ul>
<p>There is a plethora of reading (and watching) material on creating interesting gameplay. Here are some materials to get you started thinking about game design:</p>
<ul>
<li><a href="http://penny-arcade.com/patv/episode/the-skinner-box">Extra Credits: The Skinner Box</a></li>
<li><a href="http://penny-arcade.com/patv/episode/narrative-mechanics">Extra Credits: Narrative Mechanics</a></li>
</ul>
<h3>Get your game into a playable state as soon as possible.</h3>
<p>Developing a game is an evolutionary process. The sooner you have a playable game, the sooner you can begin to morph it and mold it into a better and better game.</p>
<p>Ask yourself if it is fun. Try to figure out what makes it fun, and expand on those elements of gameplay.</p>
<p>Try out your gameplay on friends and see what they do. Try to not talk to them while they are playing. Figure out what parts your friends get frustrated at and change those parts.</p>
<h3>Sketch your levels on paper before coding anything.</h3>
<p>This will give you an idea of the world you're trying to create, giving you focus and direction when you code.</p>
<h2 id="community">Getting the Most Out of the Community</h2>
<p>PyWeek is only as fun as the people who participate. After all, there are no prizes, no reason to compete, other than for personal development and community interaction.</p>
<p>You get what you give.</p>
<h3>Keep a journal.</h3>
<p>One great way to participate is to <strong>write a diary entry after each day of work</strong>. Write about:</p>
<ul>
<li>Any progress made</li>
<li>Things that are impeding progress</li>
<li>Lessons learned that you want to remember for next time</li>
<li>Teaser screenshots, art, videos, sounds, or music</li>
<li>Decisions that you are pondering about in your game design</li>
</ul>
<p>It is fun for others to follow along with your progress and for you to follow along with others. It is customary to be positive, uplifting, and encouraging in response to diary entries, but don't be afraid to tactfully include constructive criticism. On the flip side, be ready to accept critique, using it to improve your design.</p>
<h3>Join the IRC channel.</h3>
<p>You can feel the camaraderie by hanging out in #pyweek on Freenode.</p>
<p>Live chat is personal and you can get to know the other participants fairly well. This place is especially fun during judging time. At one point we broke out into a <a href="http://www.pyweek.org/d/3946/">spontaneous one-hour game development competition.</a> There is almost always someone ready to give instant feedback about a game design decision or to give tips on development in general.</p>
<p>One thing you want to be careful of, however, is spending too much time distracted by IRC during development time. You should be spending almost all your development time developing.</p>
<h3>Follow along with the forum discussion.</h3>
<p>The <a href="http://www.pyweek.org/messages/">forum</a> is the main communication facility of PyWeek. While it may take longer to get a response than on IRC, this is a great place to get questions answered and to participate in the community at large.</p>
<p>Diary entries are posted to the forum so that everyone sees them and can comment on them.</p>
<h3>Play and judge as many games as you can.</h3>
<p>As developers, we thrive on feedback. When it is time to judge, give as much valuable and honest feedback as you can to as many games as you can stand.</p>
<p>Giving a silly award can be a fun way to give praise to a weird aspect of a game.</p>
<h2 id="judging">The Psychology of Judging</h2>
<p>Realize that in the end you are creating a demo. Your project is not going to be a polished, ready-to-ship game in one week. Your goal is to <strong>make sure judges play your entire game experience and do not get distracted by nuances</strong>.</p>
<h3>Test your game on several people.</h3>
<p>Find a few friends who have not been following along with your development and shove the game in front of their face. Set them up with the title screen, but say nothing else.</p>

<p>Watch their face, watch their hands, and watch the screen. You are now getting an accurate preview of how judges will react to your game. Pay attention to what parts of your gameplay they don't understand. Notice when they ignore instructions you didn't want them to. Take note of what parts they get stuck at. All of these things are what you need to work on. If you don't, this is everything that judges will talk about in your game's feedback.</p>

<p>One mistake I made was testing my game on someone who is exceptionally good at video games. He had little trouble with the levels where the real judges simply got stuck and gave up.</p>

<p>Remember that judges have hundreds of games to play. If at any time during your game they get stuck, they are likely to shrug and move on to a different game.</p>

<h3>The first few minutes of your game should be ridiculously easy.</h3>
<p>It is critical that you <em>gradually</em> ramp up the difficulty, starting off stupidly easy.</p>
<p>Nobody likes a game that treats the player like an idiot. However, a player is far more willing to tolerate a Level 1 they can pass without thinking than they are to disassemble a masterpiece puzzle at the very beginning. Faced with the latter, judges are much more likely to give up and move on.</p>
<p>As long as each new section, or level, is at least a <em>little tiny bit</em> more challenging than the last, the player will be intrigued and want to keep playing.</p>
<p>There is a time and a place for ridiculously hard challenging bonus levels, and that time and place is after the player has beaten the normal game, as a reward for their success. The test level that you use for 90% of your debugging will soon become boring and easy to you, but ridiculously hard for a new player. It has no place in the normal game; it should be in bonus-land.</p>
<p>Make stupidly easy levels come first, and then <strong>slowly introduce your more complex gameplay elements as the player progresses and gains confidence.</strong></p>

<p>Again, realize that your PyWeek game is essentially a demo. You might want to include a level select on the title screen, so that a frustrated judge can skip levels that were hard in ways that you did not expect. The easier you make it for fellow PyWeekers to experience the fullness of your creation, the better feedback you will get.</p>
<h3>Post a walkthrough with your final submission.</h3>
<p>Once again, the goal here is to get each judge to experience all of your content, <em>not</em> to stump them with your enigmatic gameplay.</p>
<p>Post a clear and detailed walkthrough that explains how to get through and beat each level.</p>
<h3>Reading instructions is boring.</h3>
<p>No matter how complex your gameplay is, there is no excuse for making your players read a wall of text.</p>
<p>There should be no instructions necessary. The player should be able to learn the gameplay by immediately beginning to play, and then being gradually given small bits of learning to digest. Tutorial levels are highly encouraged.</p>
<p>Remember that one of the categories you are being rated on is Fun<a href="http://pyweek.org/s/rules/#entries-are-judged-by-peers">[4]</a>, and reading instructions is not fun.</p>

<h3>Remember that not everyone has your setup.</h3>
<h4>Some people use alternate keyboard layouts.</h4>
<p>In PyWeek 12, 11% of participants used a keyboard other than QWERTY.<a href="https://spreadsheets.google.com/viewanalytics?hl=en_GB&amp;formkey=dEIxVS1SRTdYTHZKbWJiTjdSS3FUYWc6MA">[5]</a></p>
<p>Be polite to alternate keyboard users. Defaulting to WASD is acceptable, but the player should be able to remap the keys.</p>
<p>A config file is an acceptable method for remapping keys, as long as it is obvious how to do it without a lot of instruction-reading.</p>
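<p>As an illustration, key remapping via a plain-text config file can be only a few lines of Python. This is a hypothetical sketch, not from any particular PyWeek entry; the file format and the default bindings are invented for the example:</p>

```python
# Hypothetical key-remapping sketch: read lines like "left = a" from a
# plain-text config file so non-QWERTY players can rebind controls.
DEFAULT_BINDINGS = {"left": "a", "right": "d", "up": "w", "down": "s"}

def load_bindings(config_text):
    """Parse "action = key" lines; '#' starts a comment, unknown
    actions are ignored, and missing actions keep their defaults."""
    bindings = dict(DEFAULT_BINDINGS)
    for line in config_text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip comments and blanks
        if not line:
            continue
        action, _, key = line.partition("=")
        action, key = action.strip(), key.strip()
        if action in bindings and key:
            bindings[action] = key
    return bindings
```

<p>A Dvorak player could then drop a few <code>left = ,</code> style overrides into the file without ever touching the code.</p>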
<p>However, alternate keyboard users are often proactive, and willing to put forth a bit of extra effort to compensate for their questionable life choices*. If you are running low on development time, it would be better to focus on gameplay rather than being nice to alternate keyboard users.</p><p><small>*Note: I am a Dvorak user.</small></p>
<h4>Many file systems are case sensitive.</h4>
<p>Time to pick on Windows users.</p>
<p>It is <strong>unacceptable</strong> to rely on your file system's case insensitivity in your code. It is a huge pain to rename all the media files in someone else's project and then grep through their code for the filenames, all in an effort to get the program not to crash.</p>
<p>Make sure you <strong>use the same case in your filenames as you do in your code</strong>.</p>
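<p>A pre-submission check can catch this automatically. The sketch below is hypothetical (the function name and approach are my own, not part of any PyWeek tooling); it verifies that a path matches the on-disk case exactly, which <code>os.path.exists()</code> alone will not do on a case-insensitive file system:</p>

```python
import os

def matches_disk_case(path):
    """Return True only if every component of `path` exists with
    exactly this case. On Windows (and default macOS volumes),
    os.path.exists() happily accepts "Player.PNG" for a file saved
    as "player.png"; comparing each component against os.listdir()
    catches the mismatch before a Linux judge hits a crash."""
    path = os.path.abspath(path)
    parts = []
    while True:
        head, tail = os.path.split(path)
        if not tail:
            parts.append(head)  # filesystem root
            break
        parts.append(tail)
        path = head
    parts.reverse()
    current = parts[0]
    for part in parts[1:]:
        if part not in os.listdir(current):
            return False
        current = os.path.join(current, part)
    return True
```

<p>Run it over each asset filename referenced in your code before you upload.</p>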
<h4>Everyone's screen situation is different.</h4>
<p>The simplest thing to do is use a fixed-size windowed application - that way you don't have to account for different screen resolutions and aspect ratios.</p>
<p>Some people consider it rude for games to go fullscreen without explicit instruction.</p>
<h4>Do not make a tarbomb.</h4><p>When you create a tar file, everything should be inside a single folder named after your game and its version number.</p>
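<p>With Python's standard <code>tarfile</code> module, rooting everything under one folder takes a single <code>arcname</code> argument. A minimal sketch, with an illustrative function and naming scheme of my own:</p>

```python
import tarfile

def make_release(src_dir, name_version):
    """Pack src_dir into "<name_version>.tar.gz", with every entry
    rooted under a single "<name_version>/" folder so that extracting
    the archive never scatters files into the current directory."""
    archive = name_version + ".tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname replaces the on-disk path inside the archive,
        # and tar.add() recurses into the directory by default
        tar.add(src_dir, arcname=name_version)
    return archive
```

<p>For example, <code>make_release("work", "mygame-1.0")</code> produces a <code>mygame-1.0.tar.gz</code> that unpacks into a single <code>mygame-1.0/</code> folder.</p>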
<h3>Test your final submission.</h3>
<p>Once you upload your final submission to the website, immediately download it again into a new folder and try to play it. This ensures that you didn't make any stupid mistakes.</p>
<p>Test on Windows, Mac, and Linux. Get a friend to test it for you if you don't have direct access to one of these operating systems. Linux is free, so there is no excuse for not testing on <a href="http://www.ubuntu.com/">Ubuntu</a>.</p>
<p>You would be amazed at how many people forget to include their media folder, or make some trivial mistake that renders their entire game unplayable.</p>


<h2 id="conclusion">Conclusion</h2>
<p>
I wrote this article selfishly. I feel compelled to play every contestant's game, and I want the games that I play to be fun. Hopefully this article is helpful enough to raise the standard of PyWeek.
</p>
]]></description>
      </item>
      <item>
         <title>John Muir Trail from the Perspective of Andrew Kelley</title>
         <pubDate>Thu, 04 Aug 2011 12:00:00 GMT</pubDate>

         <link>https://andrewkelley.me/post/jmt.html</link>
         <guid>https://andrewkelley.me/post/jmt.html</guid>
         <description><![CDATA[<h1>John Muir Trail from the Perspective of Andrew Kelley</h1>
<p class="lead">198 miles - July 24, 2011 through August 4, 2011</p>
<h2>Journal Entries</h2>
<ol>
  <li><a href="#day1">Day 1 - First Day of Hiking</a></li>
  <li><a href="#day2">Day 2 - Donahue Pass to Some Lake</a></li>
  <li><a href="#day3">Day 3 - Some Lake to Deer Creek Crossing</a></li>
  <li><a href="#day4">Day 4 - Halfway to Duck Lake to Mono Creek</a></li>
  <li><a href="#day5">Day 5 - Mono Creek to Senger Creek</a></li>
  <li><a href="#day6">Day 6 - Senger Creek to Something Meadow</a></li>
  <li><a href="#day7">Day 7 - Muir Pass</a></li>
  <li><a href="#day8">Day 8 - Mather Pass, Almost</a></li>
  <li><a href="#day9">Day 9 - The "Matherhorn" Pass and Pinchot Pass</a></li>
  <li><a href="#day10">Day 10 - Glen Pass - 15 miles</a></li>
  <li><a href="#day11">Day 11 - Forrester Pass - 13 miles</a></li>
  <li><a href="#day12">Day 12 - Mt. Whitney - 22 miles</a></li>
</ol>
<h2 id="day1">Day 1 - First Day of Hiking</h2>
<p class="lead">July 24, 2011</p>
<p>
I woke up to a cool and crisp morning. The tent Dad and I brought is extremely fast to construct and tear down, lightweight, and just the right size for the two of us. Dad crowded me a bit but it didn't bother me at all. He's an old man; he needs his space.
</p><p>
Mike, Paul, Dad, and I took our time packing up camp. We had 5 hours to kill waiting for Bruce and Chris to arrive. I tried organizing my pack a different way, which made it easier to carry, but made my food less accessible.
</p><p>
When we left camp, Dad grumbled something about not paying the $5 / person fee for staying in the campgrounds. I told him I didn't understand how he thought that was morally different than cracking your neighbor's DirecTV service, a reference to a discussion we had had earlier. He said that he thought I was right, so we walked back and paid the $20 for the 4 of us.
</p><p>
We drove the rental car just down the road to a convenience store, where I purchased a sewing kit for my sleeping bag and sunglasses for the snowy passes that lay ahead. Just then, Bruce and Chris walked in. It was fun to see them, not expecting them for another four and a half hours.
</p><p>
Dad and I called the rental place and found out that the nearest drop off was not 6 miles away as we thought, but rather 50 minutes away. Dad and Mike drove the rental car and Bruce's PT Cruiser down and came back in the PT Cruiser 2 hours later.
</p><p>
Soon we were at the trail head taking the obligatory photo, and then on our way. The trail ran alongside a beautiful creek/river with dozens of shallow crossings. My new boots worked like champions, water and mud alike sliding off the top of the boots, leaving them looking spotless. Pretty cool.
</p><p>
After a short food and water break, intense uphill, and a short but painful case of the runs, I arrived at the beautiful snowy valley where we are camped. After hot dinner thanks to Bruce's stove, Paul, Mike, Dad, and I played 3 rounds of hearts. Now I'm about to undress and sleep until first light.
</p><p>
These guys are such great company. Courteous and caring, yet comfortable with crude humor. I'm looking forward to the next 13 days.
</p>

<h2 id="day2">Day 2 - Donahue Pass to Some Lake</h2>
<p class="lead">July 25, 2011</p>
<p>
I tossed and turned and dreamt all night. I dreamt that I wasn't being faithful to my girlfriend, and that I turned gay for a bit, accidentally.
</p><p>
I woke to Dad gently shaking my leg. Looking around I saw that everybody was just beginning to get up. I ended up being one of the first ones packed, so I scouted the river we would soon be crossing. Soon, we were on our way. With my belly full of oatmeal, and my legs fresh from sleep, I felt good.
</p><p>
Donahue Pass was somewhat tricky as far as staying on the trail. Although it was a beautiful sunny morning, there was still a significant amount of snow remaining.
</p><p>
The hike went on for a long, long time. I am happiest when debating or talking with someone. Bruce is very intelligent but his wordiness makes it difficult to really build an argument against him. On the other hand, wordiness is great in the context of killing time. Mike is straight out obstinate. He told me that I couldn't prove that science could handle morality in a logical manner. When I asked him to define "morality," he would not, instead telling me to define it, and then refusing to accept my definition. Every time I finally was beginning to make some progress in the discussion, he would change the subject. He would answer direct questions not with a definite answer, but by a lengthy allegory that was barely relevant.
</p><p>
The views that we witness are breathtaking. At least I'm sure they would be if I actually appreciated beauty. Still, I get pleasure from pointing out views and wildlife and watching my companions' reactions.
</p><p>
Toward the end of the hike we met 2 guys who knew the Ridenours through Katelin. How weird is that?
</p><p>
I enjoy the implicit camaraderie that nearly every hiker feels. Warm welcomes, trading of information, and sometimes even sharing food is common with the hikers we meet.
</p><p>
Dad asked about Sara today. She's been on my mind off and on this whole trip. It's weird to have a successful first date with someone I seem to "click" with, and then immediately leave town for 2 weeks. Oh well. Hopefully she'll still be interested in getting together when I get back.
</p><p>
The last 4 miles of the trail hurt. I forced achy tired muscles uphill, downhill, uphill, downhill, and then more up. Everybody agreed the uphill is killer. Nothing is worse than switchbacks. 
</p><p>
At this point we were near a lake, with moderate mosquitoes and ants. It was our target destination, and a decent camp site. I voted to stop, but the guys wanted to go another mile uphill before stopping. I bit the bullet and joined the march. After finally arriving at the new destination, we quickly realized that this marshy, mosquito-infested, snowy area would make for a miserable night. We backtracked 1/3 of the way back to our original destination, to a camp site I remembered seeing. It is a good spot, but with another nearby lake, it is thick with mosquitoes. Not as many as the marsh though.
</p><p>
Dad tried out the instant mashed potatoes, mixing boiling water and bacon bits. It was heavenly. We ate that along with soup - teriyaki flavoring + Ramen + raw chicken (mixed with boiling water). That was good too but it didn't compare to the mashed potatoes + bacon bits.
</p><p>
The sun is about to go down. I'm going to force myself up off my sleeping pad and join back with the guys sitting on some rock. I am really looking forward to sleeping tonight. Hopefully I don't dream that I'm gay again, that was lame. We'll see what kind of wacky story my brain puts together.
</p>
<h2 id="day3">Day 3 - Some Lake to Deer Creek Crossing</h2>
<p class="lead">July 26, 2011</p>
<p>
I woke up often during the night to air out my hot sleeping bag. The air was cool, but my bag is just so warm. Regardless, I got decent sleep.
</p><p>
After a quick breakfast, deconstructing the tent, and brushing my teeth, I was packed and about to head off down the trail. I get a kick out of being more responsible than Dad at least as far as taking care of my teeth. Paul politely requested that I wait a bit since it was going to be a while before everyone else was done packing, and it would be better if our group hiked together. So I went ahead to fill up water from snowmelt and waited. Soon enough I was on the trail with Dad, Mike, and Paul. The word was that Bruce and Chris were still packing but would catch up soon.
</p><p>
Despite a good night's sleep, I didn't feel great. I was sore and my feet already hurt. I mentally braced myself for an entire day of enduring pain. At one point Paul suggested that I hike my pack up on my back and tighten the waist strap. This helped tremendously.
</p><p>
Bruce and Chris were nowhere to be seen when it came time to go a bit off the trail to Red's Meadow Resort to pick up our resupply package. We hiked a half mile uphill to get there. The package was huge. Along with four 20-something hikers next to us, we had to toss some food out. I traded bacon bits for dried fruit with those hikers, telling them how great it tasted with instant mashed potatoes. We also saw the two girls that passed us the day before. Their names were Alley and Jule. Alley is a microbiologist. They plan to be out on the trail for an entire month!
</p><p>
After 50 minutes of lunch and resupply, we hiked back down the hill hoping to meet up with Bruce and Chris. Once there, we realized that the trail was actually back uphill, so we had to backtrack half a mile uphill.
</p><p>
Along the Pacific Crest Trail, we saw a bunch of construction workers working on the trail. When we approached, they would shout, "HIKERS!" and get out of the way. At one point I thanked a guy for working on the trail and he said, "Don't thank me, I have to do this to pay off my fines."
</p><p>
After a trek uphill, another meeting with Alley and Jule, and a lot of walking, I was thinking to myself, "My feet hurt, my back hurts, and I'm tired, but this is still a new day, so gear up." But much to my pleasant surprise, it was 3:40pm, late in the day, and only 20 minutes to our destination. At this point I started to mentally relax. Bruce and Chris were waiting for us at Deer Creek Crossing. That was good news. The bad news was they had been waiting for 45 minutes and wanted to move on. I switched from boots to sandals and again mentally braced myself for 6 more miles uphill.
</p><p>
I explained the Mineflayer AI I am working on to Bruce, who works on algorithms for a living, and he said that there were a few algorithms that fit the bill. I'm going to email him and ask what they are. I also told Paul I would show him Demetri Martin.
</p><p>
At 5:23pm, we found a decent spot and settled. A creek flows nearby and the mosquitoes are just as bad as last night. I think I killed at least 100 today.
</p><p>
While we were eating dinner and having a jolly old time debating various topics, a 20-something year old hiker named Kevin came by and asked if he could camp nearby. We were happy to have him come up and join the discussion. He is an American History major who "finally" got a minimum wage Starbucks job. I'm so lucky to have a profitable and enjoyable profession.
</p><p>
Everyone else is going to sleep. I don't want to feel like crap when I wake up so I'm going to put down the pen and zonk out.
</p>
<h2 id="day4">Day 4 - Halfway to Duck Lake to Mono Creek</h2>
<p class="lead">July 27, 2011</p>
<p>
It was slightly cooler last night, so I woke up fewer times to air out my bag.
</p><p>
We did our morning routine and hit the trail. Kevin hadn't woken up yet. After a small crossing where Dad and I hopped across rocks and the rest of the crew trudged through, we had a small pass ahead of us. I was feeling good. So good, in fact, that I hummed the Muse cover of the song Feeling Good. At least it was something different than Micah by Five Iron Frenzy which has been stuck in my head this entire trip.
</p><p>
Anyway, I soon left the rest of the pack behind with my speed. After 400 feet of climbing and walking through a small pass, I entered a beautiful hidden meadow with a pristine lake in the middle. I took a break, taking off my boots to let my feet taste the grass. It was hot and noonish, so I stripped to my underwear and jumped in. By this time, the group caught up. Everybody loved the spot, so we all had lunch, enjoyed the meadow, and got eaten by mosquitoes.
</p><p>
At this point our 6-person troupe was on the trail together. I started discussing philosophy with Chris and Bruce. We talked about quality of humans, what gives human life value, and various contrived examples of abortion. Bruce did not agree with my criticisms of grad school.
</p><p>
We had been walking and talking for a long time. Having a discussion is a great way to walk miles upon miles without realizing it. Eventually Bruce wanted a break, but Chris and I pressed on. We began to go up Silver Lake Pass. I told him the story about sleeping under my desk. I could tell he enjoyed the story, especially the part about my fortress in the Capstone room.
</p><p>
Storytelling is something that happens often and naturally on this trip. As I write, the guys are sitting around a fire and Dad is telling Chris about a biking injury.
</p><p>
Once Chris and I got almost to the top of Silver Lake Pass, we lost the trail. 4 other hikers joined us, in the same boat. I decided to wait for everyone to catch up, just to make sure we all went the same direction. Bruce came along first, scouted for the trail, and yelled back when he found it. Chris took off while I waited for Dad and Paul. Once they arrived we trudged up the snow. Looking down we saw a frozen green/blue lake. As we trudged on, we couldn't tell where the trail was due to snow. Good riddance! Instead of tedious switchbacks, we climbed up a near vertical wall of snow, getting to the top of the peak fairly quickly. Here was a stunning view, a quick snack, and relief from mosquitoes. Paul read from his JMT guidebook about what the next 6 miles held for us.
</p><p>
We decided to try to get ahead on the itinerary by going another 6 miles downhill. We took off, excited to get ahead of schedule. After a few miles my feet started to hurt so I swapped my boots with sandals. This felt much better. I fell into step with Dad as we hiked down. At one point we saw a really cool waterfall. As we walked further, we learned that we would cross just under that waterfall, which was right before another. What fun! I almost fell in.
</p><p>
After just a few more switchbacks, we arrived at another river crossing. This one was violent. Paul fell several times, skinning both knees, but somehow saving his camera. Dad fell a little bit and his pack briefly ducked under water. So far everyone had borrowed Bruce's poles to cross the stream. I, "Feeling Good," tried to cross without them. A few steps in it was clear to me that if I took another step I would topple over and be carried downstream. I motioned to Dad, who threw one of Bruce's hiking poles to me. I deftly caught it in the air and used it to maneuver across. Good thing I didn't miss!
</p><p>
At this point, it was obvious that Dad was not feeling good. He had some bad blisters that were aggravated by the crossings. It was well past 5pm, our "usual" "look for a campsite" time, but the rest of the guys wanted to press on. We walked for another few miles after that, and finally, after yet another crossing, Chris, Bruce, and I made the executive decision to call it quits for the day and set up camp.
</p><p>
This is a good spot; as I mentioned, we have a fire going, and there are some dry, flat spots for tents. Dad and I had our favorite mashed potatoes + bacon bits. Dad feels much better after warming up and relaxing a bit. His bag got wet but only a little. He put it out to dry and it looks pretty dry now. The fire seems to ward off the mosquitoes.
</p><p>
Everyone is tired and going to sleep now. Dad is pumping up his mat. I think we walked 20 miles today. A lot of my food is gone - we'll see if I make it to the next resupply.
</p>
<h2 id="day5">Day 5 - Mono Creek to Senger Creek</h2>
<p class="lead">
July 28, 2011
</p>
<p>
I hate mosquitoes.
</p><p>
I woke several times throughout the night, thankful, each time, that it was still dark and I didn't have to get up. I noticed that every time, the pain in my feet went down. Just when one more snooze would have completely obliterated my foot pain, it was time to get up.
</p><p>
Bruce started the fire again and I put a few sticks on to make it hot enough to burn some trash. As I burned the blue nylon t-shirt I had been wearing the entire trip, empty plastic bags, and moldy summer sausage, I smiled at the weight I would no longer be carrying.
</p><p>
After lazily finishing my morning routine, Dad, Mike, and Paul had already hit the trail. I wanted company so I waited until Bruce was done packing. I took the lead on the trail, however, and Bruce didn't keep up with my brisk pace, so I quickly got ahead.
</p><p>
This part of the trail was about a 1000 foot climb. Switchbacks became monotonous very quickly, but much to my delight, I had The Receiving End of It All by Streetlight Manifesto in my head instead of Micah. It wasn't to last though; by noon I caught myself mentally singing those silly lyrics with each step.
</p><p>
By 11am I became ravenously hungry. Dad gave me several large handfuls of sunflower seeds he had handy, but my stomach demanded more. We agreed to stop for a lunch break after crossing Bear Creek several miles ahead. I hiked with a fury so that I might get there sooner and abate my hunger. Along the way, I noticed that my feet were starting to feel quite bad even though it was still early in the day.
</p><p>
I ate pepper jack cheese, pine nuts, sunflower seeds, pecans, tomato &amp; basil crackers, and pepperoni. Finally my stomach was content. The mosquitoes, on the other hand, were not. They continued feasting on us all day. There was naught but one or two moments of relief.
</p><p>
After lunch, I switched to sandals and we began another ascent up to Seldon Pass. I fell into step with Chris; we talked about music conventions like Magfest and percussion events, our similarly aged younger brothers and their various life issues, and the pros and cons of raising kids in a small town versus a big city.
</p><p>
This was quite enjoyable. Before we knew it we were looking at Marie Lake - icy blue water surrounded by a meadow at the top of Seldon Pass. It seems amazing to me that there can be a lake at the peak of a mountain.
</p><p>
I sat down for a bit to wait for Dad and Paul, swatting at my legs, arms, hands, neck, and face. It was clear that Dad was having a tough time today. He has huge blisters from the balls of each foot to their respective toes.
</p><p>
Nevertheless, he trudged on with the rest of us. This part was several miles downhill - much easier on the body. My experiment with sandals was a huge success. My feet continued to feel much better ever since switching footwear at lunchtime.
</p><p>
I saw a marmot skitter across the trail. It was fun to see, but I still think the woodpecker that Paul and I heard the morning before was more interesting.
</p><p>
After a few miles I began to feel the familiar end-of-the-day fatigue. It is kind of an inexplicable pain that makes me want to stop and set up camp. If I try to pinpoint it, I find that yes, my feet hurt and my back is a bit sore, but that itself is not the fatigue. In the morning I can have the same pain but continue to hike 16 more miles.
</p><p>
The mosquitoes were relentless. Even while walking they attacked with a fury. At one point I stopped to wait for Dad, donated about a pint of blood, and didn't even get $20.
</p><p>
Finally we crossed Senger Creek and began to look for a campsite. We settled on one with a modest walk to water, mostly flat tent spots, and a fire pit. Dad and I shared our favorite potatoes + bacon bits and some rice. I have one more dinner and half a lunch left of food, but it should be fine since we're resupplying at John Muir Ranch first thing tomorrow.
</p><p>
I built a fire to drive away mosquitoes but it's hardly effective. They bite through cloth, nylon, and netting. When you kill one, the others are unfazed. I pulled my pants down for 5 seconds to drop a deuce, and now I have 3 itchy bites on my rump. I'm cowering in Dad's and my tent, with about 30 of them patiently perched on the netting, waiting for me to come out or accidentally press my back against one of the tent sides.
</p><p>
It might rain tonight. I felt a few drops and heard thunder in the distance.
</p><p>
The guys are sitting around in the "kitchen" talking. I'm deciding between joining them or going to bed immediately. The sun is still out but I'm sore, itchy, and lying on this pad feels oh so good.
</p>
<h2 id="day6">Day 6 - Senger Creek to Something Meadow</h2>
<p class="lead">July 29, 2011</p>
<p>
According to Dad, a hiker named "Stick Man" camped in the site neighboring ours, and he had been hiking the Pacific Crest Trail all the way from Mexico, starting May 1st. He was headed for Canada, but had to stop somewhere along the way to make more money for his expedition.
</p><p>
I fell asleep shortly after yesterday's entry. My aching body was happy with this turn of events, fixing itself up completely while I slept. Once I finally stepped onto the trail, I felt fresh, strong, and ready for the day.
</p><p>
It was only a few short miles downhill to John Muir Ranch, our final resupply. The food we packed there had to last us 7 days. Along the way, Paul and I saw a peculiar native bird. It looked and acted like a combination of a quail and a chicken. I heard its baby chicks softly chirping in a nearby bush.
</p><p>
At the ranch, our resupply bucket was a welcome sight. In the words of Paul, "This is better than Christmas!" Bruce used the hiker buckets with leftover food to resupply for himself and Chris, who woke up late and was missing. I scored a rice bar and a peanut-butter-and-honey tortilla for an immediate snack, and a bag of instant potatoes for dinner later.
</p><p>
Dad bought some Deet, and we all tried our cell phones for reception, but of course nobody had signal. There was a handy spring water faucet for filling up water bottles, and a scale for weighing packs. Everyone's packs were between 34-37 lbs, my pack setting the high bound. We spent about 90 minutes at the ranch, organizing our food, eating, and discussing what to do about Chris. Just when we were about to take off, leaving Bruce behind to wait, Chris waltzed in. He had taken a wrong turn and had to double back.
</p><p>
"Chris!" Bruce shouted from around the corner.
</p><p>
"Yeah Dad?" Chris replied.
</p><p>
"Welcome!" Bruce exclaimed, with just a hint of patronization. Mike and I chuckled.
</p><p>
We left the ranch, virtually staggering under the weight of our new food. We took a breather at a junction hardly even a mile away, next to a roaring creek with a wooden bridge. Dad gave me his camera to borrow for a while, a suggestion I made a few days ago.
</p><p>
After we got up and crossed the bridge, the trail gradually took us for a tour along the lower left side of a thriving valley, with waterfalls to our right above, and a violent river to our right below. I fell into step right behind Dad for a while, until he suddenly stopped and I bumped into him.
</p><p>
Looking ahead, not 20 feet away, I saw a doe and her 2 fawns. I snapped a couple photos with Dad's camera and waited till Paul was done doing the same before moving on.
</p><p>
Eventually there was a convenient spot for water near the trail and I went to get some. The others pressed on, knowing that I would catch up soon. I sat on a mostly flat rock, sucking the river water through my bottle's filter, enjoying the break. I lingered, somewhat guiltily, watching the guys get further and further ahead. Eventually the mosquitoes started biting, so I gathered my pack and resumed the trail.
</p><p>
At this point, I sauntered leisurely along in solitude, knowing that Bruce and Chris were still far behind. I started thinking about the length of this trip, and my new home in Seattle, my new friends, Sara, and starting work at Amazon in less than 2 weeks. For the first time on this trip, I felt homesick.
</p><p>
The feeling passed however, as I set my mind to new tasks, for example, where to put my pack while taking a dump. Everywhere I looked there were anthills riddling the ground. Eventually I resigned myself to shaking my pack after doing my business, knowing that I was surely far behind the others. I wasn't even sure if I was following the right trail. Soon enough, though, I caught them on the uphill. It saddens me that one day, I, too, will be old and slow.
</p><p>
After a short hike into a denser area, I shouted a greeting to 4 hikers lunching on a nice sunny rock as I passed them by. I carefully waded across the rushing but relatively shallow creek, walked a bit, and then found my own sunny rock to rest on while munching some snacks and waiting for the rest of the group.
</p><p>
The sun beat down angrily. I wondered if I would burn, having forgotten sunscreen that day. Soon, Dad, Mike and Paul showed up, and eventually, so did Bruce and Chris. We discussed how far we planned to go today, but were interrupted when a single, large ball of hail landed on my pack. There was a brief moment of hesitation, and then everyone burst from their relaxing position into action mode, getting out raincoats and tarps to protect packs from taking on water.
</p><p>
It sprinkled for less than a minute. I was sorely disappointed. But 10 minutes later, the sky opened up in a deluge of rain and hail. It was amazing how fast the trail turned into a stream of icy water. The storm kept up for at least an hour. Instead of accidentally getting pebbles stuck in my sandals, I was getting hail. Amazingly, my feet kept warm.
</p><p>
Finally the group gathered. It was still raining, and there were no dry camp sites for 8 miles. As it was already 5 o'clock, we opted to try to find a decent place on the hill. We found 3 soggy but not swampy places and quickly pitched tents.
</p><p>
Mosquito myth: Mosquitoes don't come out while it's raining. False, false, false, false, false! Oh, here's another bite. False!
</p><p>
Most of Dad's and my stuff is wet. I went out while Bruce was heating water and made dinner for Dad and myself, bringing it to him in the tent.
</p><p>
I'm getting cold. I'm going to try to get warmed up and go to sleep. We have a hard day tomorrow and have to get up early.
</p>
<h2 id="day7">Day 7 - Muir Pass</h2>
<p class="lead">July 30, 2011</p>
<p>
Although Dad and I were wet all night long, we were warm. By morning, the inside of my sleeping bag was dry; the moisture had steamed out while my body heated it up.
</p><p>
The others packed up quickly, wanting to hit the trail as soon as possible. I was the slowpoke. While Mike, Bruce, and Chris took off as soon as they packed, Paul and Dad waited 5 minutes for me.
</p><p>
Today I prepared to hike a long way without stopping. I carried no water, to save weight, and instead kept my bottle in my shorts pocket to drink when I crossed a stream. In my other pocket, I carried deshelled sunflower seeds.
</p><p>
As we climbed, we came to an empty trash bag sitting in the trail. What luck! I sorely missed this last night - it would have kept my bag dry. Dad fell behind a bit. He was about .25 miles back when Paul and I caught up to the rest of the group. Mike and Paul, "planners," as Bruce calls them, were concerned about our pace. There was talk about ending the entire trip. Eventually, we decided to break into groups of two - the people who shared tents - and meet at the end of the day.
</p><p>
With that, Mike and Paul were off. They moved fast, concerned about the thunderhead clouds ominously forming above the peak behind us. I took the opportunity to rearrange my pack, getting out pepper jerky and trail mix.
</p><p>
When Dad arrived I informed him of the plan and we were off, leaving behind Bruce and Chris, who were still messing with their packs. I still hiked in wool socks and sandals, even though the terrain was snowy. We trudged endlessly through the snow fields, wondering when we would finally be over the pass. The snow had settled in a most unfortunate ruffle pattern, making missteps and sliding backwards common and exhausting.
</p><p>
The looming clouds which had been threatening us all day finally started to deliver. After a few drops of rain, Dad and I put our packs down to dig out our rain coats, and I switched from sandals to boots. What a difference! I seemed to power right through the snow with those bad boys.
</p><p>
Yet the rain and the slope increased. Icy wind bit at my ears. I led, Dad trudging along slowly behind me. We passed several hikers going the other way, one of them pointing out a triangular shaped rock hut marking the top of the pass.
</p><p>
Just when the rain was getting violent, we made it to the door of the hut at the top of the pass. I tried the handle. The door slowly swung open. I couldn't see a thing, but it was warm, and I was greeted with a chorus of "hello"s. After my eyes adjusted, I saw about 8 hikers and a lot of gear seated on and around the bench that lined the wall.
</p><p>
Dad ate a snack and we waited out the rain. The hikers were friendly and talkative. The gentleman to my left, who I later found out was named Jaffey, was interested in a party of two to my right: a father heating water for "hot chocolate mixed with cider" and his 17-year-old daughter.
</p><p>
As soon as the rain let up I was out the door and down the other side of the pass. This side was steep. It was possible, however, to sort of run-ski down, making for quick and effortless travel. After a ways I looked back and was happy to see Dad already on the trail. As I watched, he slipped and fell, pausing a bit before getting up.
</p><p>
Eventually, he caught up and we hiked along the snowy mountainside together. At one point, the trail led straight downhill, and there was a perfect butt-shaped halfpipe in the snow alongside it. Dad did not hesitate. He plunked down, pushed off, and then gracefully and effortlessly sped down the causeway. I, of course, followed, getting snow in my pants and dropping my bottle toward the end.
</p><p>
We crossed a few streams and hiked down another mile or two. The terrain went from snowy to lush and green. We took a break to swap footwear and so that I could drop a dookie. Just when I was doing my business, it started to rain. We had no idea how far ahead Mike and Paul were, nor how far behind Bruce and Chris were. We decided to pitch the tent and wait out the rain.
</p><p>
Minutes after we finished the tent, Bruce and Chris came by, decked out in rain gear. They considered waiting it out with us, but ultimately wanted to keep moving.
</p><p>
So we waited. The rain pounded relentlessly. It was nice knowing my entire pack was in the tent and out of the rain. Dad and I snoozed for several hours. The rain just wasn't letting up. Finally, I woke to the sound of Dad putting shoes on. The rain had stopped, and it was sunset. Dad and I decided to quickly pack up and hoof it down the trail in hopes of catching the others. In two minutes we were packed and making headway.
</p><p>
Walking on fresh legs at sunset after a rainstorm is beautiful. The colors all change, the smell is different, and there is a spookiness to it. Dad pointed out a large rock to the right that looked exactly like a whale's face, and somebody had lined up rocks on the bottom layer to make it look like teeth.
</p><p>
Not even an hour after we left, we spotted a campfire and Paul. We were very glad to see each other, thinking that we had been split up. At this campsite, 12 hikers in total have gathered, including Jaffey and his friend. It was fun to sit around the fire and swap stories. Apparently everybody knows Kevin, the hiker who stayed a night with us. Bruce heated up some water for Dad and me - we ate stuffing and broccoli flavored rice.
</p><p>
Paul and Mike are still concerned about our pace; they want to get up very early and try to get over the next pass. Paul says it would ensure our success on this trip. I say let the planners plan. I'll follow along.
</p><p>
I'm running low on lunch food.
</p>
<h2 id="day8">Day 8 - Mather Pass, Almost</h2>
<p class="lead">July 31, 2011</p>
<p>
Dad started getting ready next to me while it was still completely dark. I saw headlamps on in Paul and Mike's tent, and Bruce and Chris's tent. Thinking it was close to 3am, I was slow waking up. Finally I willed myself to begin putting on warm clothes and then tried to remember what to do next.
</p><p>
Eventually, I had my pack together, ate oatmeal, and brushed my teeth. Turns out it was actually 5am. Examining my bear canister, I became increasingly alarmed at my lack of food. Nevertheless, I needed energy for the long hike ahead, so I got out lunch - a bag of rice and a bag of pecans - and stuffed it into my gym shorts pocket to eat on the trail later.
</p><p>
I couldn't find my nice long wool socks, dirty as they were, so I wore one of the two remaining pairs of cotton socks with my sandals.
</p><p>
The talk back at the camp was that we had to make it over Mather Pass today to ensure the trip's completion. If Dad and I hadn't made it to camp last night we would have had to cut off the trip early. On top of that, we learned from a sign by a ranger station that there was a rock slide blocking Whitney Portal. Dad thinks that "they" will clean it up by the time we get there. Mike doesn't, but thinks we can walk to town if we have to.
</p><p>
I was determined to not be a part of the problem. I kept up a brisk pace, uphill, flat, and downhill. There were several creek crossings with no bridge across. While others tried to keep their shoes dry, I stomped right on through in my sandals. Soon I had a large lead on the others, except Mike, who was close behind. The trail became overgrown with dense lush plants and grass, and I jumped when something with a furry tail scampered under my legs. I saw Kelley Flowers that Paul had pointed out to me - yellow, upside down petals with brown spots. Also I was surrounded by Jeffrey Pines. Dad must have been having a heyday.
</p><p>
The trail began going steeply uphill, and I ate the bag of not-very-good rice, worrying about running out of food in a few days.
</p><p>
Fun mosquito fact: Mosquitoes can fly faster than you can walk.
</p><p>
Mike and I kept climbing, watching the foggy sky ahead forming ill-boding clouds. We did not like the look of that at all. Sure enough, after another mile it started sprinkling. I heeded the warning and stopped to put on a sweater, rain coat, and switched to boots.
</p><p>
The rain started to pour.
</p><p>
I found Mike just ahead, huddling under a patch of awkwardly shaped pine trees. He looked miserable. I waited there with him for about 30 minutes until Paul showed up. There was no evidence of our failed illegal fire attempt. Both my lighters were out of fluid, and the sticks we had tried to burn were soggy.
</p><p>
The three of us continued walking in the rain until we realized that there were no more camp sites ahead, and the relatively flat spot we were at presently was where we needed to wait for Dad, Bruce, and Chris. We used Mike's fly and sat on my ground cloth to shelter ourselves and our packs from the rain.
</p><p>
Finally, the rest of the guys showed up. Dad came under the tarp for a bit, but it was uncomfortable; when the rain let up just a little, we scrambled out and pitched our tent as fast as we could.
</p><p>
At this point I informed Dad of my food shortage issue. He looked concerned and offered me some crackers which I gladly accepted. Not that it would be enough. I decided to skip dinner that night.
</p><p>
Inside our tent, Dad and I examined our belongings. Surprisingly, our stuff got only a little wet. That garbage bag I picked up yesterday had worked wonders helping my clothes and sleeping bag stay dry.
</p><p>
With not much else to do, we snoozed for several hours. Finally it stopped raining, and Mike yelled at us to come out and "be social."
</p><p>
We played several rounds of Hearts while I ate the tuna that I turned down at the ranch but Dad had packed anyway. Bruce served a dish of Fritos and hummus. So delicious. Paul gave me a bag of cereal and pumpkin seeds. I feel pretty guilty about not packing enough food at the John Muir Ranch and mooching off of everybody. Not guilty enough to turn down their self-sacrificing offers and go hungry, though.
</p><p>
Mosquito myth: mosquitoes can't survive at high altitudes. False!
</p><p>
There has been much speculation about where, how far, and how long the pass is ahead. I think we were probably over-estimating its difficulty. That being said, we have to do 2 passes tomorrow or cut the trip short. I think it all depends on the weather.
</p>
<h2 id="day9">Day 9 - The "Matherhorn" Pass and Pinchot Pass</h2>
<p class="lead">August 1, 2011</p>
<p>
High up in the thin air, the temperature dropped below freezing during the night. Dew formed on the clothes I laid out to dry and then froze. Inside my dry down-feather sleeping bag, I slept, warm and cozy.
</p><p>
That is, until 5am when I felt Dad waking up and putting on his warm clothes to start the morning routine. I could clearly hear the bridge of "Mermaid" by Anamanaguchi, playing in my head. Our performance today determined whether or not we would have to end the trip early. If we could get over both Mather Pass and Pinchot Pass, we would ensure success. The guys got ready especially fast; I was still shivering and crunching on frozen dry oatmeal when Paul and Mike took off. Not wanting to fall too far behind, I skipped brushing my teeth.
</p><p>
Not that falling behind was anything to worry about. The conditions of the pass were perfect - the snow had solidified and become very hard overnight. With food in my belly and plenty of sleep, I practically pranced up the steep treacherous pass. For some strange reason, I had the Jeopardy theme song with "I'm a little teapot" lyrics stuck in my head.
</p><p>
Dad and I made it up to the top of the pass first and had a celebratory piss off the edge. While waiting for the others I rearranged my pack a bit, changing into lighter clothes now that the glorious sun was shining on me. When I finished, Dad, Paul, and Mike had started down and Chris was still waiting for Bruce. Luckily I noticed Dad's camera he had forgotten. Yelling at him to turn around, I snapped a picture of his face when he realized he'd forgotten it.
</p><p>
I started down the steep trail, planning to eventually catch up with Mike and Paul in front. With the camera, though, I found myself looking for picture opportunities. I kept up a brisk pace, despite that, and soon enough gave Dad his camera back. I switched from boots to sandals and focused on catching up again.
</p><p>
I had this sharp pain in my left foot, in the tendons when I lifted my toes. I think it may be a lipoma causing problems. I want to have a doctor look at it.
</p><p>
Despite this weird new pain, I hurried and soon passed Dad again while he switched shoes to cross a creek. With my sandals I saved time by simply splashing through without changing footwear, but not without first dipping my bottle and having a drink. At one point Mike found a dead deer just upstream in a creek while looking for a dry crossing. I hope I didn't drink from that one.
</p><p>
I stopped putting on sunscreen two days ago, and I have not been burned. My only conclusion is that I have reached Level 10 Hiker and am now immune to the sun. I can't wait until I gain intrinsic mosquito repellence. 
</p><p>
I had to run to keep up with Mike and his stupid long legs. After a while we came upon Paul filming a nearby doe. The three of us walked along until Mike stopped to change shoes for a crossing, while Paul and I waded carefully across without changing footwear.
</p><p>
Paul and I were pleased at the good weather, and the good time we were making. At some point Paul stopped to change into lighter clothes and I went on alone. I walked for some time, getting hungry and thinking about my rations. Finally I stopped for lunch: one measly packet of crackers, and a medium stack of pepperoni. Delicious. I snuck another small stack of pepperoni. I wanted more, but I knew I had to ration.
</p><p>
After that I kept on. I tend to go slowly when alone and in the lead. Sure enough, Paul and Mike caught up after a mile or two, and I followed them, switching into my boots for the upcoming Pinchot Pass. There was a bit of uphill climbing, but then we were there, along with two hikers who went by Major Upchuck and Bounce Box. It was a couple who had met on the trail a month ago. They were friendly, and they know Stick Man. At one point Upchuck asked me if I had a bowl, which seemed a queer question, until I realized it was another euphemism for marijuana.
</p><p>
After a short break at the top of Pinchot Pass, Paul, Mike, and I headed down. We walked for a long way. We could see some clouds boiling in the mountains ahead, threatening more nasty weather.
</p><p>
Sure enough, after another long section of hiking, the rain started to come down. We stopped to get out our rain gear and then pressed on. By this time, I was starting to get ravenously hungry. It was hard to take my mind off food, and I began to feel somewhat weak. Still, we kept walking and walking, until finally I had to sit down and have a snack. Paul and Mike walked on. Before I even had my canister unpacked, Paul shouted my name. I put my pack back on and followed. A suspension bridge led from the trail over a roaring river to what Major Upchuck had called a "tit perfect" campsite. Hardly any mosquitoes, plenty of room, more than one fire pit, and a toilet.
</p><p>
By this time the rain had calmed down, but it was about to pick up again, so Mike and Paul pitched their tent. I had to wait for Dad, who had half the tent, so I used the time to relax and take inventory of my food. 1 breakfast short, 1.5 lunches short, and 2 dinners extra. I ate a bag of not-very-good rice that Dad had given me and that I had just found was not sealed properly.
</p><p>
I worried that Dad was too far behind and would not make it today. Just when I went to nervously check the bridge, he was there, waving at me. I was amazed. He must have been in terrible pain. I grabbed the tent poles from him as soon as possible and began pitching the tent. Not a moment too soon. Once the fly was up it started raining hard and we took shelter within. It even hailed a bit. Not too long though, and the rain stopped and the sun came out, so I ventured out.
</p><p>
There were about 10 tents in this site. Some folks were playing Frisbee. Bruce and Chris's tent was not one of them, however. They were nowhere to be seen. Dad came out of the tent at Paul's coaxing, wanting to take his picture in front of a giant Jeffrey Pine in the sunset. After that we played Hearts, Dad sharing salmon with me.
</p><p>
After round 3, Bruce and Chris finally showed up - 2 hours after Dad. They had gotten lost and followed a wrong trail. Instead of the 20 mile day we had had, theirs was 25.
</p><p>
Bruce, being the generous man he is, immediately started heating water. I split with Dad the chicken noodle soup I got from the hiker's box back at John Muir Ranch, which was filling and delicious.
</p><p>
It was getting dark, and finally a family started the fire. They were friendly and happy to have me burn my trash. I sat with them and talked. They are taking 7 days to hike 40 miles, a much more leisurely and fun trip. We had fun swapping short stories about our home town climates, and a teenager showed me, on his iPhone, a video of him jumping off the roof of his house into a snow bank. We all headed for bed when the fire started to die. Gotta wake at 5am again tomorrow.
</p>
<h2 id="day10">Day 10 - Glen Pass - 15 miles</h2>
<p class="lead">August 2, 2011</p>
<p>
I slept deeply, without waking or stirring the entire night. I dreamt that I was the protagonist in a Rube Goldberg Machine, and then that I was doing covert ops with David Hayden on the Death Star.
</p><p>
"Wake up," Mike intoned, from outside the tent Dad and I were sleeping in. I did better at waking up this time, almost immediately sitting up to put my warm clothes on. I didn't really understand why we were getting up at 5am again - we were no longer in a hurry - but I felt okay, so I saw no reason to fuss.
</p><p>
By the time I had eaten breakfast, brushed my teeth, and found the rumored toilet on the hill, the guys had all left, leaving me to play catch-up. Not that I minded. Bruce had left me a full package of freeze-dried strawberries on my bear canister, claiming they were "extra." Hard to believe, but even harder to turn down.
</p><p>
As I was packing, one of the nice ladies from the campfire the previous night came out of the tent and gave me a handful of almonds, citing "motherly instinct." I gave her warm thanks and ate them all right then, continuing packing.
</p><p>
As I left and bid farewell, she asked, "Is that your jacket hanging up right there?" I mentally kicked myself.
</p><p>
"Why, yes it is," I replied, again thanking her profusely. Oh the misery she has likely saved me.
</p><p>
Hiking in my shorts and skintight t-shirt, it was cold. The valley's mountainous sides were so tall that the sun wouldn't come out for several more hours. Hiking kept my core body warm, but my arms and hands became numb.
</p><p>
Finally, the trail went so far as to escape the giant valley walls, and the glorious sunshine fell upon my face. I soon passed Dad and Bruce, who were moving slowly. Dad had finally used moleskin, but it didn't appear to be helping much.
</p><p>
In a few miles I passed a large party of elderly looking hikers, a few campers just waking up and having breakfast, and a ranger station. I considered going up there to ask for a weather forecast, and any news about the rock slide at Whitney Portal, but I wanted to catch Mike and Paul.
</p><p>
I soon did after hiking around a lake. They were preparing to start the real climb that was Glen Pass. They left just as I arrived, while I stayed a few minutes to swap footwear. The boots do much better in snow.
</p><p>
The pass was very steep, and very cold. I shivered at the unceasing icy wind. It seemed to get stronger the higher up I went. I paused to look down and saw a half-frozen lake with beautiful blue ice. Just as I was about to turn and keep walking up the switchbacks, I heard someone calling out. Spotting Dad and Bruce far below, I waved and hollered back.
</p><p>
Once over the pass, there was a nice downhill dirt slope that was perfect for speed-walking. As I gained on Mike and Paul, I passed about 25 hikers going up the pass the way I came down. They must be doing the 40-mile something-loop trail the campers last night told us about.
</p><p>
I caught up to Paul and Mike resting in a small, sunny, grassy area, where the wind died down. I joined them, sitting on the ground and leaning my back against a rock, which felt wonderful. Paul gave me a handful of M&amp;M's and quite a few deshelled sunflower seeds. M&amp;M's are one of my least favorite candies, but as I munched that sugary snack right then, it was just about the best thing I had ever tasted.
</p><p>
Paul and Mike ended their break and took off, but I lingered. The spot I had chosen to sit was surprisingly comfortable, and there were no mosquitoes. I finished off my pepperoni and the strawberries Bruce had left me, with a somewhat guilty conscience but a mostly contented stomach.
</p><p>
I caught Dad but then changed back to sandals. I caught him again but then had to do business again, dumping my load into some poor groundhog's hole. Once I caught Dad a third time, I walked with him until camp. He was really hurting, his giant blisters throbbing with each step. We were very happy when we finally saw Bruce's tent.
</p><p>
The two guys we had crossed paths with countless times today, Yuri and David, were making supper at our camp. They cooked a delicious-smelling hot dish that made my grumbling stomach jealous. Dad showed everybody his battered feet, and Yuri gave him Neosporin, blue blister fixer pad things, and tape to fix up his feet. Here's hoping it works.
</p><p>
Dad and I had potatoes and rice for dinner, but then Bruce made a delectable vegetable stew, complete with onions he picked along the trail, and shared it with everybody. Then he made huevos rancheros and also shared that. If there is anybody who lives up to the "love your neighbor" principle, it is Bruce. Did I mention the huevos rancheros were to die for?
</p><p>
The end is so close, everybody can taste it. We're already fantasizing about the hot restaurant food and beer we're going to get when we're done, and we have the mileage of the next few days planned out. We have 36 total miles left. Our last day, Mt. Whitney, is going to be 14 and it's not flexible. So that leaves 22 to split between tomorrow and the next day. Currently the plan is 14 tomorrow and 8 the next. That way we can rest up for the many-thousand-foot climb that is Mt. Whitney. Also we plan to wake again at 5am, for some reason unknown to mankind.
</p>
<h2 id="day11">Day 11 - Forrester Pass - 13 miles</h2>
<p class="lead">August 3, 2011</p>
<p>
The last pass.
</p><p>
I struggled to find motivation to get out of bed. It was cold and dark, and we didn't have to go very far to stay on schedule - not to mention that we had an entire day to kill tomorrow.
</p><p>
I ate my oatmeal breakfast hungrily. Food rationing is a terrible thing. It leaves me with a constant element of stress, and makes me think about food more often than I would if I were not rationing, causing me to feel hungrier. I'm eating plenty, especially with Paul graciously and consistently giving me some of his own precious food. But still, the very knowledge that I must ration gives me constant hunger and uneasiness.
</p><p>
I finished packing just before Dad. He had gone to bed early and slept well. His blistering feet had amazingly healed almost completely overnight. He changed the moleskin and secured it with tape that Yuri had given him last night. We were last to hit the trail.
</p><p>
Walking behind Dad, I could see the ease with which he stepped. Compared to yesterday, he looked pain free. He said as much, too.
</p><p>
We caught up to and passed Paul on the steep, frozen switchbacks leading up to Forrester Pass. We passed a group that had camped at the frozen lake. It must have been a cold night for them.
</p><p>
Eventually, the trail was covered in snow and we followed footprints instead. Those led laterally across the ice-hard snow mountain side, then directly up the mountain toward the top of the pass. I would not say that I had altitude sickness, but I could tell my body was uneasy with being that high up. I did notice that I became short of breath much more easily.
</p><p>
Our group took a brief respite at the top. We all tried our cell phones but without success. Mike claimed that he got a text message through, but I doubt it. I don't think the text message protocol includes a confirmation. I'll have to double check that when I get back. <small>[Note: it depends on the network. The SMS spec does define optional delivery status reports, but carrier support varies and confirmation is not guaranteed.]</small>
</p><p>
Our group continued on down the other side of the pass, but I lingered to eat a snack - some pepper jack cheese and a few almonds and peanuts. Dad shouted up - he had forgotten his camera again, so I grabbed it. Then Yuri and David made it up. I learned that they had run out of food and were trying to make it out today. I gave them the 700-calorie bag of pumpkin seeds Paul had given me that I had been looking forward to eating. This partly eased my guilty conscience about mooching. They thanked me, and then pointed out a black sleeping pad that someone left. Chances were decent that it was either Bruce's, Chris's, or Mike's, so I decided to take it. Then Yuri asked if I might send them pictures of the hike since their camera died. I got their email addresses and snapped a nice photo of them with the mountains in the background. Then someone made it up the other side and brought the good news that the Whitney Portal rock slide was cleaned up and passable again.
</p><p>
By the time I was descending the pass, the others were far ahead of me, and I had forgotten the sleeping pad. I dawdled along, not caring too much about the speed I was going on this short day. At one point I lost the trail to the snow, but saw it way in the distance and simply walked directly toward it.
</p><p>
Finally I got intolerably hungry again and stopped to have instant potatoes, and to change clothes and shoes. Dad whistled for me but I had already unpacked. Turns out he was waiting for me just ahead, simply because he enjoyed walking with me.
</p><p>
We walked together for the rest of the day. We saw Mt. Whitney looming ahead, and even spotted the hut at the summit. This was the first time we actually saw the "finish line" of our trip. This short day - about 13 miles - felt long to me. I can definitely say that at this point, I would much rather be done and home. That is not to say, however, that I don't want to finish the trip. I will thoroughly enjoy the sense of accomplishment.
</p><p>
I think it's funny that we are considering tomorrow - 8 miles uphill - a day of "rest."
</p><p>
When Dad and I finally arrived at the spot where the guys were waiting, we had the biggest interpersonal conflict we have had yet. Everyone except Dad wanted to continue hiking today, so that we could hike out tomorrow and finish the trip a day early. Dad was simply not up to the task, with his feet the way they were. Nobody wanted to be a jerk about it, but there was definitely a sense of disgruntlement. We ended up camping though, and after some time, spirits lifted and everyone warmed up to the plan. Mostly.
</p><p>
Dad and I took food inventory together. He gave me a bunch of lunch and breakfast food, and I will be providing a nice big dinner for tomorrow. Bruce gave us noodle-and-vegetable soup for tonight. We mixed it with our minestrone soup packages and it turned out delicious. I couldn't get enough though - I'm still hungry.
</p><p>
Zeo made it! We thought we wouldn't see him again, but sure enough, he came strolling by at 7pm. Ironically, he's probably going to make it out a day before us. Slow and steady wins the race. 
</p><p>
Paul, Mike, Dad, and I played a full game of Hearts. I won, going for a run 3 times in a row and succeeding twice. When I retreated to the tent to hide from mosquitoes, Dad, Bruce, Mike, and Paul engaged in a religion/philosophy discussion.
</p><p>
The mosquitoes here are thick - possibly the thickest of any campsite so far. Yet I'm the only one who wears his mosquito net. I don't understand.
</p><p>
Dad had been calling Chris "sunshine" for several days because he likes to sleep in. Chris doesn't seem to mind - he's a good sport.
</p><p>
Almost every night we go to sleep with the soothing sound of a rushing nearby creek. Dad wanted me to mention that.
</p>
<h2 id="day12">Day 12 - Mt. Whitney - 22 miles</h2>
<p class="lead">August 4, 2011</p>
<p>
I slept long into the morning. Dew had formed on Dad's and my sleeping bags and frozen into ice - a testament to our high altitude. Dad got up first, drinking hot coffee for the first time on the trip, thanks to Bruce and his stove. Dad enticed me out of bed with warm noodle soup and some oatmeal.
</p><p>
Our lazy morning quickly turned frantic as the mosquitoes swarmed in. It was not gradual; one moment we were peacefully minding our own business, and the next we were being attacked by 7 swarms. One for each of us, and another for Zeo.
</p><p>
Soon only Dad and I remained, trying to pack without being eaten. Brushing my teeth with a mosquito net on probably looked silly. Eventually we, too, were on our way.
</p><p>
Dad was in a chipper mood, feeling good and knowing he had another day of rest before tackling Mt. Whitney. Or so he thought. After a quick 6 miles, we came upon the guys all waiting around a junction.
</p><p>
"We need to have a discussion," Mike said. I laughed, because I had predicted this. Everyone but Dad really wanted to summit Whitney today, and finish the hike a day early. The alternative would be to finish in just 1 hour, by 9am, and have nothing to do for the rest of the day.
</p><p>
Dad saw that he had to either go along or be the "stick in the mud," as he called it. As I watched, his expression and demeanor flashed quickly from the horror of betrayal, to quiet acceptance, to grim determination.
</p><p>
Once it was "settled" that we were going to summit Whitney today and make it out a day early, I opened my bear canister and scarfed down 2 days' worth of lunch food. Oh sweet calories.
</p><p>
We took off down the trail, knowing a long day was ahead of us. Dad's mood had visibly changed. He knew the last 10 miles would be misery. As if fate had decided to add insult to injury, Dad stumbled and partly tore a muscle in his leg. It was not enough to stop him from walking, but it did give him extra pain with each step. Paul gave him a dose of Excedrin, which seemed to help after a bit.
</p><p>
By the time we made it to Guitar Lake, which looks more like a bottle of Jagermeister than a guitar if you ask me, Dad's determination and grit had completely overcome his fear of pain and inability to make it today. He kept a strong and steady pace all the way to the junction.
</p><p>
It was a long way up. The summit of Mt. Whitney is at 14,500 feet - today I would climb my first "fourteener." Although the sun beat down strongly, unopposed by clouds, it was downright cold in the thin air with the strong upward wind.
</p><p>
Finally we made it to the junction. One way would take us down to Whitney Portal, our exit point, and the other way, up to the summit. We left our packs at the junction, dressed warmly, and ascended. I felt like I could just flap my wings and fly to the summit without my 35 pound pack weighing me down. We made it up to the tippy top, took the obligatory picture, and made it back down to the junction in less than 90 minutes. That is a guess. I don't have a watch.
</p><p>
Dad was able to get cell reception on the summit and talk to Mom a bit. His mood was obviously improved after that. Dad seemed to walk with a vigor on the way down. I followed happily. The sooner we made it down, the more likely we would be able to have hot greasy restaurant food, and beer for the guys who liked it.
</p><p>
The switchbacks and steep down-steps were brutal. Still, we passed many people on the way down from Whitney, and were passed by none. Although we had walked far and still had far to go, we were invigorated by the finish line. I got hungry again and snacked on deshelled sunflower seeds.
</p><p>
It was a long walk to the portal, but it was downhill, and Dad somehow felt in good shape the entire time, keeping up his brisk pace.
</p><p>
Finally, we could see the portal and were on the home stretch. Someone we passed mentioned the restaurant/shop, saying that it would be open as late as 8:15pm, late enough for us to get burgers and beer. I let out a whoop and Dad and I high-fived.
</p><p>
When we finally got there, Dad was hurting, but we had made great time, arriving not too long after Paul and Mike. As we ordered food, Bruce and Chris arrived. We had all hauled down that mountain, eager to finish.
</p><p>
When the food was ready, I asked for barbecue sauce and doused my chicken sandwich and fries. It tasted like just about the best thing that I ever had. I even ate the pickle on my plate. The guys enjoyed a few beers too, clinking bottles together in merry celebration of completion. Paul responsibly contacted the van service guy and had him come pick us up and take us to Lone Pine village.
</p><p>
Here we had McDonald's milkshakes, showered, and are sleeping in a hostel. Tomorrow we have breakfast and then a bus ride to a bigger town with a rental car place, so we can get to the Los Angeles airport for our flights on Saturday.
</p><p>
I connected my phone to WiFi and saw all the stuff I missed while hiking. Looks like my rent check didn't get sent out correctly. On the other hand, Sara didn't forget about me! I invited her over for dinner Sunday, and she offered to give me a ride from the airport. I'm also really looking forward to my first day of work at Amazon on Monday.
</p>
]]></description>
      </item>
      <item>
         <title>Lemming - PyWeek #12 Game Development Journal</title>
         <pubDate>Sat, 02 Apr 2011 12:00:00 GMT</pubDate>

         <link>https://andrewkelley.me/post/lemming-pyweek-12.html</link>
         <guid>https://andrewkelley.me/post/lemming-pyweek-12.html</guid>
         <description><![CDATA[<h1>Lemming - PyWeek #12 Game Development Journal</h1>
<p>
The <a href="http://pyweek.org/12/">PyWeek #12</a> challenge starts in
2 days, 16 hours, and 11 minutes.
</p>
<p>
And I have 23 hours of homework to do in order to clear the Week of Py for
development. 
</p>
<p>
I've plotted out THIS week in Google Calendar (the week before PyWeek) - entries for homework time, break time, and even sleep time. I'm following a strict schedule for optimum time management during PyWeek. The start date is 5pm on Saturday for me, so I've rotated my sleep schedule to go to bed at 9am Saturday and wake up just as PyWeek starts.
</p>
<p>
I'm getting pumped up. I can't wait to start!
</p>
<h2 id="day-1"><a href="#day-1">End of Day 1</a></h2>
<p>
Status: Tired, Hungry, Have to Pee
</p>
<p>
Idea: side-scrolling platformer. You're an odd-shaped lemming-like creature leading a stampede of 9 clones. When you die, control is instantly transferred to the next one behind you. Your death affects the environment, so a large part of the gameplay will be sacrificing yourself to get the next guy behind you to safety. Dying should be fun and comedic.
</p>
<p>
Screenshot:
</p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-screenshot-1.png">
<p>
<a href="http://s3.amazonaws.com/superjoe/temp/pyweek-day-1.mpg">Video</a>
</p>
<p>
<a href="https://github.com/andrewrk/lemming">Source code on GitHub</a>
</p>
<p>
Lines of code so far: 612
</p>
<p>
Decided to go with a tile-based level format.
</p>
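<p>
For context, "tile-based" just means the level is a grid of small tile ids rather than free-form geometry. A minimal sketch of the idea (all names here are illustrative, not the actual Lemming code):
</p>

```python
# Sketch of a tile-based level format: the level is a grid of small tile
# ids rather than free-form geometry. All names here are illustrative,
# not the actual Lemming code.

TILE_SIZE = 32  # pixels per tile (assumed)

class TileLevel:
    def __init__(self, width, height, tiles):
        assert len(tiles) == width * height
        self.width = width
        self.height = height
        self.tiles = tiles            # flat, row-major list of tile ids

    def tile_at(self, tile_x, tile_y):
        return self.tiles[tile_y * self.width + tile_x]

    def tile_at_pixel(self, px, py):
        # integer division maps any pixel to its containing tile
        return self.tile_at(px // TILE_SIZE, py // TILE_SIZE)

# a 4x2 level: 0 = air, 1 = spike, 2 = ground
level = TileLevel(4, 2, [0, 0, 1, 0,
                         2, 2, 2, 2])
```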
<p>
I got hung up for a while on pyglet's inverted Y axis but I'm getting used to it now.
</p>
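<p>
(pyglet, following OpenGL, puts the origin at the window's bottom-left with Y increasing upward; converting a top-left-origin Y from a level editor or image format is one subtraction. A sketch with illustrative names:)
</p>

```python
# pyglet (following OpenGL) puts the origin at the window's bottom-left
# with Y increasing upward. Converting a top-left-origin Y (common in
# level editors and image formats) is one subtraction. Names illustrative.

def to_pyglet_y(top_y, item_height, window_height):
    """Bottom edge of an item in bottom-left-origin coordinates."""
    return window_height - top_y - item_height

# a 32px-tall sprite at the very top of a 480px window:
y = to_pyglet_y(0, 32, 480)   # its bottom edge sits at y = 448
```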
<p>
Having some trouble keeping the frame rate up. I have it up to about 24 but that's not even with a complete level. I'm hoping I won't have to mess with OpenGL directly, as I always encounter bugs that take hours upon hours to find when I do. I'm using pyglet's batch sprite API currently.
</p>
<p>
Next up on the TODO list is to create more interesting tile types and implement the physics for them. Then maybe get a parallax background going (the hard part is finding/creating the art). I think I'll save animation for later. I'd rather focus on my strengths (programming) and get some good gameplay going; then if I have time I can make the animation happen. Really wishing I had an artist on board at this point.
</p>
<p>
Well, time for an 8-hour nap and then it's back to work!
</p>
<h2 id="pyglet-bug"><a href="#pyglet-bug">pyglet bug</a></h2>
<p>I ran into a
<a href="http://code.google.com/p/pyglet/issues/detail?id=411">rather unfortunate pyglet bug</a> regarding animation.
</p>
<p>
This is going to cost me an hour or two of dev time to use sprite sheets instead of .gifs :-(
</p>
<p>
In other news, new screenshot:
</p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-screenshot-2.png">
<p>
Also, when I tried to upload my character art (<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-char-art.png">) to Facebook, I got this:
</p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-character-art-facebook.png">
<p>LOL!</p>
<h2 id="day-2"><a href="#day-2">End of Day 2</a></h2>
<p>
Here I am at the end of day 2. According to git log, here's what I've done today:
</p>
<ul>
  <li>6:24pm: scrolling 2-layer background</li>
  <li>8:23pm: ability to detach the leader and give control to the next guy</li>
  <li>10:44pm: suicide bomb control works</li>
  <li>11:57pm: +1 dude powerup</li>
  <li>12:06am: +infinite dudes powerup</li>
  <li>12:21am: land mines</li>
  <li>12:35am: spikes</li>
  <li>9:20am: totally rewrite display engine and change level format</li>
  <li>9:38am: use pyglet resources rather than the skellington data.py module</li>
</ul>
<p>
Lines of code so far: 542. This is interesting because yesterday's LOC count was 612. After a full day's work and several new features, I ended up with less code. Three cheers for good design!
</p>
<p>
I'm now using
<a href="http://www.mapeditor.org/">Tiled</a>
for editing, which is way nicer than my own hacked up piece of crap level editor. I had fun deleting that.
</p>
<p>
Thanks to the display engine rewrite (I simply use glTranslatef instead of manually positioning every sprite), all my FPS worries are gone. I'm well above 60. However I have a new problem: weird line artifacts appearing on sprites. It seems almost as if the bottom row of pixels of every sprite is displaying on the top of the sprite instead. You can see it in the screenshot - there isn't supposed to be a flickering line on the spikes.
</p>
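<p>
A common cause of this kind of one-pixel flicker is drawing at fractional offsets: the GPU bleeds in the neighboring row of the texture atlas. One common fix (an assumption on my part here, not the confirmed cause) is to snap the camera translation to whole pixels before handing it to glTranslatef:
</p>

```python
# Flickering one-pixel lines on sprites are a classic symptom of drawing
# at fractional offsets: the GPU bleeds in the neighboring row of the
# texture atlas. A common fix (an assumption here, not the post's
# confirmed cause) is to snap the camera translation to whole pixels
# before handing it to glTranslatef.

def integer_camera_offset(cam_x, cam_y):
    """Negated, pixel-snapped translation for the camera position."""
    return -round(cam_x), -round(cam_y)

# usage (pyglet): gl.glTranslatef(*integer_camera_offset(cx, cy), 0)
offset = integer_camera_offset(103.7, 42.2)
```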

<p><a href="http://s3.amazonaws.com/superjoe/temp/pyweek-video-day-2.mpg">Video</a></p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-day-2-screenshot.png">

<p>
Next up on the TODO list: use the nifty level editor to create more tiles and a fun and legit Level 1. Then begin to do whatever game upgrades necessary to make Level 1 work. This may involve doing some (shudder) animations.
</p>

<p>
I should begin to think about music and sound effects too. I am considering making the background music myself. I have been known to make
<a href="http://theburningawesome.bandcamp.com/">some music</a>
in
<a href="http://www.image-line.com/documents/flstudio.html">FL Studio</a>.
I think it would be fun if some parts of the level pulsated to the beat of the music.
</p>
<h2 id="day-3"><a href="#day-3">End of Day 3</a></h2>
<p><a href="http://s3.amazonaws.com/superjoe/temp/pyweek-video-day-3.mpg">Video</a></p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-screenshot-day-3.png">

<p>Lines of code so far: 653</p>

<p>What I've done:</p>

<ul>
  <li>7:27pm: fixed the sub-pixel access issue with glTranslatef. thanks richard!</li>
  <li>12:13am: created animation assets for the main character</li>
  <li>1:05am: implement animation support in the game logic</li>
  <li>3:27am: smooth camera movement - follows the character at a damped acceleration curve</li>
  <li>6:50am: created level 1 in map editor with some unimplemented game ideas and tiles</li>
</ul>
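<p>
The damped camera in the 3:27am entry can be sketched as an exponential chase toward the target - this is a sketch of the technique, not the game's actual code:
</p>

```python
import math

# Frame-rate-independent damped camera follow -- a sketch of the
# "damped acceleration curve" mentioned above, not the game's actual code.

def damped_follow(cam, target, stiffness, dt):
    """Move cam toward target, closing the gap exponentially over time."""
    return target + (cam - target) * math.exp(-stiffness * dt)

cam = 0.0
for _ in range(60):                 # simulate one second at 60 fps
    cam = damped_follow(cam, 100.0, stiffness=5.0, dt=1 / 60)
# cam has closed most (but not all) of the gap to 100
```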
<p>Here is some new concept art:</p>
<img src="http://s3.amazonaws.com/superjoe/temp/winnar.png">
<img src="http://s3.amazonaws.com/superjoe/temp/sadface.png">
<p>Problem: I have a mid-term coming up in 7 hours, and I have not been to class or read the book since the last test. Also I worked through the night so I will be in sleepy-and-not-motivated mode all day while studying and test-taking. Bleh.</p>
<p>Overview of what my TODO list looks like right now:</p>
<ul>
  <li>try to not fail college</li>
  <li>implement in game logic all the stuff I just pretended would work when I made level 1</li>
  <li>make another level with even more new ideas and then implement them. repeat until satisfied.</li>
  <li>sound effects</li>
  <li>bg music</li>
  <li>HUD</li>
  <li>menu system / title screen</li>
  <li>death / game over</li>
  <li>test on windows and mac</li>
</ul>

<h2 id="day-4"><a href="#day-4">End of Day 4</a></h2>
<p>
Good news and bad news. The bad news is that I got at best 55% on my statistics test yesterday. The good news is that I have all kinds of new stuff done for PyWeek. I think I have my priorities in the right place :-)
</p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-day-4-screenshot.png">
<p><a href="http://s3.amazonaws.com/superjoe/temp/pyweek-day-4.mpg">Video #1</a></p>
<p><a href="http://s3.amazonaws.com/superjoe/temp/pyweek-day-4-#2.mpg">Video #2</a></p>

<ul>
  <li>3:20am: more advanced collision detection which makes walls work</li>
  <li>4:18am: ramps work</li>
  <li>4:33am: text boxes work. (that was dead simple, thanks to pyglet!)</li>
  <li>5:40am: belly flops work.</li>
  <li>6:16am: ladders work</li>
  <li>6:52am: decorations work; added house decoration</li>
  <li>7:18am: add ladder animation</li>
  <li>8:15am: add monster animations</li>
  <li>8:51am: added monster who throws you</li>
  <li>9:23am: explosions will break breakable stuff</li>
  <li>9:39am: you can explode monsters</li>
  <li>9:54am: add some tiles to make the level a little nicer</li>
  <li>11:33am: implement buttons and bridges. level 1 works completely</li>
  <li>12:21am: support reverse direction for monsters</li>
</ul>

<p>I talked to my friend
<a href="http://www.soundclick.com/bands/default.cfm?bandID=501445">Tyler</a>
and he wants to make the music for the game, if only he can whip his computing situation into shape in time. I'm crossing my fingers.
</p>

<p>I've formulated a plan for the rest of the days I have left:</p>

<p>Day 5:</p>
<ul>
  <li>sound effects</li>
  <li>HUD</li>
  <li>menu system and title screen</li>
  <li>game over</li>
</ul>
<p>Day 6:</p>
<ul>
<li>implement as many gameplay ideas as possible</li>
<li>create as many levels as possible exploring those concepts</li>
<li>bg music if it doesn't work out with tyler</li>
</ul>
<p>Day 7:</p>
<ul>
  <li>Dinner with family</li>
  <li>Party with friends</li>
  <li>Test on Mac and Windows and smooth out any glitches/bugs/problems with the experience.</li>
  <li>bug fixing only. no new features</li>
</ul>

<h2 id="day-5"><a href="#day-5">End of Day 5</a></h2>
<p>I'm really fighting sleep-deprivation right now. Going to crash like a rock as soon as I'm done with this diary entry. Like a rock? Is this how a person crashes? I can't remember. Anyways:</p>

<ul>
  <li>1:42pm (yes I cheated and worked more after yesterday's diary entry before sleeping): maps support background music and the game engine will stream, play, and loop the background music. I'm really liking how simple this stuff is in pyglet.</li>
  <li>2:11pm: test on Windows with py2exe. It worked after I added another entry to the resource search path. It's not safe to rely on the .py file location - with py2exe it moves. I better warn people who used the skellington.</li>
  <li>12:46am (after a nap and Poker Night at Tyler's across town): record, create, and find sound effects</li>
  <li>2:04am: create a short background music track</li>
  <li>3:58am: support sound effects in game</li>
  <li>4:13am: if you don't belly flop, you don't cover as many spikes up</li>
  <li>4:35am: a HUD which gives you hints as to what buttons you can press</li>
  <li>7:07am: slim down tiledtmxloader and make it use pyglet.resource</li>
  <li>8:57am: huge code refactor enabling me to switch scenes from gameplay to title screen or restart gameplay when you die. I took my own advice from
  <a href="http://stackoverflow.com/questions/5581447/switching-scenes-with-pyglet">Hugoagogo's stack overflow help post</a>
  and it worked pretty nicely. Hopefully there isn't a memory leak.</li>
  <li>9:47am: title screen works and has sound</li>
  <li>9:53am: FPS display is a command line argument, off by default</li>
  <li>10:10am: on game over, starts level over</li>
  <li>10:28am: if you beat the level you go on to the next one</li>
  <li>10:41am: automatically save progress so player can continue from where they left off.</li>
  <li>11:04am: credits scene when you win</li>
</ul>
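<p>
The scene-switching refactor in the 8:57am entry can be sketched as a minimal pattern: the window holds one current scene, and switching means fully tearing down the old scene before starting the new one. This is a pure-Python stand-in with hypothetical names, not the actual refactor:
</p>

```python
# Minimal scene-switching pattern (a pure-Python stand-in for the pyglet
# refactor described in the log; all names here are hypothetical).

log = []  # records lifecycle calls so the pattern is visible

class Scene:
    name = "scene"
    def start(self):
        # in pyglet this would push event handlers / schedule updates
        log.append(self.name + ":start")
    def stop(self):
        # and this would pop handlers / unschedule, so the old scene
        # can be garbage-collected (the memory-leak worry in the log)
        log.append(self.name + ":stop")

class Title(Scene):
    name = "title"

class Gameplay(Scene):
    name = "gameplay"

class Game:
    def __init__(self):
        self.scene = None
    def switch_to(self, scene):
        if self.scene is not None:
            self.scene.stop()   # fully tear down the old scene first
        self.scene = scene
        scene.start()

game = Game()
game.switch_to(Title())
game.switch_to(Gameplay())      # title stops before gameplay starts
```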
<p><a href="http://s3.amazonaws.com/superjoe/temp/pyweek-day-5.mpg">Video</a></p>
<img src="http://s3.amazonaws.com/superjoe/temp/pyweek-screenshot-day-5.png">
<p><a href="http://s3.amazonaws.com/superjoe/temp/silly.mp3">Music</a></p>

<p>I'm
<a href="http://www.pyweek.org/d/3787/#comment-8584">waiting on confirmation</a>
to make sure I don't have to swap out my explosion graphic. That would suck.</p>

<p>TODO:</p>

<p>Day 6:</p>
<ul>
  <li>implement as many gameplay ideas as possible</li>
  <li>create as many levels as possible exploring those concepts</li>
  <li>fix some bugs/nice-to-haves</li>
</ul>
<p>Day 7 (half-day):</p>
<ul>
  <li>test on windows and mac</li>
  <li>complete the README and make sure all files are in place on pyweek.org</li>
  <li>download the final submission and play through it on each operating system</li>
</ul>
<p>Bugs / nice to haves that I probably don't have time for:</p>
<ul>
  <li>credits song</li>
  <li>physics bugs</li>
  <li>the "beat level" jingle is obnoxious. re-do it</li>
  <li>don't call game over until belly flop is done. he might pick up a +1</li>
  <li>animate the title screen</li>
</ul>

<h2 id="day-6"><a href="#day-6">End of Day 6</a></h2>

<p>Lines of code: 1939</p>

<p>I'm not going to spoil the surprise by posting any more commit log, videos, pictures, or sound. You'll just have to play it to find out!</p>

<p>Things are progressing on track. I already have something playable and working right now, all that needs doing is some more levels and bug fixes.</p>

<p>Good news - Tyler managed to get his computer working and make an awesome track.</p>

<p>I really need sleep but I'm going to be out and about for another 10 hours. I can do this.</p>

<p>TODO:</p>
<p>Day 7 (half-day):</p>
<ul>
  <li>create at least 3 more levels</li>
  <li>critical bug: if you die several times, memory keeps expanding and sound stops working</li>
  <li>important bug: belly flop stuck in wall, and don't call it game over until all belly flops are turned to dead bodies</li>
  <li>test on windows and mac</li>
</ul>
<p>I'd really like to, but probably not going to get to these, unfortunately:</p>
<ul>
  <li>add a gory explosion of bloody organs when you die on spikes</li>
  <li>bad physics glitch when you jump against a vertical wall</li>
  <li>physics glitch with belly flops and spikes</li>
  <li>death shouldn't interrupt the music flow</li>
  <li>the sound that plays when you win is obnoxious and horrible</li>
</ul>

<h2 id="day-7"><a href="#day-7">End of Day 7</a></h2>
<p>Well, here I am. 32 minutes to go and I've submitted. That was a grueling but satisfying experience.</p>

<p>I tried to make as many levels as possible and ironically ended up with 9.</p>

<p>2021 lines of python.</p>

<p>Tyler's having fun with the level editor - I think we may continue to work on this past the competition.</p>

<p>Also I'm going to contribute to Tiled. Having relied on it heavily, I know how to make it better.</p>

<p>I also discovered a few pyglet bugs. I'll submit patches to fix them.</p>


<h2 id="postmortem"><a href="#postmortem">Postmortem</a></h2>
<p>A lot of people are getting stuck on Level 2.</p>

<p>And 3, and 4, and 5, and...</p>

<p>I realize now that I made the levels way too hard. I should have very slowly increased the difficulty.</p>

<p><strong>Each new level shows off more</strong> of the game's gameplay elements, so it would be a shame for you to give up and judge without at least checking out all the levels.</p>

<p>There is a file called "save_game" (created once you beat level 1) in the directory you run the game from; it holds the number of the level you start on when you press Continue from the title screen. You can edit it to skip around and see all 9 levels.</p>
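<p>
In other words, the save mechanism is roughly this (the format - a single level number as text - is assumed from the description above, not verified against the actual source):
</p>

```python
# Sketch of the save_game mechanism described above. The format is
# assumed from the post -- a single level number as text -- not verified
# against the actual source code.

import os
import tempfile

def save_progress(path, level):
    with open(path, "w") as f:
        f.write(str(level))

def load_progress(path):
    if not os.path.exists(path):
        return 1                      # no save file yet: start at level 1
    with open(path) as f:
        return int(f.read().strip())

save_dir = tempfile.mkdtemp()         # stand-in for the game's directory
path = os.path.join(save_dir, "save_game")
save_progress(path, 7)                # hand-edit equivalent: jump to level 7
```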

<p>Alternatively, there is a YouTube video of me speedrunning through the entire game (only 7 minutes long):</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/BOkXWp4OLsM" frameborder="0" allowfullscreen></iframe>
]]></description>
      </item>
      </channel>
</rss>
