<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Speculative Branches</title>
    <link>https://specbranch.com/</link>
    <description>Recent content on Speculative Branches</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <copyright>Nima Badizadegan</copyright>
    <lastBuildDate>Sun, 28 Nov 2021 23:00:22 +0000</lastBuildDate>
    <atom:link href="https://specbranch.com/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Exponentials in 3 Instructions</title>
      <link>https://specbranch.com/posts/fast-exp/</link>
      <pubDate>Sun, 04 May 2025 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/fast-exp/</guid>
      <description>
        
          
            &lt;p&gt;&lt;em&gt;This post expands on an algorithm shown in the &lt;a href=&#34;https://www.routledge.com/9781032933559&#34;&gt;book I wrote&lt;/a&gt;
on floating-point math.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;It is very common in computing to want to compute $e^x$ very quickly without caring much about
the accuracy of the result. This is increasingly true in ML and AI algorithms, which can
be very tolerant of noise from numerical error and often use low-precision formats anyway. It also
shows up in exponentially-weighted moving averages and other similar functions in
signal processing. Thankfully, you can replace $e^x$ in these situations with the following function:&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Perfect Random Floating-Point Numbers</title>
      <link>https://specbranch.com/posts/fp-rand/</link>
      <pubDate>Sat, 03 May 2025 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/fp-rand/</guid>
      <description>
        
          
            &lt;p&gt;When I recently looked at the state of the art in floating-point random number generation,
I was surprised to see a common procedure in many programming languages and libraries that
is not really a floating-point algorithm:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Generate a random integer with bits chosen based on the precision of the format.&lt;/li&gt;
&lt;li&gt;Convert to floating point.&lt;/li&gt;
&lt;li&gt;Divide to produce an output between 0 and 1.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In code, this looks like:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-go&#34; data-lang=&#34;go&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;ln&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;kd&#34;&gt;func&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nx&#34;&gt;r&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;o&#34;&gt;*&lt;/span&gt;&lt;span class=&#34;nx&#34;&gt;Rand&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;Float64&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;()&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;kt&#34;&gt;float64&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;p&#34;&gt;{&lt;/span&gt;&lt;span class=&#34;w&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;ln&#34;&gt;2&lt;/span&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;w&#34;&gt;    &lt;/span&gt;&lt;span class=&#34;nx&#34;&gt;rand_int&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;o&#34;&gt;:=&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;nx&#34;&gt;r&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;.&lt;/span&gt;&lt;span class=&#34;nf&#34;&gt;Int63n&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;53&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;span class=&#34;w&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;ln&#34;&gt;3&lt;/span&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;w&#34;&gt;    &lt;/span&gt;&lt;span class=&#34;k&#34;&gt;return&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;nb&#34;&gt;float64&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;nx&#34;&gt;rand_int&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;o&#34;&gt;/&lt;/span&gt;&lt;span class=&#34;w&#34;&gt; &lt;/span&gt;&lt;span class=&#34;p&#34;&gt;(&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;1&lt;/span&gt;&lt;span class=&#34;o&#34;&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class=&#34;mi&#34;&gt;53&lt;/span&gt;&lt;span class=&#34;p&#34;&gt;)&lt;/span&gt;&lt;span class=&#34;w&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;ln&#34;&gt;4&lt;/span&gt;&lt;span class=&#34;cl&#34;&gt;&lt;span class=&#34;p&#34;&gt;}&lt;/span&gt;&lt;span class=&#34;w&#34;&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This function is supposed to produce floating-point numbers drawn from a uniform distribution
in the interval $[0, 1)$. Zero is a possible output, but one is not, and the distribution is
uniform. The number &amp;quot;53&amp;quot; in the algorithm above is chosen in a way that is floating-point aware:
double-precision floating-point numbers have 53 bits of precision, so this algorithm generates
exactly as many random bits as the format has bits of precision. It seems to fit the bill.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>A Cryptographically Secret Santa</title>
      <link>https://specbranch.com/posts/cryptographic-santa/</link>
      <pubDate>Wed, 25 Dec 2024 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/cryptographic-santa/</guid>
      <description>
        
          
            &lt;p&gt;&#39;Twas about 4-6 weeks before Christmas, and all through the math department,
not a creature was stirring, not even a plucky young undergrad. Cryptography
professors Alice and Bob sat at the elliptically-curved conference table to plan
the department&#39;s secret Santa. Mallory, the department secretary, had been given
the task of organizing last year, and somehow managed to get three gifts while
leaving several people disappointed. This year&#39;s math department thus resolved
to do their secret Santa without a trusted party.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Time Programming for Lawyers and Jurors</title>
      <link>https://specbranch.com/posts/time-for-jurors/</link>
      <pubDate>Wed, 26 Jun 2024 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/time-for-jurors/</guid>
      <description>
        
          
            &lt;p&gt;I often like to have streams of trials playing in the background when I am working, and
the recent trial of interest was the trial of Karen Read, who was accused of murder.
One issue in this case was a conflict between two timestamps logged by different apps
on a potential suspect&#39;s phone.  Apple Health had logged the person climbing flights of
stairs at a given time, and at the same timestamp, a log from the app Waze indicated
that the person was driving.  This seems impossible, but these apps used different
time sources on the phone.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Five Nine Problems</title>
      <link>https://specbranch.com/posts/five-nines/</link>
      <pubDate>Thu, 13 Jun 2024 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/five-nines/</guid>
      <description>
        
          
            &lt;p&gt;A guilty pleasure of mine is the pursuit of perfection.  It is certainly a vice in most
contexts, but there are some problems whose solutions demand a measure of perfection.
These are problems that I will refer to as &amp;quot;5-9 problems&amp;quot;: problems whose solutions need
five 9&#39;s (or more) in some dimension.  Usually, those nines are correctness of some kind,
but they can also be availability or, for some systems, speed.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>The Computer Architecture of AI (in 2024)</title>
      <link>https://specbranch.com/posts/ai-infra/</link>
      <pubDate>Sat, 10 Feb 2024 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/ai-infra/</guid>
      <description>
        
          
            &lt;p&gt;Over the last year, as a person with a hardware background, I have heard a lot of complaints about Nvidia&#39;s dominance
of the machine learning market and whether I can build chips to make the situation better.  While the amount of money
I would expect it to take is less than
&lt;a href=&#34;https://www.tomshardware.com/tech-industry/artificial-intelligence/openai-ceo-sam-altman-seeks-dollar5-to-dollar7-trillion-to-build-a-network-of-fabs-for-ai-chips&#34;&gt;$7 trillion&lt;/a&gt;,
hardware accelerating this wave of AI will be a very tough problem--much tougher than the last wave focused on CNNs--and
there is a good reason that Nvidia has become the leader in this field with few competitors.  While the
inference of CNNs used to be a math problem, the inference of large language models has actually become a computer
architecture problem involving figuring out how to coordinate memory and I/O with compute to get the best performance
out of the system.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>The Knight Capital Disaster</title>
      <link>https://specbranch.com/posts/knight-capital/</link>
      <pubDate>Wed, 22 Nov 2023 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/knight-capital/</guid>
      <description>
        
          
            &lt;p&gt;&lt;em&gt;This account comes from several publicly available sources as well as accounts from insiders who worked at
Knight Capital Group at the time of the issue. I am telling it second- or third-hand.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;On August 1, 2012, Knight Capital fell on its sword. It experienced a software glitch that literally bankrupted the
company.  Between 9:30 am and 10:15 am ET, the employees of Knight Capital watched in disbelief and scrambled to
figure out what went wrong as the company acquired massive long and short positions, largely concentrated in 154
stocks, totaling 397 million shares and $7.65 billion.  At 10:15, the kill switch was flipped, stopping the
company&#39;s trading operations for the day.  By early afternoon, many of Knight Capital&#39;s employees had already
sent out resumes, expecting to be unemployed by the end of the week.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Abstraction is Expensive</title>
      <link>https://specbranch.com/posts/expensive-abstraction/</link>
      <pubDate>Wed, 07 Dec 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/expensive-abstraction/</guid>
      <description>
        
          
            &lt;p&gt;As you build a computer system, little things start to show up: maybe that database query is awkward
for the feature you are building, or you find your server getting bogged down transferring gigabytes
of data in hexadecimal ASCII, or your app translates itself to Japanese on the fly for hundreds of
thousands of separate users.  These are places where your abstractions are misaligned: your app
would be quantitatively better if it had a better DB schema, a way to transfer binary data, or
native internationalization for your Japanese users.  Each of these misalignments carries a cost.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Contemplating Randomness</title>
      <link>https://specbranch.com/posts/random-nums/</link>
      <pubDate>Thu, 27 Oct 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/random-nums/</guid>
      <description>
        
          
            &lt;p&gt;I have recently been immersed in the theory and practice of random number generation while working
on &lt;a href=&#34;https://arbitrand.com&#34;&gt;Arbitrand&lt;/a&gt;, a new high-quality true random number generation service
hosted in AWS.  Because of that, I am starting a sequence of blog posts on randomness and
random number generators.  This post is the first of the sequence, and focuses on what random
number generators are and how to test them.&lt;/p&gt;
&lt;p&gt;Formally, random number generators are systems that produce a stream of bits (or numbers) with
two properties:&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Introduction to Micro-Optimization</title>
      <link>https://specbranch.com/posts/intro-to-micro-optimization/</link>
      <pubDate>Sun, 11 Sep 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/intro-to-micro-optimization/</guid>
      <description>
        
          
            &lt;p&gt;A modern CPU is an incredible machine.  It can execute many instructions at the same time, it can
re-order instructions to ensure that memory accesses and dependency chains don&#39;t impact performance
too much, it contains hundreds of registers, and it has huge areas of silicon devoted to predicting
which branches your code will take.  However, if you have a tight loop and you are interested in
optimizing the hell out of it, the same mechanisms that make your code run fast can make your job
very difficult.  They add a lot of complexity that can make it hard to figure out how to optimize a
function, and they can also create local optima that trap you into a less efficient solution.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Rest in Peace, Optane</title>
      <link>https://specbranch.com/posts/rip-optane/</link>
      <pubDate>Fri, 12 Aug 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/rip-optane/</guid>
      <description>
        
          
            &lt;p&gt;Intel&#39;s Optane memory modules launched with a lot of fanfare in 2015, and were recently
discontinued, in 2022, with similar fanfare.  It was a sad day for me, a lover of
abstraction-breaking technologies, but it was foreseeable and understandable.&lt;/p&gt;
&lt;p&gt;At the time of Optane&#39;s launch, a lot of us were excited about the idea of having a new storage
tier, sitting between DRAM and flash.  It was announced as having DRAM endurance and speed with
the persistence and size of flash.  It was a futuristic memory technology, but the technology of
the future met the full force of Wright&#39;s Law.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Use One Big Server</title>
      <link>https://specbranch.com/posts/one-big-server/</link>
      <pubDate>Wed, 27 Jul 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/one-big-server/</guid>
      <description>
        
          
            &lt;p&gt;A lot of ink is spent on the &amp;quot;monoliths vs. microservices&amp;quot; debate, but the real issue behind
this debate is whether distributed system architecture is worth the developer time and
cost overheads.  By thinking about the real operational considerations of our systems, we can
get some insight into whether we actually need distributed systems for most things.&lt;/p&gt;
&lt;p&gt;We have all gotten so familiar with virtualization and abstractions between our software
and the servers that run it.  These days, &amp;quot;serverless&amp;quot; computing is all the rage, and even
&amp;quot;bare metal&amp;quot; is a class of virtual machine.  However, every piece of software runs on a
server.  Since we now live in a world of virtualization, most of these servers are a lot
bigger and a lot cheaper than we actually think.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>The Most Useful Statistical Test You Didn&#39;t Learn in School</title>
      <link>https://specbranch.com/posts/kolmogorov-smirnov/</link>
      <pubDate>Mon, 04 Jul 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/kolmogorov-smirnov/</guid>
      <description>
        
          
            &lt;p&gt;In performance work, you will often find many distributions that are weirdly shaped: fat-tailed
distributions, distributions with a hard lower bound at a non-zero number, and distributions
that are just plain odd.  Particularly when you look at latency distributions, it is extremely
common for the 99th percentile to be a lot further from the mean than the 1st percentile.  These
sorts of asymmetric fat-tailed distributions come with the business.&lt;/p&gt;
&lt;p&gt;Oftentimes, when performance engineers need to be scientific about their work, they will take
samples of these distributions, and put them into a $t$-test to get a $p$-value for the
significance of their improvements.  That is what you learned in a basic statistics or lab science
class, so why not?  Unfortunately, the world of computers is more complicated than the beer
quality experiments for which the $t$-test was invented, and violates one of its core assumptions:
that the sample means are normally distributed.  When you have a lot of samples, this can hold,
but it often doesn&#39;t.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>What Happened with FPGA Acceleration?</title>
      <link>https://specbranch.com/posts/fpgas-what-happened/</link>
      <pubDate>Wed, 01 Jun 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/fpgas-what-happened/</guid>
      <description>
        
          
            &lt;p&gt;In 2018, I took the jump from being primarily an FPGA hardware engineer to being primarily a software
engineer.  At the time, things were looking great for FPGA acceleration, with AWS and later Azure
bringing in VMs with FPGAs and the two big FPGA vendors setting their sights on application
acceleration. Almost 5 years later, I am working on another project with FPGAs, this time a
cloud-oriented one. That has inspired me to write a retrospective on the last 5 years of what we
thought would be an FPGA acceleration boom.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Teach Your Kids Bridge</title>
      <link>https://specbranch.com/posts/teach-bridge/</link>
      <pubDate>Sat, 21 May 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/teach-bridge/</guid>
      <description>
        
          
            &lt;p&gt;A post recently made the rounds on &lt;a href=&#34;https://news.ycombinator.com/news&#34;&gt;Hacker News&lt;/a&gt; claiming that
&lt;a href=&#34;https://momentofdeep.substack.com/p/teach-your-kids-poker-not-chess?s=r&#34;&gt;you should teach your kids poker, not chess&lt;/a&gt;.
The comments on that post go through a lot of the reasons why poker is a bad game to teach your
children, but I felt that I was well suited to opine on this topic, and explain why duplicate
bridge is the best game for practicing the life skills involved in business and programming,
compared to all of the alternatives.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Fixed Point Arithmetic</title>
      <link>https://specbranch.com/posts/fixed-point/</link>
      <pubDate>Wed, 18 May 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/fixed-point/</guid>
      <description>
        
          
            &lt;p&gt;When we think of how to represent fractional numbers in code, we reach for &lt;code&gt;double&lt;/code&gt; and &lt;code&gt;float&lt;/code&gt;,
and almost never reach for anything else.  There are several alternatives, including
&lt;a href=&#34;https://dl.acm.org/doi/pdf/10.1145/2911981&#34;&gt;constructive real numbers&lt;/a&gt; that are used in calculators,
and &lt;a href=&#34;https://docs.python.org/3/library/fractions.html&#34;&gt;rational numbers&lt;/a&gt;.  One alternative predates
all of these, including floating point, and actually allows you to compute faster than when you
use floating point numbers.  That alternative is fixed point: a primitive form of decimal that does
not offer any of the conveniences of &lt;code&gt;float&lt;/code&gt;, but allows you to do decimal computations more quickly
and efficiently. Fixed point is still used in some situations today, and it can be a potent tool
in your arsenal as a programmer if you find yourself working with math at high speed.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>You (Probably) Shouldn&#39;t use a Lookup Table</title>
      <link>https://specbranch.com/posts/lookup-tables/</link>
      <pubDate>Wed, 04 May 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/lookup-tables/</guid>
      <description>
        
          
            &lt;p&gt;I have been working on another post recently, also related to division, but I wanted to address
a comment I got from several people on the previous division article.  This comment invariably
follows a lot of articles on using math to do things with &lt;code&gt;chars&lt;/code&gt; and &lt;code&gt;shorts&lt;/code&gt;.  It is: &amp;quot;why
are you doing all of this when you can just use a lookup table?&amp;quot;&lt;/p&gt;
&lt;p&gt;Even worse, a stubborn and clever commenter may show you a benchmark where your carefully-crafted
algorithm performs worse than their hamfisted lookup table.  Surely you have made a mistake and
you should just use a lookup table.  Just look at the benchmark!&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Who Controls a DAO?</title>
      <link>https://specbranch.com/posts/who-controls-a-dao/</link>
      <pubDate>Fri, 01 Apr 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/who-controls-a-dao/</guid>
      <description>
        
          
            &lt;p&gt;In honor of April Fools&#39; Day, I decided to write about a blockchain topic. The crypto economy is in
the process of speedrunning its way from zero to a modern economy, and when you move that fast,
a few things have to break along the way.  One of those things is corporate governance.&lt;/p&gt;
&lt;p&gt;Matt Levine&#39;s &lt;a href=&#34;https://www.bloomberg.com/opinion/authors/ARbTQlRLRjE/matthew-s-levine&#34;&gt;&amp;quot;Money Stuff&amp;quot;&lt;/a&gt;
is a financial newsletter that I can&#39;t recommend enough.  If you are at all interested in finance,
stocks, and markets, it is a funny and informative read.  One of the recurring topics of Money Stuff
is &amp;quot;who controls a company?&amp;quot;  Quoting a bit of the
&lt;a href=&#34;https://www.bloomberg.com/opinion/articles/2018-07-24/papa-john-s-poison-pilled-papa-john?sref=1kJVNqnU&#34;&gt;newsletter&lt;/a&gt;:&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Python is Like Assembly</title>
      <link>https://specbranch.com/posts/python-and-asm/</link>
      <pubDate>Sun, 06 Mar 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/python-and-asm/</guid>
      <description>
        
          
            &lt;p&gt;Python and Assembly have one thing in common: if you are a professional software engineer, they are both
languages that you should probably know how to read, but be terrified to write. These languages seem
to be (and are) at opposite ends of the spectrum: One is almost machine code, and the other is almost a
scripting language. One is beginner-friendly and the other is seen as hostile to experts. One is
viciously versatile with tons of libraries and ports, and the other is ridiculously limited in its
capabilities. However, when you are creating production software, both are the wrong tool for the
job.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Racing the Hardware: 8-bit Division</title>
      <link>https://specbranch.com/posts/faster-div8/</link>
      <pubDate>Tue, 22 Feb 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/faster-div8/</guid>
      <description>
        
          
            &lt;p&gt;Occasionally, I like to peruse &lt;a href=&#34;https://uops.info&#34;&gt;uops.info&lt;/a&gt;.  It is a great resource for micro-optimization:
it benchmarks every x86 instruction on every architecture and compiles the results.  Every time I look at this table,
there is one thing that sticks out to me: the &lt;code&gt;DIV&lt;/code&gt; instruction. On a Coffee Lake CPU, an 8-bit &lt;code&gt;DIV&lt;/code&gt; takes
a long time: 25 cycles.  Cannon Lake and Ice Lake do a lot better, and so does AMD. We know that divider
design differs between architectures, and aggregating all of the performance numbers for an
8-bit &lt;code&gt;DIV&lt;/code&gt;, we see:&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>The Meaning of Speed</title>
      <link>https://specbranch.com/posts/performance-dimensions/</link>
      <pubDate>Sun, 13 Feb 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/performance-dimensions/</guid>
      <description>
        
          
            &lt;p&gt;A lot of the time, when engineers think of performance work, we think about looking at benchmarks and
making the numbers smaller.  We anticipate that we are benchmarking the right pieces of code, and we take
it for granted that reducing some of those numbers is a benefit, but also &amp;quot;the root of all evil&amp;quot; if done
prematurely.  If you are a performance-focused software engineer, or you are working with performance
engineers, it can help to understand the value proposition of performance and when to work on it.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Performance Numbers Worth Knowing</title>
      <link>https://specbranch.com/posts/common-perf-numbers/</link>
      <pubDate>Mon, 31 Jan 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/common-perf-numbers/</guid>
      <description>
        
          
            &lt;p&gt;When you design software to achieve a particular level of performance, it can be a good idea to be familiar with
the general speed regimes you are working with: fundamental limitations like storage devices and networks can drive
software architecture. Here is a set of common benchmark numbers that can help you anchor performance conversations
and think about the components that your software will interact with. As with all guidelines, these numbers are all
slightly wrong, but still useful.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Constant-time Fibonacci</title>
      <link>https://specbranch.com/posts/const-fib/</link>
      <pubDate>Sat, 22 Jan 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/const-fib/</guid>
      <description>
        
          
            &lt;p&gt;This is the second part in a 2-part series on the &amp;quot;Fibonacci&amp;quot; interview problem.
We are building off of a previous post, so &lt;a href=&#34;https://specbranch.com/posts/fibonacci/&#34;&gt;take a look at Part I&lt;/a&gt; if you haven&#39;t seen it.&lt;/p&gt;
&lt;p&gt;Previously, we examined the problem and constructed a logarithmic-time solution based on computing the power
of a matrix.  Now we will derive a constant-time solution using some more linear algebra. If you had
trouble with the linear algebra in Part I, it may help to read up on
&lt;a href=&#34;https://www.mathsisfun.com/algebra/matrix-introduction.html&#34;&gt;matrices&lt;/a&gt;,
&lt;a href=&#34;https://www.mathsisfun.com/algebra/matrix-multiplying.html&#34;&gt;matrix multiplication&lt;/a&gt;, and special matrix
operations (specifically &lt;a href=&#34;https://www.mathsisfun.com/algebra/matrix-determinant.html&#34;&gt;determinants&lt;/a&gt;
and &lt;a href=&#34;https://www.mathsisfun.com/algebra/matrix-inverse.html&#34;&gt;inverses&lt;/a&gt;) before moving on.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>Less-than-linear Fibonacci</title>
      <link>https://specbranch.com/posts/fibonacci/</link>
      <pubDate>Fri, 14 Jan 2022 00:00:00 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/fibonacci/</guid>
      <description>
        
          
            &lt;p&gt;Few interview problems are as notorious as the &amp;quot;Fibonacci&amp;quot; interview question. At first glance,
it seems good: Most people know something about the problem, and there are several
clever ways to achieve a linear time solution. Usually, in interviews, the linear time solution
is the expected solution. However, the Fibonacci problem is unique among interview problems in that
the expected solution is &lt;em&gt;not&lt;/em&gt; the optimal solution. There is an $O(1)$ solution, and to get there,
we need a little bit of linear algebra.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
    <item>
      <title>First Post</title>
      <link>https://specbranch.com/posts/first-post/</link>
      <pubDate>Sat, 27 Nov 2021 18:28:52 +0000</pubDate>
      
      <guid>https://specbranch.com/posts/first-post/</guid>
      <description>
        
          
            &lt;p&gt;Hello everyone, and welcome to my blog.&lt;/p&gt;
&lt;p&gt;I am an ex-Google senior engineer focused on systems programming and performance optimization. My
past experience includes hardware engineering, numerical analysis, and high-frequency trading.&lt;/p&gt;
&lt;p&gt;Here, we will be talking about software engineering, performance, computer systems and foundations,
interesting math concepts, electrical engineering, hardware acceleration, FPGAs, and more. We may
also branch out to non-technical topics including companies, organizational psychology, and the
stock market.&lt;/p&gt;
          
          
        
      </description>
    </item>
    
  </channel>
</rss>
