Numerical Libraries and Functions

Exponentials in 3 Instructions

May 4, 2025 · 10 min read · Mathematical Algorithms, Performance

This post expands on an algorithm shown in the book I wrote on floating-point math. It is very common in computing to want to compute $e^x$ very quickly without caring much about how accurate the result is. This is increasingly true in ML and AI algorithms, which can be very tolerant of noise from numerical error and …

Perfect Random Floating-Point Numbers

May 3, 2025 · 16 min read · Mathematical Algorithms

When I recently looked at the state of the art in floating-point random number generation, I was surprised to see a common procedure in many programming languages and libraries that is not really a floating-point algorithm: Generate a random integer with bits chosen based on the precision of the format. Convert to …
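
The excerpt is cut off, so the following is only a sketch of one widely used form of this kind of procedure, not necessarily the exact one the post goes on to describe. It assumes IEEE 754 double precision (53 significand bits); the names `PRECISION_BITS` and `uniform_from_integer` are made up for illustration.

```python
import random

# Assumption: IEEE 754 double precision, which has a 53-bit significand.
PRECISION_BITS = 53


def uniform_from_integer(rng: random.Random) -> float:
    """Return a float in [0, 1) built from a PRECISION_BITS-bit random integer."""
    # Step 1: random integer with as many bits as the format's precision.
    k = rng.getrandbits(PRECISION_BITS)  # integer in [0, 2**53)
    # Step 2: convert to float and scale by a power of two.
    # Both operations are exact because k fits in the significand.
    return k * 2.0 ** -PRECISION_BITS


rng = random.Random(12345)
samples = [uniform_from_integer(rng) for _ in range(5)]
```

Every value produced this way is a multiple of $2^{-53}$, so the results sit on an evenly spaced grid and most of the representable floats close to zero can never appear.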

The Most Useful Statistical Test You Didn't Learn in School

Jul 4, 2022 · 9 min read · Performance, Mathematical Algorithms

In performance work, you will often find many distributions that are weirdly shaped: fat-tailed distributions, distributions with a hard lower bound at a non-zero number, and distributions that are just plain odd. Particularly when you look at latency distributions, it is extremely common for the 99th percentile to be …

Fixed Point Arithmetic

May 18, 2022 · 12 min read · Mathematical Algorithms, Performance

When we think of how to represent fractional numbers in code, we reach for double and float, and almost never reach for anything else. There are several alternatives, including constructive real numbers that are used in calculators, and rational numbers. One alternative predates all of these, including floating point, …

Racing the Hardware: 8-bit Division

Feb 22, 2022 · 19 min read · Mathematical Algorithms, Division, Performance

Occasionally, I like to peruse uops.info. It is a great resource for micro-optimization: benchmark every x86 instruction on every architecture, and compile the results. Every time I look at this table, there is one thing that sticks out to me: the DIV instruction. On a Coffee Lake CPU, an 8-bit DIV takes a long time: …