Will Murphy's personal home page

Primeversary Emailer Part 3: Lambda

Last time we looked at an AWS Step Functions state machine that provides the flow-charty part of the logic for an app that emails me and my (future) wife when we’ve been married a prime number of days. Today we’re going to look at the lambda that actually runs this calculation.

I’m building this on AWS, and “bit of code that runs occasionally but is idle the rest of the time” is a pretty good use case for AWS Lambda, so I decided to build a Lambda. I had a few considerations:

  1. My code runs once a day, so my lambda will always cold start
  2. Factoring prime numbers and similar maths are potentially slow

I had read a post showing that Rust has excellent performance within AWS Lambda, and Rust is always a language I wish I used more, so I decided to write the whole thing in Rust.

I also realized that answers to questions like “is 79 prime” are just facts about the universe, and there’s no need for my poor code to figure them out at runtime. I can just pre-compute a bunch of stuff at build time and not pay AWS to re-derive mathematical facts that I can learn for free.

So here’s how to frame the problem. Let’s assume (somewhat optimistically) that we will live 200 years after we get married, so valid numbers of days are between 0 and 73000 (that is, 200 * 365). 73K seems like a very manageable number for modern computers.

So now my program has two steps:

  1. Generate a list of primes less thank 73K
  2. Check if a number is on the list

For these problems, two approaches seem pretty straightforward:

  1. Sieve of Eratosthenes
  2. Binary Search

Binary Search is a common enough algorithm that it’s just in the Rust standard library, so we don’t even have to write it, which is nice. Eratosthenes’s sieve is, as far as I know, not in any standard libraries, so we’ll code that one up.

At first, I thought that, because Rust macro expansion runs at build time, I could implement code in Rust macros that runs Eratosthenes’s sieve. My first attempt looked like this:

const MAX_DAYS: usize = 73000;

macro_rules! low_primes {
    () => {{
        let mut sieve = vec![true; MAX_DAYS + 1];
        let mut result: Vec<usize> = Vec::new();
        sieve[0] = false;
        sieve[1] = false;
        for n in 2..MAX_DAYS {
            if sieve[n] {
                result.push(n);
                for multiple in (n..MAX_DAYS).step_by(n) {
                    sieve[multiple] = false;
                }
            }
        }
        result
    }};
}

Then I thought I could just have low_primes()!.binary_search(candidate), but I was wrong. That macro expands at build time into code that runs the sieve at runtime. That doesn’t really help the problem I’m trying to solve.

It turns out that the Rust feature I was looking for wasn’t macros at all, but was build.rs. A build.rs file defines a Rust program that gets built and runs before cargo begins compilation of the rest of the project. People use this for things like compiling C code that the rust needs to link to, but you can also implement code generation in it, and it solves the “emit a list of primary nubmers <=73000 at build time” problem quite nicely. Here’s what I ended up with:

// build.rs
fn main() {
    let out_dir = env::var_os("OUT_DIR").unwrap();
    let dest_path = Path::new(&out_dir).join("hello.rs");
    let mut w = File::create(dest_path).unwrap();
    let lp = low_primes!(); // same macro as above
    write!(
        &mut w,
        "
const LOW_PRIMES : [u64; {}] = {:?};
fn is_prime(candidate: u64) -> bool {{
    LOW_PRIMES.binary_search(&candidate).is_ok()
}}
        ",
        lp.len(),
        lp,
    )
    .unwrap();
    println!("cargo:rerun-if-changed=build.rs");
}

Before cargo starts compiling my main project, it compiles and runs a program that (1) generates a list of primes <=73000, and then just prints them into the source code file. The generated file will look like this:

// generated
const LOW_PRIMES : [u64; /*length of omitted array literal */] = /*omitted because super long */;
fn is_prime(candidate: u64) -> bool {{
    LOW_PRIMES.binary_search(&candidate).is_ok()
}}

And then is_prime is available to my program at runtime. This means that the list of all the primes I care about is just present in the executable, and all that happens at runtime is that the program checks how many days it’s been, and then checks whether that number is on the list. The worst case number of steps for this program is the log base 2 of 73_000, rounded up. Let’s check real quick in python (my current favorite calculator):

>>> math.ceil(math.log2(73_000.0))
17

17 integer comparisons seems like a pretty acceptable runtime cost for our problem. Let’s see how it runs in Lambda:

I’m willing to declare victory on making that function frugal, since it runs in less time than the minimum billing granularity.

I also just noticed a benefit of the state machine architecture I discussed last time: The lambda doesn’t do anything customer visible (like email my wife), so I can just run it whenever I want to make sure it’s working. That’s a big advantage.

That’s all for today. Next time, we’ll talk about actually deploying and updating a lambda that’s in Rust.

Till then, happy learning!
– Will

Discussion

Love this post? Hate it? Is someone wrong on the internet? Is it me? Please feel free to @me on mastodon

I used to host comments on the blog, but have recently stopped. Previously posted comments will still appear, but for new ones please use the mastodon link above.