Thursday, November 28

Hyrumtoken: A Go package to encrypt pagination tokens

hyrumtoken

hyrumtoken is a Go package to encrypt pagination tokens, so that your API
clients can’t depend on their contents, ordering, or any other characteristics.

Installation

go get github.com/ssoready/hyrumtoken

Usage

hyrumtoken.Marshal and hyrumtoken.Unmarshal work like the equivalent json
functions, except they also take a key of type *[32]byte:

var key [32]byte = …

// create an encrypted pagination token
token, err := hyrumtoken.Marshal(&key, "any-json-encodable-data")

// parse an encrypted pagination token
var parsedToken string
err = hyrumtoken.Unmarshal(&key, token, &parsedToken)

You can use any data type that works with json.Marshal as your pagination
token.
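
To make the idea concrete, here is a minimal sketch of what an encrypted token round trip could look like using only the Go standard library. It is not hyrumtoken's implementation: the helper names sealToken and openToken, and the choice of AES-256-GCM, are assumptions made purely for illustration.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key *[32]byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key[:])
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// sealToken is a hypothetical helper (not part of hyrumtoken). It
// JSON-encodes v, encrypts the result, and returns a URL-safe string.
func sealToken(key *[32]byte, v any) (string, error) {
	plaintext, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	// Prepend the random nonce so openToken can recover it later.
	sealed := gcm.Seal(nonce, nonce, plaintext, nil)
	return base64.RawURLEncoding.EncodeToString(sealed), nil
}

// openToken is the hypothetical inverse of sealToken. It fails if the
// token was tampered with or was sealed under a different key.
func openToken(key *[32]byte, token string, v any) error {
	sealed, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil {
		return err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return err
	}
	if len(sealed) < gcm.NonceSize() {
		return errors.New("token too short")
	}
	nonce, ciphertext := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	plaintext, err := gcm.Open(nil, nonce, ciphertext, nil)
	if err != nil {
		return err
	}
	return json.Unmarshal(plaintext, v)
}

func main() {
	var key [32]byte
	if _, err := rand.Read(key[:]); err != nil {
		panic(err)
	}
	token, err := sealToken(&key, map[string]int{"offset": 100})
	if err != nil {
		panic(err)
	}
	var cursor map[string]int
	if err := openToken(&key, token, &cursor); err != nil {
		panic(err)
	}
	fmt.Println("opaque token:", token)
	fmt.Println("recovered offset:", cursor["offset"])
}
```

Because the nonce is random, the same cursor yields a different token on every call, so clients cannot compare, sort, or increment tokens.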

Motivation

Hyrum’s Law goes:

With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable
behaviors of your system will be depended on by somebody.

Pagination tokens are one of the most common ways this turns up. I’ll illustrate
with a story.

Getting stuck with LIMIT/OFFSET

I was implementing an audit logging feature. My job was the backend; some other
folks were doing the frontend. To get them going quickly, I gave them an API
documented like this:

To list audit log events, do GET /v1/events?pageToken=…. For the first
page, use an empty pageToken.

That will return {"events": […], "nextPageToken": "…", "totalCount": …}.
If nextPageToken is empty, you’ve hit the end of the list.

To keep things real simple, my unblock-the-frontend MVP used limit/offset
pagination. The page tokens were just the offset values. This wasn’t going to
work once we had filters/sorts/millions of events, but whatever! Just rendering
the audit log events was already a good chunk of work for the frontend folks,
and we wanted to work in parallel.
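
For readers who want to see the shape of that MVP, here is a rough sketch of offset-as-token pagination. The names listEvents, page, and pageSize are hypothetical, and the in-memory slice stands in for whatever query the real backend ran.

```go
package main

import (
	"fmt"
	"strconv"
)

// pageSize is an assumed page length of 100 events.
const pageSize = 100

// page is a hypothetical response shape for one page of results.
type page struct {
	Events        []string
	NextPageToken string
	TotalCount    int
}

// listEvents treats pageToken as a plain decimal offset into all.
// An empty token means "start from the beginning".
func listEvents(all []string, pageToken string) (page, error) {
	offset := 0
	if pageToken != "" {
		n, err := strconv.Atoi(pageToken)
		if err != nil || n < 0 {
			return page{}, fmt.Errorf("invalid pageToken %q", pageToken)
		}
		offset = n
	}
	if offset > len(all) {
		offset = len(all)
	}
	end := offset + pageSize
	if end > len(all) {
		end = len(all)
	}
	p := page{Events: all[offset:end], TotalCount: len(all)}
	// Leave NextPageToken empty on the last page.
	if end < len(all) {
		p.NextPageToken = strconv.Itoa(end)
	}
	return p, nil
}

func main() {
	all := make([]string, 250)
	for i := range all {
		all[i] = fmt.Sprintf("event-%d", i)
	}
	token := ""
	for {
		p, err := listEvents(all, token)
		if err != nil {
			panic(err)
		}
		fmt.Printf("token=%q events=%d next=%q\n", token, len(p.Events), p.NextPageToken)
		if p.NextPageToken == "" {
			break
		}
		token = p.NextPageToken
	}
}
```

Nothing in this sketch stops a client from inventing its own token, such as "200", and jumping straight to that offset.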

A week went by. The frontend folks came back with a UI that had a row of
clickable page numbers at the bottom.

Weird. The documented API doesn’t really promise any affordance of “seeking” to
a random page. So I asked: “If you’re on page 1 and you click on 3, what
happens?” The reply: “We just set the pageToken to 300”.

This happened because folks saw the initial real-world behavior of the API:

GET /v1/events
{"events": [… 100 events …], "nextPageToken": "100", "totalCount": "8927"}

GET /v1/events?pageToken=100
{"events": [… 100 events …], "nextPageToken": "200", "totalCount": "8927"}

And so it doesn’t matter what you document. People will guess what you meant, and
it really looks like you meant for pageToken to be an offset.

The fun part about this story is that I have in fact lied to you. We knew
keyset-based pagination was coming, and so we needed a way to encode potentially
URL-unsafe data in pageToken. Right from the get-go, we were base64-encoding
the token, so the actual requests looked like:

GET /v1/events
{"events": [… 100 events …], "nextPageToken": "MTAwCg==", "totalCount": "8927"}

GET /v1/events?pageToken=MTAwCg==
{"events": [… 100 events …], "nextPageToken": "MjAwCg==", …
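
Base64 is an encoding, not encryption, so tokens like the ones above are still readable by anyone who decodes them. A small sketch shows this (decodePageToken is a hypothetical helper name, not something from the API in the story):

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodePageToken reverses the base64 encoding and trims surrounding
// whitespace, such as a trailing newline, from the decoded text.
func decodePageToken(token string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(token)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(raw)), nil
}

func main() {
	for _, tok := range []string{"MTAwCg==", "MjAwCg=="} {
		offset, err := decodePageToken(tok)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", tok, offset)
	}
	// Prints:
	// MTAwCg== -> 100
	// MjAwCg== -> 200
}
```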