Building Aegis: A Simpler OAuth Provider

Most software begins life as an inconvenience. You reach for a thing the world ought to provide pre-assembled—a login screen, a session cookie, a polite little JWT signed in some respectable algorithm—and discover that the available options demand either a monthly tithe to a tenant in someone else’s cloud or the architectural ambition of erecting a colosseum. Aegis was born of that small irritation: the wish for an OpenID Connect provider one could host alongside one’s own applications, in a single repository, with the ergonomics of a modern codebase and the moral clarity of a system whose secrets never leave the premises.

What follows is a tour through how the thing was built, how it is written, and, most importantly, the architecture of distrust that allows it to be entrusted with anything at all.

The Spine: One Repo, Two Pavilions, No Surprises

The repository is organized as an npm workspace with two siblings under packages/. server is a TypeScript Express application; dashboard is a Vite-built React single-page app. There is, deliberately, nothing else: no microservices, no message bus, no Redis ferrying ephemera between processes. State lives in a single SQLite file accessed through better-sqlite3, the synchronous binding whose blunt honesty is, in this context, a virtue. Every database call is a function that either returns or throws, and the entire mental model of concurrency collapses into “the request is the transaction.”

Configuration is gathered at startup by a Zod schema in packages/server/src/config.ts. Every environment variable is parsed, coerced, and bounded; if a single one is malformed, the process refuses to start and prints the offense in red. This is the first of several places where the codebase performs what one might call honest paranoia: it would rather die at boot than run with ambiguous instructions. Secrets like ADMIN_SESSION_SECRET are required to be at least sixteen characters; cookie keys are split into rotation arrays; key paths are resolved relative to the binary, not to wherever the operator happens to be standing in their shell.
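
The fail-at-boot behavior can be sketched without any dependencies. What follows is a stand-in for illustration only: the source says the real config.ts uses a Zod schema, and here a hand-rolled check plays that role. PORT, its default, and the parseConfig name are assumptions; only ADMIN_SESSION_SECRET and its sixteen-character minimum come from the text above.

```typescript
// Dependency-free stand-in for the Zod schema described above. The real
// config.ts uses Zod; this only illustrates the refuse-to-start behaviour.
interface AppConfig {
  port: number;               // hypothetical variable, for illustration
  adminSessionSecret: string; // ADMIN_SESSION_SECRET, min 16 chars per the text
}

function parseConfig(env: Record<string, string | undefined>): AppConfig {
  const problems: string[] = [];

  // Coerce and bound a numeric variable (the default of 3000 is an assumption).
  const port = Number(env.PORT ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    problems.push("PORT must be an integer between 1 and 65535");
  }

  const secret = env.ADMIN_SESSION_SECRET ?? "";
  if (secret.length < 16) {
    problems.push("ADMIN_SESSION_SECRET must be at least 16 characters");
  }

  if (problems.length > 0) {
    // Report every offence at once, then refuse to continue.
    throw new Error("Invalid configuration:\n- " + problems.join("\n- "));
  }
  return { port, adminSessionSecret: secret };
}
```

A caller would invoke parseConfig(process.env) once at startup and let the thrown error end the process, so that no request is ever served under a half-valid configuration.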

The Provider: A Standards-Compliant Heart Wrapped in a Custom Adapter

At the cryptographic core sits oidc-provider, the unflappable Node implementation of the OpenID Connect specifications. Aegis configures it in packages/server/src/oidc/provider.ts, then teaches it to read and write through a SQLiteAdapter (in oidc/adapter.ts) and a client-adapter.ts that resolves OAuth clients out of the same database the rest of the application uses. The signing keys are generated once, on first boot, by crypto.generateKeyPairSync('rsa', { modulusLength: 2048 }); they are written as PKCS8 PEM files into data/keys/ with the directory created on demand, and from that moment forward every ID token the server emits is signed RS256 against a single kid: 'main-signing-key'. The public half is served at the standard JWKS endpoint, where any relying party that can reach the issuer can verify a token without ever phoning home.
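
The generate-once key handling might look roughly like the sketch below, which uses only Node's built-in crypto and fs modules. The private.pem filename and both function names are hypothetical; the RSA parameters, the PKCS8 PEM format, and the kid come from the description above.

```typescript
import { createPublicKey, generateKeyPairSync } from "node:crypto";
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { join } from "node:path";

const KID = "main-signing-key";

// Returns the PKCS8 PEM private key, generating it only if none exists yet.
function loadOrCreateSigningKey(keyDir: string): string {
  const keyPath = join(keyDir, "private.pem"); // hypothetical filename
  if (existsSync(keyPath)) return readFileSync(keyPath, "utf8");

  mkdirSync(keyDir, { recursive: true }); // create the directory on demand
  const { privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });
  const pem = privateKey.export({ type: "pkcs8", format: "pem" }) as string;
  writeFileSync(keyPath, pem, { mode: 0o600 }); // owner-only; an assumption
  return pem;
}

// Derives the public JWK that a JWKS endpoint would publish.
function publicJwk(privatePem: string) {
  const jwk = createPublicKey(privatePem).export({ format: "jwk" });
  return { ...jwk, kid: KID, alg: "RS256", use: "sig" };
}
```

Because createPublicKey accepts a private key and extracts only the public half, the published JWK contains the modulus and public exponent but none of the private parameters.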

Two small, hard-won modifications to the default policy reveal the texture of real-world OIDC work. The first removes the native_client_prompt check from the consent stage, a courtesy to first-party iOS and watchOS clients that should not be subjected to a redundant approval dialog and that, more practically, were having their flows mangled by Cloudflare. The second is a defensive Check added to the login prompt that fires when a session carries an accountId whose user has since been deleted or disabled. Without it, the consent stage would later attempt to dereference a phantom account and crash with a null-property TypeError. These are not abstract correctness wins; they are scars from production, preserved as code, with multi-line comments explaining precisely why the obvious shorter version is wrong. Anyone who has shipped an OIDC provider will recognize the genre.

The Services: A Catalogue of Single-Responsibility Functions

The server’s services/ directory is where the day-to-day labor happens, and it is written in a deliberately old-fashioned style: plain functions, exported by name, taking primitives and returning primitives, with the database fetched through a getDb() accessor at the top of each call. There are no classes pretending to be modules, no dependency-injection containers performing elaborate ceremonies. If a function needs the database, it asks for it. If it needs to return a user, it returns a user.

A few of the services deserve mention because they each encode a security decision worth describing:

user.service.ts hashes passwords with Argon2id at deliberately costly settings (memoryCost: 65536, timeCost: 3, parallelism: 4), chosen so that even a well-funded adversary attempting offline cracking against a stolen database file pays handsomely for every guess. Failed login attempts are accumulated against tiered thresholds (5 → 15 minutes, 10 → 1 hour, 15 → 24 hours), so the cost of online brute force escalates steeply rather than linearly. When a lockout fires, the user is notified by email, itself an event of forensic interest, and one that frequently surfaces compromised passwords before any breach can mature.
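
The tiered lockout reduces to a small lookup. In this sketch the thresholds and durations are the ones quoted above; the function name is hypothetical, and it assumes a lock applies at or beyond each threshold, which the text does not spell out.

```typescript
// Tiers from the text: 5 failures -> 15 min, 10 -> 1 hour, 15 -> 24 hours.
// Ordered from most to least severe so the first match wins.
const LOCKOUT_TIERS = [
  { failures: 15, minutes: 24 * 60 },
  { failures: 10, minutes: 60 },
  { failures: 5, minutes: 15 },
] as const;

// Returns the lockout duration in minutes for a running failure count,
// or 0 when no threshold has been reached.
function lockoutMinutes(failedAttempts: number): number {
  for (const tier of LOCKOUT_TIERS) {
    if (failedAttempts >= tier.failures) return tier.minutes;
  }
  return 0;
}
```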

session.service.ts never stores a session token in plaintext. The token issued in the cookie is a 48-character nanoid; what lives in the database is its SHA-256 hash. A stolen database thus yields only the digests of active sessions, useless for impersonation. Validation runs the cookie through the same hash and looks it up; lookups touch last_used_at so that idle sessions can be reaped without disturbing live ones.
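
The hash-before-store pattern is compact enough to show whole. This is a sketch under assumptions: an in-memory Map stands in for the SQLite table, crypto.randomBytes stands in for nanoid (36 random bytes happen to encode to 48 base64url characters), and the function names are hypothetical.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface SessionRow {
  userId: string;
  lastUsedAt: number; // epoch milliseconds
}

const hashToken = (token: string): string =>
  createHash("sha256").update(token).digest("hex");

// Issues a token for the cookie; only its SHA-256 digest is stored.
function issueSession(
  store: Map<string, SessionRow>,
  userId: string,
  now: number = Date.now(),
): string {
  const token = randomBytes(36).toString("base64url"); // stand-in for nanoid(48)
  store.set(hashToken(token), { userId, lastUsedAt: now });
  return token;
}

// Hashes the presented cookie, looks it up, and touches lastUsedAt on a hit.
function validateSession(
  store: Map<string, SessionRow>,
  cookieToken: string,
  now: number = Date.now(),
): string | null {
  const row = store.get(hashToken(cookieToken));
  if (!row) return null;
  row.lastUsedAt = now;
  return row.userId;
}
```

Since the token is long and random, a plain unsalted hash suffices here; the slow, salted treatment is reserved for passwords, which humans choose.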

totp.service.ts is where the encryption-at-rest story begins. TOTP shared secrets are generated by otpauth, encoded as base32 for the user’s authenticator app, and then, before being persisted, encrypted by utils/crypto.ts using AES-256-GCM with a key derived from ADMIN_SESSION_SECRET via SHA-256. Each ciphertext carries its own 12-byte IV and 16-byte auth tag; the format is iv || tag || ciphertext, base64-encoded, with a decryptIfEncrypted helper that detects and migrates legacy plaintext rows. Recovery codes get the more permanent treatment: they are SHA-256 hashed before storage, normalized first to remove dashes and whitespace, and verified by hashing the user’s input and comparing. The user sees their recovery codes exactly once, when they enroll. The database, even compromised, holds only their digests.
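
The envelope format described above can be sketched with Node's built-in crypto module. The function names are hypothetical and the legacy-row detection in decryptIfEncrypted is left out, since the text does not say how it works; the cipher, key derivation, field sizes, and iv || tag || ciphertext layout follow the description.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes } from "node:crypto";

const IV_LEN = 12;  // bytes, per the text
const TAG_LEN = 16; // bytes, per the text

// 32-byte AES key: the SHA-256 digest of the configured secret.
const deriveKey = (secret: string): Buffer =>
  createHash("sha256").update(secret).digest();

function encrypt(plaintext: string, secret: string): string {
  const iv = randomBytes(IV_LEN); // fresh IV for every ciphertext
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secret), iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

function decrypt(encoded: string, secret: string): string {
  const raw = Buffer.from(encoded, "base64");
  const iv = raw.subarray(0, IV_LEN);
  const tag = raw.subarray(IV_LEN, IV_LEN + TAG_LEN);
  const body = raw.subarray(IV_LEN + TAG_LEN);
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(secret), iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not authenticate the ciphertext.
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}

// Recovery codes: strip dashes and whitespace, then store only the digest.
function hashRecoveryCode(input: string): string {
  const normalized = input.replace(/[-\s]/g, "");
  return createHash("sha256").update(normalized).digest("hex");
}
```

The authentication tag is what makes GCM worth the trouble: a row altered on disk, or decrypted under the wrong secret, fails loudly at final() instead of yielding plausible garbage.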

audit.service.ts maintains an append-only chronicle of consequential actions, keyed against a union of string literal types: user.login, user.totp_enable, admin.client_secret_rotate, oauth.authorize, and several dozen more. The AuditAction type is enumerated in code, which means the TypeScript compiler refuses to let an unrecognized verb sneak into the log; misspellings become build errors rather than mislabeled entries that a future investigator would have to reconcile by hand. Every entry captures the actor, the resource, the IP, and the user agent, and every meaningful endpoint in the codebase emits one.
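
A sketch of how such a type closes the door on typos. Only the four action names listed above are included; the entry shape, the in-memory array, and the recordAudit name are assumptions made for illustration.

```typescript
// The four verbs named in the text; the real union is said to be much longer.
type AuditAction =
  | "user.login"
  | "user.totp_enable"
  | "admin.client_secret_rotate"
  | "oauth.authorize";

interface AuditEntry {
  action: AuditAction;
  actorId: string | null;
  resource: string | null;
  ip: string;
  userAgent: string;
  at: string; // ISO 8601 timestamp
}

// An in-memory array stands in for the append-only table.
const auditLog: AuditEntry[] = [];

function recordAudit(entry: Omit<AuditEntry, "at">): AuditEntry {
  const row = Object.freeze({ ...entry, at: new Date().toISOString() });
  auditLog.push(row);
  return row;
}

// recordAudit({ action: "user.logn", ... }) would be rejected at build time:
// '"user.logn"' is not assignable to type 'AuditAction'.
```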

analytics.service.ts dresses raw login events in geographic clothing. A self-hosted MaxMind GeoLite2 database (loaded only if GEOIP_DB_PATH is set; gracefully absent otherwise) translates IP addresses into country, city, and approximate coordinates; ua-parser-js extracts browser, OS, and device class from the user agent. Crucially, none of this telemetry leaves the box. There is no third-party analytics SDK, no pixel, no fetch to a beacon endpoint. The geographic lookup happens in-process against a flat file, and the resulting data is rolled into hourly and daily aggregate tables that the dashboard renders into the small constellations operators use to spot anomalies.

Defense in Depth: A Mille-Feuille of Refusals

Authentication endpoints are not protected by a single rate limit but by three. globalRateLimit caps any individual IP at the configured ceiling. authRateLimit keys on ${ip}:${email}, throttling repeated guesses against one account from one host. emailRateLimit keys on the email alone, which is the version that catches the harder case: a botnet rotating IPs against a single victim’s account. All three skip successful requests, because the goal is to punish guessing, not to penalize users who type their own passwords correctly. A fourth, strictRateLimit, is reserved for the endpoints that ought to be expensive and allows five attempts per hour per IP. This is honest paranoia at its most practical: each layer assumes the one beneath it has been bypassed.
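
What distinguishes the layers is the key each one counts against. The sketch below is a toy fixed-window limiter, not the middleware the project uses; the factory name, limits, and window are hypothetical, while the three key shapes and the failures-only counting follow the text.

```typescript
interface Attempt {
  ip: string;
  email: string;
}
type KeyFn = (a: Attempt) => string;

// The three key shapes described above.
const byIp: KeyFn = (a) => a.ip;
const byIpAndEmail: KeyFn = (a) => `${a.ip}:${a.email}`;
const byEmail: KeyFn = (a) => a.email;

// Toy fixed-window limiter that counts failures only, so successful
// logins never consume budget.
function createFailureLimiter(keyFn: KeyFn, max: number, windowMs: number) {
  const hits = new Map<string, { count: number; resetAt: number }>();
  return {
    isBlocked(a: Attempt, now: number = Date.now()): boolean {
      const entry = hits.get(keyFn(a));
      return !!entry && entry.resetAt > now && entry.count >= max;
    },
    recordFailure(a: Attempt, now: number = Date.now()): void {
      const key = keyFn(a);
      const entry = hits.get(key);
      if (!entry || entry.resetAt <= now) {
        hits.set(key, { count: 1, resetAt: now + windowMs }); // new window
      } else {
        entry.count += 1;
      }
    },
  };
}
```

Three failures against one address from three different hosts leave each ${ip}:${email} bucket at one, and each IP bucket at one, but bring the email bucket to three. That is why the email-only key is the one that notices a botnet.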

The HTTP boundary is hardened by helmet with a Content-Security-Policy enforced in production, cors configured against an origin allowlist persisted in app settings (so administrators can add a relying party’s origin without redeploying), and a clientBasedCORS callback inside the OIDC provider that checks the request’s Origin against the calling client’s registered redirect URIs. Server-to-server token requests, which carry no Origin header, are deliberately allowed through; browser-originated requests must come from a host the client itself has declared. This is one of those subtle distinctions whose violation, in either direction, breaks something valuable: too strict and confidential clients can’t exchange codes; too loose and any page on the internet can spend a stolen authorization code.
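
The origin check at the heart of that callback can be expressed as a pure function. This is a sketch of the comparison only, with a hypothetical name and signature; in oidc-provider the decision is made inside the clientBasedCORS(ctx, origin, client) hook.

```typescript
// Decides whether a request may proceed, given its Origin header (if any)
// and the redirect URIs registered for the calling client.
function isOriginAllowed(
  origin: string | undefined,
  redirectUris: readonly string[],
): boolean {
  // Server-to-server calls carry no Origin header and are allowed through.
  if (!origin) return true;

  return redirectUris.some((uri) => {
    try {
      const registered = new URL(uri).origin;
      // Custom-scheme URIs (native apps) have the opaque origin "null";
      // never let those match a browser that sends "Origin: null".
      return registered !== "null" && registered === origin;
    } catch {
      return false; // an unparseable redirect URI matches nothing
    }
  });
}
```

Comparing parsed origins rather than string prefixes means https://app.example.com.evil.test cannot pass for https://app.example.com.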

The avatar upload route in auth.routes.ts deserves a small mention as a representative of the codebase’s attitude toward user-supplied filesystem operations. Multer is configured with an explicit MIME allowlist (image/jpeg, image/png, image/gif, image/webp), a 2 MiB size cap, and filenames composed from the user’s ID and a timestamp, never from the uploaded filename. Deletion is gated by a safeDeleteFile helper that resolves the candidate path and refuses to act unless it lies strictly within the expected base directory. The defense against path traversal is not a regex hoping to catch ../; it is a structural check that no realpath outside the avatars directory can ever satisfy. Honest paranoia, again: trust nothing about the input, only the structure of the result.
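
A structural containment check of this kind might look as follows. The isInsideDir helper and both signatures are assumptions, as is the use of path.resolve: it normalizes lexically and does not follow symlinks, so a version that must also defend against symlinked files would compare fs.realpathSync results instead.

```typescript
import { existsSync, unlinkSync } from "node:fs";
import { resolve, sep } from "node:path";

// True only if candidate resolves to a path strictly inside baseDir.
function isInsideDir(baseDir: string, candidate: string): boolean {
  const base = resolve(baseDir);
  const target = resolve(base, candidate);
  // Appending the separator rejects both the base directory itself and
  // sibling directories that merely share its prefix.
  return target.startsWith(base + sep);
}

// Deletes the file only when the containment check passes.
// Returns true if a file was actually removed.
function safeDeleteFile(baseDir: string, candidate: string): boolean {
  if (!isInsideDir(baseDir, candidate)) return false;
  const target = resolve(baseDir, candidate);
  if (!existsSync(target)) return false;
  unlinkSync(target);
  return true;
}
```

The check never inspects the input for suspicious substrings. It resolves the path first and judges only where the result lands, which is why ../ sequences, absolute paths, and prefix lookalikes such as avatars-evil all fail in the same way.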

The Dashboard: A Drawing Room with the Same Standards as the Vault

The administrative interface lives in packages/dashboard, served at /panel/ (a Vite base setting, learned the painful way after a blank-page episode preserved in commit 0366dc0). It is a React 18 SPA built with Tailwind for styling, React Router for navigation, and Zustand-flavored stores under src/stores/ for state. Every administrative action it offers (creating clients, rotating secrets, issuing invites, minting API keys, disabling users, revoking sessions) terminates in an admin.routes.ts handler that checks req.user.is_admin via the adminMiddleware and writes an audit entry before returning. The dashboard, in other words, has no special powers; it is simply a particularly well-dressed client of the same API anyone else could call.

Quickstart and integration documentation are first-class pages, not afterthoughts. Quickstart.tsx will even compose an AI-targeted prompt that another developer can paste into their own assistant to wire up a relying party, the sort of small, considerate touch that betrays a project built by someone who has been on the receiving end of bad documentation.

The Process: A Commit Log as a Confession

Reading the recent git log is its own form of architectural documentation. There is the moment OAuth was made fully featured (98932c5); the long arc of native-client support (56fa05e, 627583e, 8be2beb, db0a963, 1916401, 8a184d1); the embarrassing-in-retrospect blank-page bug and its fix (0366dc0); the production debugging of a 403 on the token endpoint that turned out to be a CORS interaction (d2ccb3e, 748cacc, f0cea6d), where the temporary debug logging was added, used, and then removed, which is itself a discipline; and the consent-crash patch (4220540) whose code-comment in provider.ts reads like a small confessional about exactly which assumption proved false in production.

The shape of the history matters because it is the history of a system being honest with itself. Most of the bug-fix commits are not about features that didn’t work; they are about edge cases the spec didn’t explicitly cover, or about the friction between idealized OIDC flows and the dampening effect of CDNs, mobile WebViews, double-submitting forms, and users who delete their own accounts in the middle of an active session. The codebase is the precipitate of those encounters, and its comments, sparse but pointed, are usually warnings to the next person who is about to try the obviously-shorter approach.

A Posture, Not a Product

Aegis does not aspire to dominance. It is small enough to read in an afternoon and self-contained enough to deploy on a single VM behind a single TLS terminator. Its security model is built on a small number of unfashionable but durable convictions: hash everything you don’t need in the clear, encrypt the few things you do, log every consequential action, distrust user input until a structural check has cleared it, and never ship a feature whose failure mode you can’t describe in a sentence. Its privacy model follows from those: there is no telemetry, no third-party identity provider, no SDK exfiltrating events to someone else’s dashboard. The data lives where the operator put it, and only there.

It is the kind of software that, if it does its job well, will never be especially interesting again—which is, after all, the highest compliment one can pay to an authentication system.