audit-2026-05-17-auth

blocking-launch type/audit topic/auth topic/security launch/blocker

Auth audit — 2026-05-17 (pre-launch)

2026-05-18 update: Direction shifted — auth is being migrated to Auth.js v5 (Google + Credentials) rather than continuing to patch the bespoke JWT system. Launch slips ~2 weeks to W23+. See [[Projects/personal-finance-notion/decisions/adr-2026-05-18-authjs-migration|ADR 2026-05-18 — Migrate to Auth.js v5]]. Findings remap:

  • C1 / C2 / C3 → Phase B of migration (custom flows on top of Auth.js Credentials)
  • C4 → Done 2026-05-18 (commit pending)
  • H3 → Done 2026-05-18 (commit pending); moot under Auth.js
  • H4 → Done 2026-05-18 (commit pending)
  • H1 / H2 → Phase C of migration
  • M1 / M3 → Superseded by Auth.js (legacy cookie names + dual JWT secrets go away)
  • M2 / M4 / M5 → Still apply, addressed in Phase C
  • L1 → Elevated; Phase C ships per-user API keys for the integration API

Status: 4 CRITICAL + 4 HIGH findings before public multi-user launch next week. Core JWT design is strong; gaps are around account lifecycle (no password reset, no email verification) and information leakage (enumeration, error responses).

Context

Shipping to public production next week with open signups. Financial app — token compromise = direct access to bank/transaction data, so the bar is higher than a typical CRUD product. This document is a point-in-time audit; fixes are not implemented here — see the Pre-launch sprint below to triage.

Scope confirmed up front:

  • Audience: public multi-user, open signups.
  • Integrations API: disabled at launch (INTEGRATION_API_TOKEN_PREFIX unset → routes return 503). Concern noted but not blocking.
  • Output: audit report only — no code changes in this pass.
  • Password reset: must ship for launch.

TL;DR

Overall posture: strong core, with a small number of production-blocking gaps.

The JWT design is the best part of this codebase: rotating refresh families with JTI swap (src/lib/auth.ts:103-121, src/lib/auth.ts:365-434), blacklist with TTL cleanup (src/lib/utils/tokenBlacklist.ts), __Host- cookies in production (src/lib/auth-cookies.ts:7-8), 15m/7d expiry split, bcryptjs (10 rounds), per-email rate limiting with account lockout. Edge middleware does signature-only checks and defers revocation to server (src/middleware.ts:51-57). Well-thought-out work.

Blocking a public launch:

  1. No password reset flow — users will be permanently locked out on lost passwords.
  2. No email verification — attacker can sign up with victim@example.com, get a session, and own the account before the real user does. For a finance app this is reputation-ending.
  3. Signup leaks email existence: src/lib/actions/user.ts:125 throws "Email already exists", enabling account enumeration.
  4. API error messages leak internals: src/lib/apiHelper.ts:86-93 returns raw error.message to the client and console.errors the full object.

The rest is hardening (CSP, login timing-equalisation, integration token redesign for later, etc.) — important but not gating.


Findings

Severity scale: CRITICAL = block launch · HIGH = strongly recommend before launch · MEDIUM = ship a patch in the first week · LOW = backlog.

CRITICAL — block launch

C1. No password-reset flow

  • Evidence: grep -r "forgot.?password\|reset.?password\|verify.?email" returns nothing across src/.
  • Impact: First user who forgets their password is locked out forever (account lockout after 5 failed attempts at src/lib/actions/user.ts:197 makes this worse). Support load + churn.
  • Fix: See Password reset design below.

C2. No email verification on signup

  • Evidence: src/lib/actions/auth.ts:155-165 — signup immediately calls createTokenPair and setTokenCookies. No EmailVerificationToken model, no verification gate anywhere.
  • Impact: Account pre-claim attack — attacker signs up with victim@example.com before the real victim. Victim later tries to sign up, hits the C3 enumeration error, and can never claim their own address. Worst-case impersonation for a finance app.
  • Fix: Add emailVerified: boolean to userModel.ts. Issue an EmailVerificationToken (random 32-byte, hashed at rest, TTL 24h, one-time use). Gate loginAction (and ideally the immediate post-signup token issuance) on emailVerified === true. Reuse the same Resend/SES/SMTP transport as password reset.

C3. Account enumeration on signup

  • Evidence: src/lib/actions/user.ts:123-126:
    const existingUser = await User.findOne({ email: email.toLowerCase() });
    if (existingUser) {
      throw new Error("Email already exists");
    }
    
    Error bubbles back to the client via src/lib/actions/auth.ts:181-189.
  • Impact: Attacker can validate whether any given email has an account — precursor to credential-stuffing, phishing, and the C2 pre-claim attack.
  • Fix: On duplicate email, return the same generic success response and silently send an "account already exists, did you forget your password?" email to the existing user. (Standard pattern; see Auth0 / Google.) Pair with C2's verification step so the response shape is identical for new vs. existing emails.
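A minimal sketch of the identical-response pattern described in the C3 fix, using an in-memory set in place of the User collection. The function name handleSignup, the types, and the message text are hypothetical; the two template names are taken from the sprint plan.

```typescript
// Hypothetical names throughout; only the "same response on both branches"
// idea comes from the audit finding.
type SignupResult = { ok: true; message: string };
type OutboundEmail = {
  to: string;
  template: "email-verify" | "account-exists-already";
};

const GENERIC_SIGNUP_MESSAGE =
  "Check your inbox to finish setting up your account.";

function handleSignup(
  email: string,
  existingEmails: Set<string>, // stand-in for a User.findOne lookup
  outbox: OutboundEmail[], // stand-in for the mail transport
): SignupResult {
  const normalised = email.trim().toLowerCase();
  if (existingEmails.has(normalised)) {
    // Existing account: tell the real owner by email, reveal nothing to the caller.
    outbox.push({ to: normalised, template: "account-exists-already" });
  } else {
    existingEmails.add(normalised);
    outbox.push({ to: normalised, template: "email-verify" });
  }
  // Same shape and text on both branches, so the response is not an oracle.
  return { ok: true, message: GENERIC_SIGNUP_MESSAGE };
}
```

The only observable difference is which email lands in the inbox, and only the address owner can see that.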

C4. API error responses leak internal detail

  • Evidence: src/lib/apiHelper.ts:85-93:
    } catch (error) {
      console.error("API Error:", error);
      ...
      const response = createErrorResponse(
        error instanceof Error ? error : "An unexpected error occurred",
        code
      );
      return NextResponse.json(response, { status: response.code });
    }
    
    createErrorResponse (line 29-40) returns the raw error.message in JSON.
  • Impact: Mongoose validation errors, connection errors, internal Error("Failed to update user settings") strings, etc. surface to clients. console.error puts full stack traces in Vercel logs — fine for ops, but combined with the JSON leak it's a bad pattern.
  • Fix: Two changes in apiHelper.ts:
    1. In production, return a generic message for 5xx: code >= 500 ? "Internal server error" : error.message. Keep 4xx detail (validation messages are useful).
    2. Replace console.error with the existing logger.error (already used in src/lib/actions/*).
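The first change can be sketched as a small pure function. This is an illustration under assumptions, not the current createErrorResponse: the name toClientMessage and its signature are hypothetical.

```typescript
// Hypothetical helper: decides what text is safe to put in the JSON body.
function toClientMessage(
  error: unknown,
  status: number,
  isProduction: boolean,
): string {
  const fallback = "An unexpected error occurred";
  // 5xx detail (driver errors, internal strings) never leaves the server in prod.
  if (isProduction && status >= 500) return "Internal server error";
  // 4xx detail is kept: validation messages are useful to the client.
  return error instanceof Error ? error.message : fallback;
}
```

In the route wrapper this would be called with process.env.NODE_ENV === "production" as the third argument, with the full error still going to the server-side logger.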

HIGH — strongly recommend before launch

H1. No Content-Security-Policy header

  • Evidence: src/middleware.ts:84-94 sets X-Content-Type-Options, X-Frame-Options: DENY, X-XSS-Protection, Referrer-Policy, and Strict-Transport-Security in prod. No CSP.
  • Impact: XSS becomes catastrophic without CSP — stored-XSS via a transaction note/category name could hijack the UI and inject fake transactions. CSP is the single most effective XSS mitigation header.
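  • Fix: Add CSP in next.config.mjs async headers() (not currently used) or in middleware. Starting policy (script-src 'unsafe-inline' is a compatibility concession that still permits injected inline scripts; move to nonces or hashes once the report-only run is clean):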
    default-src 'self';
    script-src 'self' 'unsafe-inline';
    style-src 'self' 'unsafe-inline';
    img-src 'self' data: blob:;
    connect-src 'self' https://openrouter.ai;
    frame-ancestors 'none';
    base-uri 'self';
    form-action 'self';
    
    Run report-only first (Content-Security-Policy-Report-Only) for 24h, watch the console, then enforce.
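One way to keep the report-only and enforcing variants in sync is to build both from a single directive map. The sketch below assumes nothing about the project beyond the policy listed above; buildCsp and cspHeader are hypothetical helper names.

```typescript
type CspDirectives = Record<string, string[]>;

// The starting policy from this finding, expressed as data.
const directives: CspDirectives = {
  "default-src": ["'self'"],
  "script-src": ["'self'", "'unsafe-inline'"],
  "style-src": ["'self'", "'unsafe-inline'"],
  "img-src": ["'self'", "data:", "blob:"],
  "connect-src": ["'self'", "https://openrouter.ai"],
  "frame-ancestors": ["'none'"],
  "base-uri": ["'self'"],
  "form-action": ["'self'"],
};

// Serialise to the "name value value; name value" header syntax.
function buildCsp(d: CspDirectives): string {
  return Object.entries(d)
    .map(([name, values]) => `${name} ${values.join(" ")}`)
    .join("; ");
}

// Flip reportOnly to false once the report-only run is clean.
function cspHeader(
  d: CspDirectives,
  reportOnly: boolean,
): { key: string; value: string } {
  return {
    key: reportOnly
      ? "Content-Security-Policy-Report-Only"
      : "Content-Security-Policy",
    value: buildCsp(d),
  };
}
```

The { key, value } shape matches what a Next.js headers() entry expects, so the result can be dropped into either next.config.mjs or a middleware response.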

H2. Login timing leaks email existence

  • Evidence: src/lib/actions/user.ts:153-216. If the email doesn't exist, function returns at line 168 (no bcrypt, no DB write). If the email exists but the password is wrong, code runs bcrypt.compare (line 191), then user.save() (line 210). The two branches differ by tens of milliseconds.
  • Impact: Account enumeration via response timing. Less obvious than C3 but observable from a script.
  • Fix: Always run bcrypt.compare against a dummy hash when the user doesn't exist. Precompute one constant (e.g. const DUMMY_HASH = await bcrypt.hash("dummy", 10)). Compare against it in the not-found branch.
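A dependency-free sketch of the dummy-hash idea. Note the swap: scrypt from node:crypto stands in for bcryptjs so the example runs without installing anything, and the Map stands in for the User collection. All names are hypothetical.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// scryptSync is a stand-in for bcrypt here, not the project's actual hasher.
type StoredUser = { email: string; salt: Buffer; hash: Buffer };

function hashPassword(password: string, salt: Buffer): Buffer {
  return scryptSync(password, salt, 32);
}

// Computed once at module load from a random throwaway password.
const DUMMY_SALT = randomBytes(16);
const DUMMY_HASH = hashPassword(randomBytes(16).toString("hex"), DUMMY_SALT);

function verifyLogin(
  users: Map<string, StoredUser>,
  email: string,
  password: string,
): boolean {
  const user = users.get(email.toLowerCase());
  const salt = user ? user.salt : DUMMY_SALT;
  const expected = user ? user.hash : DUMMY_HASH;
  // The expensive hash runs on both branches, so "no such user" costs the
  // same as "wrong password".
  const candidate = hashPassword(password, salt);
  const matches = timingSafeEqual(candidate, expected);
  // A match against the dummy hash must never count as a login.
  return user !== undefined && matches;
}
```

This only equalises the hashing cost. The failed-attempt write (user.save()) on the existing-user branch is a separate, smaller timing difference that the staging measurement should confirm is within noise.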

H3. Blacklisting the access token on automatic refresh is racy

  • Evidence: src/lib/auth.ts:346-352 — when access token is near expiry, current token is added to blacklist and a new one is issued in the same handler. Concurrent requests from the same user can race into tokenBlacklist.add(accessToken, ...), and the in-flight request then returns "blacklisted".
  • Impact: Occasional spurious 401s during the 5-minute pre-expiry window. Hard to repro in dev; may surface as flaky e2e.
  • Fix: Don't blacklist access tokens on rotation — let them expire naturally. Blacklisting access tokens only matters on logout (already handled separately at src/lib/actions/auth.ts:208-211). Remove line 348 in auth.ts.

H4. PWA caches authenticated API GETs to disk

  • Evidence: next.config.mjs:24-37 has an api-reads runtimeCache with StaleWhileRevalidate for any same-origin GET /api/*. Service Worker cache is per-origin, not per-user.
  • Impact: On a shared device (family laptop), User B can briefly see User A's last cached responses for up to 5 min after A logs out, before SWR refreshes. For a finance app, "balance shown to wrong person for 4 minutes" is a real complaint waiting to happen.
  • Fix: Either (a) exclude /api/* from the SW cache entirely — they're cheap to refetch; offline-fresh financial data is dubious anyway; or (b) clear the api-reads cache from logoutAction via a caches.delete("api-reads") postMessage to the SW. (a) is simpler.

MEDIUM — patch in the first week

M1. Legacy cookie names still readable

  • Evidence: src/middleware.ts:17-28, src/lib/auth.ts:294-296, 303-305 read access_token / refresh_token / session_token (non-__Host-) as fallbacks. A non-__Host- cookie can be set by any subdomain — if you ever serve from *.yourdomain.com, a subdomain XSS could plant a cookie this code would honor over the legitimate one.
  • Fix: Remove the fallback reads after deploy stabilises. Keep only __Host- names in prod, bare names in dev.

M2. Role field exists but never enforced

  • Evidence: src/lib/models/userModel.ts:42-46 defines role: "admin" | "user", propagated through UserDataSchema in src/lib/auth.ts:47. grep for role.*admin\|isAdmin\|requireAdmin returns nothing.
  • Impact: Latent footgun. Today nobody is an admin, so no exposure. The day someone adds an admin route and forgets the check, you have privilege escalation.
  • Fix: Either delete the role field (recommended for v1 — YAGNI) or add a requireRole("admin") helper plus an integration test asserting a user cannot call an admin-only route.
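If the role field is kept, the helper could look roughly like this. SessionUser and ForbiddenError are hypothetical; only the requireRole name and the "admin" | "user" union come from the finding.

```typescript
type Role = "admin" | "user";
type SessionUser = { userId: string; role: Role };

// Hypothetical error type so callers can map it to a 403 response.
class ForbiddenError extends Error {
  readonly status = 403;
  constructor(required: Role) {
    super(`Requires role: ${required}`);
    this.name = "ForbiddenError";
  }
}

// Throws unless the caller is authenticated AND holds the required role.
function requireRole(user: SessionUser | null, required: Role): SessionUser {
  if (!user || user.role !== required) {
    throw new ForbiddenError(required);
  }
  return user;
}
```

Returning the user on success lets a route write const admin = requireRole(session, "admin") as its first line, which makes a missing check easy to spot in review.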

M3. JWT_REFRESH_SECRET_KEY silently falls back to JWT_SECRET_KEY

  • Evidence: src/lib/auth.ts:36-37 — fine for dev, but in production this means an access-token forgery primitive (if you ever leak the access key) is also a refresh-token forgery primitive.
  • Fix: Require both in production:
    if (process.env.NODE_ENV === "production" && !process.env.JWT_REFRESH_SECRET_KEY) {
      throw new Error("JWT_REFRESH_SECRET_KEY must be set in production");
    }
    
    Document in .env.example that the two MUST differ in prod.

M4. CSRF defence rests entirely on SameSite=strict

  • Evidence: src/lib/utils/csrf.ts exists but is documented as not enforced. Server actions rely on Next.js's built-in origin check + SameSite=strict cookies.
  • Impact: Fine for the current surface (no public POST APIs, no third-party-embeddable routes). But if you ever add a webhook or public POST endpoint, you'll need to remember.
  • Fix: Add a one-line comment at the top of any future /api/* POST/PUT/DELETE route. Or delete src/lib/utils/csrf.ts if it's dead code so it doesn't mislead.

M5. Logger error path can throw on circular structures

  • Evidence: Not verified in this audit; flag for review. If logger.error JSON-stringifies arbitrary error objects, MongooseError / native Error chains can have non-serializable properties.
  • Fix: Read src/lib/utils/logger.ts and confirm it has a safe stringifier.
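For comparison when reading the logger, a safe stringifier might look like the sketch below. It is not taken from src/lib/utils/logger.ts (which this audit did not read); safeStringify is a hypothetical name.

```typescript
// Hypothetical helper: JSON.stringify that survives cycles and Error objects.
function safeStringify(value: unknown): string {
  const seen = new WeakSet<object>();
  return JSON.stringify(value, (_key: string, val: unknown) => {
    if (val instanceof Error) {
      // Error fields are non-enumerable, so copy the useful ones explicitly.
      return { name: val.name, message: val.message, stack: val.stack };
    }
    if (typeof val === "object" && val !== null) {
      // Simple approach: any object seen before is replaced. This also
      // flags repeated (non-circular) references, which is fine for logs.
      if (seen.has(val)) return "[Circular]";
      seen.add(val);
    }
    return val;
  });
}
```

A plain JSON.stringify on a self-referencing object throws a TypeError, which is the failure mode this finding is worried about inside an error path.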

LOW — backlog

  • L1. Integration API token design (when re-enabled). src/lib/integrations/requireIntegrationUser.ts:30-49 uses Bearer = PREFIX + User._id (hex). Mongo _id is not a secret (4-byte timestamp + 5-byte per-process random value + 3-byte counter, partially predictable; can leak via API responses if _id is ever serialised). Today the route returns 503 because INTEGRATION_API_TOKEN_PREFIX is unset. Before re-enabling, replace with per-user random API keys (32 bytes), hashed at rest, listed/revoked from a settings page, scoped, with explicit expiry. Add an integrationKeys collection.
  • L2. No 2FA / passkeys. For a finance app this will come up in user feedback fast. Plan for v1.1.
  • L3. No login audit log (IP, UA, geo). User.lastLogin is set, but no per-event log. Useful for "recent sessions" UI and forensics.
  • L4. No HaveIBeenPwned check on password set. Cheap to add (one fetch per signup/reset, k-anonymity API).
  • L5. OPENROUTER_API_KEY is server-side only (good). Confirm it's not in any NEXT_PUBLIC_* envs at launch.
  • L6. CI lacks pre-build typecheck + lint. package.json build runs next build only; playwright.yml runs only e2e. Add a lint && tsc --noEmit step.
  • L7. No npm audit / dependabot in CI. Add npm audit --production to the workflow.
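For L4, the k-anonymity lookup works by sending only the first five hex characters of the password's SHA-1 to the Pwned Passwords range endpoint (https://api.pwnedpasswords.com/range/<prefix>) and matching the remaining 35 characters locally against the returned SUFFIX:COUNT lines. The sketch below covers the hashing and parsing only and makes no network call; both function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Split the uppercase SHA-1 into the part that is sent and the part kept local.
function pwnedRangeQuery(password: string): { prefix: string; suffix: string } {
  const sha1 = createHash("sha1")
    .update(password, "utf8")
    .digest("hex")
    .toUpperCase();
  return { prefix: sha1.slice(0, 5), suffix: sha1.slice(5) };
}

// Parse a range response body ("SUFFIX:COUNT" per line) and return the
// breach count for our suffix, or 0 when it is absent.
function countInRangeResponse(body: string, suffix: string): number {
  for (const line of body.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(":");
    if (candidate === suffix) return Number(count ?? 0);
  }
  return 0;
}
```

A signup or reset handler would fetch the range for prefix, pass the body to countInRangeResponse, and reject the password when the count is above zero. The full hash and the password itself never leave the server.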

What's solid (preserve through any refactor)

  • Rotating refresh families with JTI swap: src/lib/auth.ts:103-121 (issue), :365-434 (rotate). Reuse of a stale refresh JTI revokes the entire family.
  • Token blacklist with TTL auto-cleanup: src/lib/utils/tokenBlacklist.ts + MongoDB TTL index.
  • __Host- cookie prefix in production: src/lib/auth-cookies.ts:7-8. Prevents subdomain cookie injection.
  • Cookie flags: httpOnly: true, secure: isProd, sameSite: "strict" (auth-cookies.ts:25-30).
  • Edge middleware does signature-only: src/middleware.ts:51-57. Full revocation checks deferred to Node getUserData(). bcryptjs wouldn't run on edge anyway.
  • Account lockout after 5 failed attempts for 30 min: src/lib/actions/user.ts:194-208.
  • Per-email login + signup rate limiting: src/lib/actions/auth.ts:67-74, 135-141.
  • Defence in depth: middleware gate + getUserData() server check. A bypass of one doesn't bypass the other.

Password reset design sketch

Required for launch. Minimal viable design.

Schema — new file src/lib/models/passwordResetTokenModel.ts:

{
  userId: string;          // FK to User.userId
  tokenHash: string;       // sha256(token) — NEVER store raw
  expiresAt: Date;         // now + 1h, TTL-indexed
  usedAt: Date | null;     // single-use enforcement
  createdAt: Date;         // for audit + rate limiting
}

TTL index on expiresAt. Compound index on userId + createdAt for rate-limit lookups.

Server actions — new file src/lib/actions/passwordReset.ts:

  • requestPasswordReset(email):
    1. Rate-limit per-email (3/hour) AND per-IP (5/hour) — reuse rateLimiter utility.
    2. Look up user. Always return success regardless (no enumeration).
    3. If user exists: generate crypto.randomBytes(32).toString("base64url"), store sha256 hash, send email with https://<host>/reset-password?token=<raw>.
    4. Invalidate any previously-unused tokens for the same userId (one active token per user).
  • resetPassword(token, newPassword):
    1. Validate newPassword via existing passwordValidation (src/lib/validations/common.ts).
    2. sha256(token), look up, check expiresAt > now && usedAt === null.
    3. Update user password (the existing pre-save hook in userModel.ts:96-106 hashes it).
    4. Mark token usedAt = now.
    5. Invalidate all refresh-token families for this user (reuse revokeRefreshFamilyByFamilyId in a loop, or add a revokeAllRefreshFamiliesForUser(userId) helper). Clear current cookies.
    6. Optionally auto-login by issuing a new token pair, or redirect to /login with a flash.
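The token lifecycle in the two actions above can be sketched with an in-memory array standing in for the passwordResetToken collection. Function names are hypothetical, rate limiting, email sending, and the password update are omitted, and superseded tokens are retired here by stamping usedAt (deleting them would work equally well).

```typescript
import { createHash, randomBytes } from "node:crypto";

type ResetRecord = {
  userId: string;
  tokenHash: string;
  expiresAt: Date;
  usedAt: Date | null;
  createdAt: Date;
};

const ONE_HOUR_MS = 60 * 60 * 1000;
const sha256 = (raw: string): string =>
  createHash("sha256").update(raw).digest("hex");

// Returns the raw token, which goes into the email link and nowhere else.
function issueResetToken(store: ResetRecord[], userId: string, now: Date): string {
  // One active token per user: retire anything still unused.
  for (const rec of store) {
    if (rec.userId === userId && rec.usedAt === null) rec.usedAt = now;
  }
  const raw = randomBytes(32).toString("base64url");
  store.push({
    userId,
    tokenHash: sha256(raw), // only the hash is stored
    expiresAt: new Date(now.getTime() + ONE_HOUR_MS),
    usedAt: null,
    createdAt: now,
  });
  return raw;
}

// Returns the userId on success, or null for unknown, used, or expired tokens.
function redeemResetToken(store: ResetRecord[], raw: string, now: Date): string | null {
  const hash = sha256(raw);
  const rec = store.find((r) => r.tokenHash === hash);
  if (!rec || rec.usedAt !== null || rec.expiresAt.getTime() <= now.getTime()) {
    return null;
  }
  rec.usedAt = now; // single-use enforcement
  return rec.userId;
}
```

With a real database, the check-and-mark in redeemResetToken should be one atomic update (match on tokenHash, usedAt null, expiresAt in the future) so two concurrent requests cannot both redeem the same link.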

UI — two new routes added to the public allowlist in src/lib/auth-routes.ts:

  • /forgot-password — email input, calls requestPasswordReset, shows "if an account exists, we've sent a link" success state regardless.
  • /reset-password?token=... — new password + confirm, calls resetPassword.

Email transport — pick one and add to .env.example:

  • Resend (~10 min setup, generous free tier, good DX) — recommended.
  • AWS SES if you already have an AWS account.
  • SMTP/Postmark/SendGrid all work.

Implementation audit checklist:

  • Tokens are 32+ random bytes (base64url).
  • Only the hash is stored; raw token only in the email link.
  • Single-use enforced by usedAt.
  • Expiry ≤ 1h.
  • Rate-limited per-email + per-IP.
  • Response is identical whether the email exists or not.
  • On successful reset, all sessions for the user are revoked.
  • Reset link is HTTPS-only and not logged.

Pre-launch sprint (1 week)

Day-by-day, ordered by dependency:

  1. Day 1 — Stop the bleeding (one-liners / pure deletions):

    • C3: replace the throw at src/lib/actions/user.ts:125 with the "send already-exists email + return identical success" path.
    • C4: edit src/lib/apiHelper.ts:85-93 to gate 5xx detail behind NODE_ENV !== "production".
    • H3: delete src/lib/auth.ts:347-349 (no blacklist on auto-rotate).
    • H4: disable PWA caching for /api/* — remove or guard the api-reads block in next.config.mjs.
  2. Day 2-3 — Email infrastructure:

    • Pick provider (Resend?).
    • Add RESEND_API_KEY, MAIL_FROM to .env.example + startup check.
    • Build a thin sendTransactional(template, to, data) wrapper.
    • Templates: password-reset, account-exists-already, email-verify.
  3. Day 3-4 — Password reset: implement per design above. End-to-end Playwright test.

  4. Day 4-5 — Email verification: add emailVerified to userModel.ts, issue token on signup, gate login on verified, add /verify-email?token= route. Backfill: existing users (if any) start verified.

  5. Day 5 — Hardening:

    • H1 CSP in report-only first, then enforce.
    • H2 timing-equalise login.
    • M3 require both JWT keys in prod.
    • M2 decide role field: delete or enforce.
  6. Day 6 — Verification: full pre-launch checklist (below) on a staging deploy with prod env vars.

  7. Day 7 — Buffer / launch.


Verification before shipping

Run on a staging deploy that mirrors prod (real cookie flags, real HTTPS, real env vars).

Manual auth flow

  1. Sign up → log in → log out → log back in → log out everywhere. Cookies in DevTools must have __Host- prefix, Secure, HttpOnly, SameSite=Strict.
  2. After logout, hit a protected route → redirect to /login.
  3. After login, hit /login → redirect to /.
  4. Steal the __Host-access_token value, log out, paste it back via DevTools → protected route should still 401 (token is blacklisted).
  5. Idle 15+ min, hit a protected route → auto-refresh, no redirect.

Account enumeration (fails today; passes after C3, C2)

  1. POST /signup with an existing email → response identical to a new email signup.
  2. POST /login with nonexistent@example.com vs an existing email + wrong password → time both with curl -w "%{time_total}\n" 20× each. Means should match within ~5ms after H2.

Rate limits

  1. loginAction 6× wrong password → 6th attempt rate-limited.
  2. loginAction 6× same email → account locked for 30 min.

Token rotation (refresh family)

  1. Log in → grab refresh cookie → wait for access expiry → make a request (rotates refresh) → replay OLD refresh value via DevTools → next request clears cookies and forces re-login. Proves family revocation works.
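The behaviour this step verifies can be modelled in a few lines. This is an illustrative in-memory model of rotate-with-reuse-detection, not the code in src/lib/auth.ts; every name is hypothetical.

```typescript
import { randomUUID } from "node:crypto";

type Family = { userId: string; currentJti: string; revoked: boolean };
type RefreshToken = { familyId: string; jti: string };

// Login: open a new family with its first JTI.
function startFamily(families: Map<string, Family>, userId: string): RefreshToken {
  const familyId = randomUUID();
  const jti = randomUUID();
  families.set(familyId, { userId, currentJti: jti, revoked: false });
  return { familyId, jti };
}

// Refresh: swap the JTI, or revoke the whole family if a stale JTI shows up.
function rotate(
  families: Map<string, Family>,
  presented: RefreshToken,
): RefreshToken | null {
  const family = families.get(presented.familyId);
  if (!family || family.revoked) return null;
  if (family.currentJti !== presented.jti) {
    // A superseded token was replayed: assume theft and kill the family.
    family.revoked = true;
    return null;
  }
  family.currentJti = randomUUID();
  return { familyId: presented.familyId, jti: family.currentJti };
}
```

Replaying the old value fails, and so does the legitimate newer token afterwards. That second failure is the "clears cookies and forces re-login" outcome the manual step expects.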

Headers

  1. curl -I https://<staging>/ shows:
    • Strict-Transport-Security: max-age=31536000; includeSubDomains
    • X-Frame-Options: DENY
    • X-Content-Type-Options: nosniff
    • Content-Security-Policy: ... (after H1)

Password reset (after implementation)

  1. Reset for unknown email → 200, no email, no error.
  2. Reset for known email → 200, email arrives.
  3. Use the link → set new password → other-device sessions revoked.
  4. Reuse the link → 400, "already used or expired."
  5. Forged token (different sha256 prefix) → 400.

Edge cases

  1. JWT_SECRET_KEY="" → app crashes at boot (it currently does — preserve this).
  2. NODE_ENV=production without JWT_REFRESH_SECRET_KEY after M3 → crashes at boot.
  3. GET /api/integrations/v1/bank-accounts with no auth + INTEGRATION_API_TOKEN_PREFIX unset → 503.

Automated

  1. npm run test:e2e passes.
  2. Add npm run lint && npx tsc --noEmit to CI before next build (L6).
  3. npm audit --production clean.

Critical files (where changes will land)

  • src/lib/actions/user.ts — C3 (signup enumeration), H2 (login timing)
  • src/lib/actions/auth.ts — wire in email verification gate (C2), password reset actions
  • src/lib/apiHelper.ts — C4 (error leakage)
  • src/lib/auth.ts — H3 (drop blacklist-on-rotate), M3 (require both JWT keys in prod)
  • src/lib/auth-routes.ts — add /forgot-password, /reset-password, /verify-email to public paths
  • src/middleware.ts — H1 (CSP), M1 (drop legacy cookie reads)
  • src/lib/models/userModel.ts — add emailVerified (C2)
  • src/lib/models/passwordResetTokenModel.ts — new, password reset
  • src/lib/models/emailVerificationTokenModel.ts — new, verification
  • src/lib/actions/passwordReset.ts — new
  • src/lib/email/* — new, transactional email transport
  • next.config.mjs — H4 (drop API GET caching) or CSP via headers()
  • .env.example — document JWT_REFRESH_SECRET_KEY requirement in prod, add mail provider keys

Out of scope for this audit

  • Source-level secret-scanning of git history (run gitleaks or trufflehog separately).
  • Dependency CVE scan (run npm audit --production and osv-scanner).
  • Live penetration testing — this is a static review.
  • Notion API integration's data-handling outside auth (the integration API auth is covered; what it does after auth is not).
  • Infrastructure layer (Vercel project settings, DNS, WAF). Worth a separate 1-hour Vercel-dashboard sweep before launch.

Hub

  • [[Projects/personal-finance-notion/personal-finance-notion|MOC]]
  • [[Projects/personal-finance-notion/context/index|Context index]]