Why use SST instead of Vercel for a Next.js app?

Vercel is genuinely easier for straightforward deployments. If you need zero infrastructure config and fast preview environments, it wins on simplicity. We moved to SST because Storyie has requirements that Vercel either can't support or charges a lot for: IP-restricted staging environments, CloudFront cache policy overrides per route, Lambda-based cron jobs managed in the same codebase, and wildcard subdomains for multi-tenant user pages (`{username}.storyie.com`). SST handles all of that in TypeScript. The tradeoff is slower deploys (2–5 minutes vs. Vercel's ~30 seconds) and more complex debugging via CloudWatch.

What does the SST stage system actually give you?

A stage is a completely isolated deployment of your entire stack. Running `sst deploy --stage production` and `sst deploy --stage staging` produces two independent sets of CloudFront distributions, Lambda functions, S3 buckets, and DNS records. You write a single `sst.config.ts` that branches on `input.stage` to vary domains, env files, removal policies, and any other per-environment differences. The `removal: "retain"` option is particularly useful for production: if you ever run `sst remove` there, stateful resources such as S3 buckets are kept rather than destroyed. Staging gets `removal: "remove"` so leftover resources clean themselves up.

How do CloudFront Functions work for IP restriction, and what are the gotchas?

CloudFront Functions run at the edge on the `viewer-request` event, before the request ever hits your Lambda. They are extremely fast and cheap, far lighter than Lambda@Edge. The restriction code checks the client IP against an allowlist and returns a 403 for anything outside it. Two important gotchas: first, CloudFront Functions run in a restricted JavaScript runtime that is ES5.1 at its core, so modern syntax like `const` can throw depending on the runtime version. Sticking to `var` and `indexOf()` is the safe choice. Second, always add bypass paths for webhook endpoints. Stripe, GitHub webhooks, or any other external service will be blocked by IP restriction and break your integrations if you don't explicitly skip those paths.

Why does the OAuth callback route need a special cache policy?

OAuth authorization codes are single-use. If CloudFront serves a cached response for `/api/auth/callback`, the second request that hits the cache gets the same (now-invalid) code and authentication breaks. The fix is to attach the `Managed-CachingDisabled` policy to that path pattern via an ordered cache behavior. SST exposes `transform.cdn` to customize the CloudFront distribution Pulumi resource directly, so you can prepend your custom behavior before SST's defaults. The `$resolve` helper is necessary to unwrap Pulumi's `Input<T>` types before you can spread properties from the default behavior.

How are the cron jobs structured, and why offset them?

Each cron job is an `sst.aws.Cron` resource that triggers a Lambda on a schedule. We run seven jobs: metrics aggregation, view and like counters, tag extraction, diary reminders, weekly summary emails, and milestone detection. The view aggregator runs at `:00` past every four hours; the like aggregator runs at `:30`. That 30-minute offset is intentional — both queries hit the same database, and staggering them avoids a simultaneous spike. The same logic applies to any pair of heavy batch jobs that share a data store. Naively scheduling everything at `:00` is fine until you hit scale, and fixing it retroactively is annoying.

What is the recommended cache header strategy for Next.js assets on CloudFront?

Versioned assets (anything under `_next/static/`) get a one-year TTL plus `immutable`, because the filename hash changes whenever the content changes. There is no reason to revalidate them. Non-versioned files (like `favicon.ico` or `robots.txt`) get no browser cache (`max-age=0`) but a one-day CDN cache with stale-while-revalidate. The stale window is 8,640 seconds, 10% of the CDN TTL: for that long after an entry expires, the CDN may keep serving the stale copy while it revalidates in the background, which prevents a thundering herd against the origin. Deploy-time invalidation with `paths: "all"` flushes the CDN on every release so users don't keep seeing stale HTML.

Deploy Next.js to AWS with SST v3 (not Vercel)

Storyie's web app runs on Next.js, deployed to AWS via SST v3. The choice isn't exotic — most Next.js apps are fine on Vercel — but Storyie has enough infrastructure requirements that managing our own AWS stack with SST pays off in control and cost. This post walks through our actual sst.config.ts: the stage setup, the CloudFront customizations, Lambda tuning, and the cron job schedule.

TL;DR

sst.aws.Nextjs wraps OpenNext to build a Lambda + CloudFront + S3 deployment from a single component declaration.
Stages (production, staging) share one config file and produce fully isolated environments. The removal policy, env file, and domain all branch on input.stage.
CloudFront Functions handle staging IP restriction at the edge — written in ES5, with explicit bypasses for webhook paths.
OAuth callback routes need Managed-CachingDisabled to prevent auth codes from being served from cache.
Graviton (arm64) Lambda is ~20% cheaper with no meaningful performance difference.
Seven cron jobs run on staggered schedules to spread DB load.

| Concern | Mechanism |
| --- | --- |
| Next.js deployment | `sst.aws.Nextjs` via OpenNext |
| Environment isolation | SST stages with per-stage domains and env files |
| Staging IP restriction | CloudFront Function on `viewer-request` |
| OAuth cache bypass | Ordered cache behavior with `Managed-CachingDisabled` |
| Lambda cost optimization | `arm64` + tuned memory per function type |
| Scheduled batch processing | `sst.aws.Cron` with offset schedules |
| Env var source of truth | `sst.config.ts` (infra code, not `.env` files) |

What SST gives you out of the box

SST's sst.aws.Nextjs component is the main primitive. It calls OpenNext internally to split your Next.js app into Lambda functions (server rendering, image optimization) plus S3 (static assets) plus CloudFront (CDN/routing), then wires it all together.

new sst.aws.Nextjs("StoryieWeb", {
  path: "./apps/web",
  domain: {
    name: "storyie.com",
    aliases: ["*.storyie.com"],
    dns: sst.cloudflare.dns(),
  },
});

That block creates the CloudFront distribution, Lambda functions, and S3 bucket. Storyie uses Cloudflare for DNS, so we pass sst.cloudflare.dns() and SST handles the CNAME/alias records automatically.

The wildcard alias (*.storyie.com) is required for multi-tenant user subdomains. Every user gets a {username}.storyie.com page, so we need CloudFront to match any subdomain and route it through Next.js.

Stage isolation

SST's stage system is one of its strongest features. A single config file generates completely separate infrastructure per stage:

app: async (input) => {
  const stage = input?.stage || "dev";
  const envFile = stage === "production" ? ".env.production" : ".env";

  return {
    name: "storyie",
    removal: input?.stage === "production" ? "retain" : "remove",
    home: "aws",
  };
},

Two things here worth calling out:

removal: "retain" is a safety net against accidental destruction. SST v3 manages resources through Pulumi rather than CloudFormation, and with "retain" a stray sst remove on the production stage leaves stateful resources such as S3 buckets in place instead of deleting them. SST also offers "retain-all" if you want every resource kept. Staging uses "remove" so it cleans itself up.

Domain branching gives staging its own subdomain with the same wildcard structure:

const domain =
  stage === "production"
    ? {
        name: "storyie.com",
        aliases: ["*.storyie.com"],
        dns: sst.cloudflare.dns(),
      }
    : {
        name: "staging.storyie.com",
        aliases: ["*.staging.storyie.com"],
        dns: sst.cloudflare.dns(),
      };

staging.storyie.com mirrors production's multi-tenant structure. Any feature involving user subdomains can be tested against {username}.staging.storyie.com before it goes live.
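The per-stage branching above can be pulled into one pure function, which makes it easy to unit test outside of SST. This is a minimal sketch, not the actual config: `stageSettings` and the `StageSettings` type are hypothetical names, and the values simply mirror the snippets in this section.

```typescript
// Hypothetical helper: derives every stage-dependent value in one place.
type Removal = "retain" | "remove";

interface StageSettings {
  removal: Removal;
  envFile: string;
  domainName: string;
  aliases: string[];
}

function stageSettings(stage: string | undefined): StageSettings {
  // Same fallback as the config above: no stage means "dev".
  const resolved = stage || "dev";
  const isProd = resolved === "production";
  const domainName = isProd ? "storyie.com" : "staging.storyie.com";
  return {
    removal: isProd ? "retain" : "remove",
    envFile: isProd ? ".env.production" : ".env",
    domainName,
    // Wildcard alias so {username}.<domain> resolves on every stage.
    aliases: ["*." + domainName],
  };
}
```

Keeping the branching in one function means a new stage-dependent value gets added in a single place instead of as another scattered ternary.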

CloudFront Function for IP restriction

Staging is restricted by IP. Only our office IPs and personal connections can reach it. We implement this with a CloudFront Function on the viewer-request event — it runs at edge before the request hits Lambda, so blocked traffic costs nothing beyond the CloudFront request price.

const ipRestrictionCode =
  stage !== "production"
    ? `
var allowedIPs = ["221.246.xxx.xxx", "153.166.xxx.xxx"];
var clientIP = event.viewer.ip;
var uri = event.request.uri;

// Webhook paths come from external services — bypass the IP check
var bypassPaths = ["/api/stripe/webhook"];
var shouldBypass = bypassPaths.some(function(path) {
  return uri === path || uri.indexOf(path + "?") === 0;
});

if (!shouldBypass && allowedIPs.indexOf(clientIP) === -1) {
  return {
    statusCode: 403,
    statusDescription: "Forbidden",
    headers: { "content-type": { value: "text/html" } },
    body: "Access denied.",
  };
}
`.trim()
    : undefined;

Three things to get right here:

CloudFront Functions run in an ES5-era runtime. Avoid const, let, and other modern syntax; use var and indexOf(). Which newer features are available depends on the function's runtime version, and code that relies on an unsupported one fails with a CloudFront runtime error that's annoying to debug.
Webhook paths must be bypassed explicitly. Stripe's webhook events come from Stripe's IP ranges, not ours. Without the bypass, stripe trigger in development and live webhook deliveries both get 403'd.
The injection API takes a code string. SST's edge.viewerRequest.injection injects your code into the CloudFront Function's handler before the return statement:

edge: ipRestrictionCode
  ? {
      viewerRequest: {
        injection: ipRestrictionCode,
      },
    }
  : undefined,
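Because the restriction logic ships as a string, a typo only shows up after a deploy. One way to reduce that risk is to keep the same decision in a plain function and test it locally. The sketch below assumes a simplified event shape containing only the two fields the check reads; `isBlocked` and `ViewerRequestEvent` are hypothetical names, and the IPs are documentation-range placeholders.

```typescript
// Simplified shape: only the fields the IP check actually reads.
interface ViewerRequestEvent {
  viewer: { ip: string };
  request: { uri: string };
}

// Returns true when the request should be answered with a 403.
// Uses only ES5-era calls (some, indexOf) so the body can be copied
// into the injected string without modification.
function isBlocked(
  evt: ViewerRequestEvent,
  allowedIPs: string[],
  bypassPaths: string[]
): boolean {
  var uri = evt.request.uri;
  var shouldBypass = bypassPaths.some(function (path) {
    return uri === path || uri.indexOf(path + "?") === 0;
  });
  return !shouldBypass && allowedIPs.indexOf(evt.viewer.ip) === -1;
}

// Usage: an unknown IP is blocked on "/" but let through on the webhook path.
var sampleAllowed = ["203.0.113.10"];
var sampleBypass = ["/api/stripe/webhook"];
var blockedOnHome = isBlocked(
  { viewer: { ip: "198.51.100.7" }, request: { uri: "/" } },
  sampleAllowed,
  sampleBypass
);
```

Note that the bypass is an exact match, so a path like `/api/stripe/webhook-other` is still subject to the IP check.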

Disabling the cache on OAuth routes

OAuth authorization codes are one-time-use. If CloudFront caches the response from /api/auth/callback, the second visitor (or retry) that hits the cache gets a stale response with an already-consumed code, and authentication breaks.

The fix is an ordered cache behavior that attaches Managed-CachingDisabled specifically to the /api/auth/* path pattern:

const cachingDisabledPolicy = await aws.cloudfront.getCachePolicy({
  name: "Managed-CachingDisabled",
});

const authCacheBehavior = {
  pathPattern: "/api/auth/*",
  viewerProtocolPolicy: "redirect-to-https",
  allowedMethods: ["GET", "HEAD", "OPTIONS", "PUT", "POST", "PATCH", "DELETE"],
  cachedMethods: ["GET", "HEAD"],
  cachePolicyId: cachingDisabledPolicy.id,
  compress: true,
};

SST exposes transform.cdn to reach the underlying Pulumi CloudFront resource. We prepend our behavior ahead of SST's defaults using $resolve to unwrap the Input<T> types:

transform: {
  cdn: (args) => {
    args.orderedCacheBehaviors = $resolve([
      args.orderedCacheBehaviors,
      args.defaultCacheBehavior,
    ]).apply(([existing, defaultBehavior]) => {
      const existingBehaviors = Array.isArray(existing) ? existing : [];
      return [
        {
          ...authCacheBehavior,
          targetOriginId: defaultBehavior.targetOriginId,
          originRequestPolicyId: defaultBehavior.originRequestPolicyId,
          functionAssociations: defaultBehavior.functionAssociations,
        },
        ...existingBehaviors,
      ];
    });
  },
},

The $resolve + .apply() pattern is the correct way to work with Pulumi's async Input types in SST. Reading args.defaultCacheBehavior directly gives you an Input<T> that may still be an unresolved Output, not the actual value.
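Prepending matters because CloudFront evaluates ordered cache behaviors in the order they are listed and uses the first path pattern that matches. The sketch below shows the same merge on plain objects, without Pulumi, so the ordering can be checked in a unit test. `Behavior`, `prependBehavior`, and `matchBehavior` are hypothetical names, and the matcher is a deliberately simplified stand-in that only understands a trailing `*`.

```typescript
// Trimmed-down behavior shape for illustration only.
interface Behavior {
  pathPattern: string;
  cachePolicyId: string;
  targetOriginId: string;
}

// Puts the custom behavior first and inherits the origin from the default.
function prependBehavior(
  existing: Behavior[] | undefined,
  defaults: { targetOriginId: string },
  custom: Omit<Behavior, "targetOriginId">
): Behavior[] {
  const rest = Array.isArray(existing) ? existing : [];
  return [{ ...custom, targetOriginId: defaults.targetOriginId }, ...rest];
}

// First match wins. Simplified: supports exact paths and a trailing "*".
function matchBehavior(behaviors: Behavior[], uri: string): Behavior | undefined {
  for (const b of behaviors) {
    const p = b.pathPattern;
    const isWildcard = p.charAt(p.length - 1) === "*";
    const matches = isWildcard ? uri.indexOf(p.slice(0, -1)) === 0 : uri === p;
    if (matches) return b;
  }
  return undefined;
}
```

If the auth behavior were appended instead, a broader existing pattern such as `/api/*` would match first and the callback would keep its cached policy.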

Lambda configuration

server: {
  memory: "1024 MB",
  runtime: "nodejs22.x",
  architecture: "arm64",
  timeout: "20 seconds",
},
imageOptimization: {
  memory: "1536 MB",
},

arm64 (AWS Graviton) is meaningfully cheaper — roughly 20% less per GB-second than x86 — with equivalent or better performance for Node.js workloads. There's no reason not to use it for new deployments.
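To put the 20% figure in concrete terms, here is a rough compute-cost calculation. The per-GB-second rates are assumptions based on list prices for one region and may be out of date, so check the current AWS pricing page before relying on them. Request charges and the free tier are ignored, and `computeCost` is a hypothetical helper.

```typescript
type Arch = "x86_64" | "arm64";

// ASSUMED example rates in USD per GB-second; verify against current AWS pricing.
const ASSUMED_PRICE_PER_GB_SECOND: Record<Arch, number> = {
  x86_64: 0.0000166667,
  arm64: 0.0000133334,
};

// Duration-based Lambda cost: memory (GB) x duration (s) x invocations x rate.
function computeCost(
  arch: Arch,
  memoryMb: number,
  avgDurationMs: number,
  invocations: number
): number {
  const gbSeconds = (memoryMb / 1024) * (avgDurationMs / 1000) * invocations;
  return gbSeconds * ASSUMED_PRICE_PER_GB_SECOND[arch];
}

// One million 1-second invocations at 1024 MB on each architecture.
const x86Cost = computeCost("x86_64", 1024, 1000, 1_000_000);
const armCost = computeCost("arm64", 1024, 1000, 1_000_000);
const armToX86Ratio = armCost / x86Cost; // about 0.8 under the assumed rates
```

Under these assumed rates the same workload costs about 16.67 USD on x86 and about 13.33 USD on arm64, before request charges.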

Image optimization gets more memory than the server function because Next.js's <Image> resize pipeline is memory-hungry. We found 1536 MB eliminates the occasional OOM on large uploaded images; the server function runs fine at 1024 MB.

nodejs22.x is an LTS runtime, the current one at the time of writing. OpenNext keeps up with Node.js releases, so staying on the latest LTS gets you security patches without breaking changes.

Cron jobs

SST's sst.aws.Cron maps directly to an EventBridge schedule that invokes a Lambda function. All our background jobs live in the same sst.config.ts, next to the web deployment:

| Job | Schedule | Purpose |
| --- | --- | --- |
| PerformanceAggregator | Daily at 02:00 UTC | Aggregate performance metrics |
| ViewAggregator | Every 4 hours at :00 | Count diary views |
| LikeAggregator | Every 4 hours at :30 | Count diary likes |
| TagManager | Every hour | Extract and sync tags from diaries |
| DiaryReminderNotifier | Every 15 minutes | Send diary reminder push notifications |
| WeeklySummaryEmailSender | Sundays at 12:00 UTC | Send weekly summary emails |
| MilestoneEmailSender | Every 4 hours | Detect milestones and send emails |

The 30-minute offset between ViewAggregator and LikeAggregator is deliberate:

// Views at :00
new sst.aws.Cron("ViewAggregator", {
  schedule: "cron(0 */4 * * ? *)",
  // ...
});

// Likes at :30 — staggered to avoid simultaneous DB load
new sst.aws.Cron("LikeAggregator", {
  schedule: "cron(30 */4 * * ? *)",
  // ...
});

Both jobs hit the same database tables. Running them simultaneously would double the instantaneous query load. Offsetting by 30 minutes costs nothing and keeps the DB load smooth.
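If more jobs end up sharing the same interval, the offsets can be computed rather than hand-picked. The sketch below spreads minute offsets evenly across the hour and emits EventBridge-style cron expressions in the same format as above. `staggeredSchedules` is a hypothetical helper, not part of SST.

```typescript
interface JobSchedule {
  name: string;
  schedule: string;
}

// Spreads jobs evenly across the hour: 2 jobs -> :00 and :30, 3 jobs -> :00, :20, :40.
function staggeredSchedules(jobNames: string[], everyHours: number): JobSchedule[] {
  return jobNames.map(function (jobName, index) {
    const minute = Math.floor((60 * index) / jobNames.length);
    return {
      name: jobName,
      schedule: "cron(" + minute + " */" + everyHours + " * * ? *)",
    };
  });
}

// Reproduces the two schedules used above.
const aggregatorSchedules = staggeredSchedules(["ViewAggregator", "LikeAggregator"], 4);
// aggregatorSchedules[0].schedule === "cron(0 */4 * * ? *)"
// aggregatorSchedules[1].schedule === "cron(30 */4 * * ? *)"
```

Each entry could then be passed to its own `sst.aws.Cron` declaration, so adding a third job on the same interval rebalances the offsets automatically.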

Environment variables: one source of truth

SST passes environment variables to Lambda via the environment property. The principle we follow: stage-specific values live in infra code, not in .env files.

environment: {
  NEXT_PUBLIC_BASE_URL:
    stage === "production"
      ? "https://storyie.com"
      : "https://staging.storyie.com",
  // ... other vars
}

Even if .env.production contains NEXT_PUBLIC_BASE_URL=https://storyie.com, the sst.config.ts value overrides it at deploy time. This matters because .env files and infra code can drift independently. If the URL is defined in both places, one of them will eventually be wrong. Centralizing stage-specific values in sst.config.ts makes it the authoritative source.
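The precedence described here is the same as an object spread where the infra values come last. The sketch below models that, and adds a small drift check that could run in CI to flag keys defined in both places with different values. Both helpers are hypothetical illustrations, not SST APIs.

```typescript
type EnvMap = Record<string, string>;

// Later spreads win, so values from infra code override .env values.
function resolveEnv(fromEnvFile: EnvMap, fromInfra: EnvMap): EnvMap {
  return { ...fromEnvFile, ...fromInfra };
}

// Lists keys that exist in both sources with different values.
function findConflicts(fromEnvFile: EnvMap, fromInfra: EnvMap): string[] {
  return Object.keys(fromInfra).filter(function (key) {
    return key in fromEnvFile && fromEnvFile[key] !== fromInfra[key];
  });
}

// A staging deploy where the .env file still holds the production URL.
const dotEnvValues: EnvMap = {
  NEXT_PUBLIC_BASE_URL: "https://storyie.com",
  FEATURE_FLAG: "on",
};
const infraValues: EnvMap = {
  NEXT_PUBLIC_BASE_URL: "https://staging.storyie.com",
};
const effectiveEnv = resolveEnv(dotEnvValues, infraValues);
const driftedKeys = findConflicts(dotEnvValues, infraValues);
```

Here `effectiveEnv.NEXT_PUBLIC_BASE_URL` is the staging URL from infra code, and `driftedKeys` contains `NEXT_PUBLIC_BASE_URL`, which is the signal that the `.env` entry should be removed.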

Cache headers

invalidation: {
  paths: "all",
  wait: false,
},
assets: {
  nonVersionedFilesCacheHeader:
    "public,max-age=0,s-maxage=86400,stale-while-revalidate=8640",
  versionedFilesCacheHeader:
    "public,max-age=31536000,immutable",
},

Versioned files (everything under _next/static/) get a one-year browser cache plus immutable. The content hash in the filename guarantees these files never change between builds, so there's no reason to revalidate.

Non-versioned files get no browser cache (max-age=0) but a one-day CDN cache with stale-while-revalidate. Browsers revalidate on every request, so visitors never rely on a stale local copy, but the CDN answers those checks itself and doesn't hammer the origin.

wait: false on the invalidation means deploy doesn't block waiting for CloudFront to flush all paths — it kicks off the invalidation and returns. The invalidation finishes in the background, typically within a minute.
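The two header strings are easier to keep consistent when they are built from named values instead of typed by hand. The sketch below derives the stale-while-revalidate window as a fraction of the CDN TTL, which is how 8640 relates to 86400 in the config above. `cacheControl` and `CacheSettings` are hypothetical names.

```typescript
interface CacheSettings {
  browserSeconds: number; // max-age
  cdnSeconds?: number; // s-maxage
  staleFraction?: number; // stale-while-revalidate as a fraction of cdnSeconds
  immutable?: boolean;
}

// Builds a Cache-Control value in the same comma-separated style as the config.
function cacheControl(settings: CacheSettings): string {
  const parts = ["public", "max-age=" + settings.browserSeconds];
  if (settings.cdnSeconds !== undefined) {
    parts.push("s-maxage=" + settings.cdnSeconds);
    if (settings.staleFraction !== undefined) {
      const staleSeconds = Math.round(settings.cdnSeconds * settings.staleFraction);
      parts.push("stale-while-revalidate=" + staleSeconds);
    }
  }
  if (settings.immutable) parts.push("immutable");
  return parts.join(",");
}

// Reproduces both headers from the config above.
const nonVersionedHeader = cacheControl({ browserSeconds: 0, cdnSeconds: 86400, staleFraction: 0.1 });
const versionedHeader = cacheControl({ browserSeconds: 31536000, immutable: true });
```

With this shape, changing the CDN TTL keeps the stale window at 10% automatically instead of leaving a stale hard-coded number behind.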

Compared to Vercel

To be direct: if your Next.js app doesn't have unusual infrastructure requirements, Vercel is easier. git push deploys, preview environments are automatic, and you don't manage any AWS infrastructure yourself.

We use SST because Storyie needs:

Fine-grained CloudFront control: IP restrictions, per-route cache policies, edge functions.
Cron jobs in the same codebase: Lambda-based scheduling without a separate service.
Multi-tenant wildcard domains: full control over *.storyie.com routing.
Cost: AWS charges per usage. Vercel Pro is per-seat regardless of scale.
No platform lock-in: AWS infrastructure generalizes; if SST itself is ever a problem, the underlying resources are standard AWS resources that SST defines through Pulumi.

The SST downsides are real:

Initial stack creation takes 10–15 minutes. First-time provisioning of resources like the CloudFront distribution is slow.
Deploys take 2–5 minutes vs. Vercel's ~30 seconds.
OpenNext compatibility lags. New Next.js features sometimes need an OpenNext release before they work correctly on Lambda.
Debugging requires CloudWatch. There's no Vercel-style function log UI — you go to CloudWatch Logs.

Takeaways

SST stages give you full environment isolation from a single config. Production and staging diverge only where they need to — domain, env file, removal policy — and share everything else.
CloudFront Functions are the right tool for edge IP restriction. Lightweight, cheap, and they run before Lambda. Write them in ES5 and remember to bypass external webhook paths.
OAuth routes must have caching disabled. Authorization codes are one-time-use; a cached response breaks auth.
Graviton (arm64) is a free cost reduction for Lambda-backed Next.js deployments. Use it by default.
Stagger cron jobs that share a database. Simultaneous batch queries compound unnecessarily.
Stage-specific env vars belong in sst.config.ts, not in .env files. Two sources of truth drift apart eventually.

Related Posts

Building a Monorepo with pnpm and TypeScript — workspace conventions and the package boundaries that feed into this deployment setup
Next.js 16 Deployment Deep Dive — Next.js-specific patterns that affect how OpenNext bundles the app
Cross-platform Lexical with use dom: monorepo gains and the bridges you still own — how the codebase that runs on this infrastructure is structured

Try Storyie

Storyie is live at storyie.com — the infrastructure described here is exactly what serves it. If you write a diary on the web and open it on the iOS app, you're seeing the same AWS deployment from two different entry points.
