Seeding a social diary app with AI bot users: design, scheduling, and lessons from production
When we launched Storyie, the first problem we ran into was not a technical one — it was an empty timeline. A user opens the app, sees nothing, and concludes there is nobody here. Classic cold-start.
The usual fix is seed data. We tried it. Static fixtures go stale in days, and any returning user can see the timestamp frozen in the past. We needed something that posted fresh content on a real schedule even while the real user base was small.
The answer was AI bot users: fictional people with distinct personas who write diary entries every day. This post covers how the system is designed, the decisions that shaped it, and what a few months of production use taught us.
TL;DR
- Each bot is a Markdown file: persona metadata in frontmatter, a prompt template in the body. Git manages the history; Zod validates the schema.
- A probability field per bot controls posting frequency. A single random draw per run gives the timeline natural variety without complex scheduling logic.
- The GitHub Actions matrix strategy parallelizes generation across however many bots are selected, with no changes to the workflow structure as the bot count grows.
- AI-generated Markdown goes through a custom Lexical transformer before being written to Supabase, so bot posts are structurally identical to user posts and every feature — hashtag search, rich-text display — works without special-casing.
| Concern | Approach |
| ----------------- | ---------------------------------------------------------------- |
| Bot definition | Markdown files with Zod-validated frontmatter, versioned in Git |
| Posting frequency | Per-bot probability field, uniform random draw each run |
| Parallelism | GitHub Actions matrix over selected bots |
| Storage format | Lexical SerializedEditorState via custom transformer |
| Auth / RLS | Service role key in Actions secrets only; never in client code |
Defining bots as Markdown files
The design constraint we cared most about was this: adding or editing a bot should not require touching application code. A Markdown file per bot, with frontmatter for metadata and a prompt template in the body, satisfies that.
```markdown
---
name: Akari
slug: akari
user_id: b29a62cd-...
language: ja
enabled: true
schedule:
  type: random
  probability: 0.25
max_length: 1800
bio: Food creator who describes texture and aroma in precise detail
---

## Prompt Template

You are Akari, a creator focused on Food & Culture.
Today's date is {current_date}.
Write a personal diary entry in {language}...
```

A few things fell out of this format naturally:
- Git diff history: Every prompt change is a commit. We can bisect a regression in output quality to a specific change to the template.
- Reviewable via PR: Persona adjustments and prompt experiments go through the same code review workflow as everything else.
- Zod validation at load time: The frontmatter schema is declared once; bad definitions are caught when the bot list is loaded, not mid-run (see the sketch after this list).
- Locale directories: `content/ja/`, `content/en/`, and so on keep the 100+ bots organized by language without any runtime routing logic.
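As a rough sketch of what that load-time validation can look like — field names mirror the example above, but the exact schema and helper names in our bots package differ in detail:

```ts
import { z } from "zod";

// Sketch of the frontmatter schema; the real one lives in the bots package.
const botScheduleSchema = z.object({
  type: z.literal("random"),
  probability: z.number().min(0).max(1),
});

export const botFrontmatterSchema = z.object({
  name: z.string().min(1),
  slug: z.string().min(1),
  user_id: z.string().uuid(),
  language: z.string().min(2), // locale code such as "ja" or "en"
  enabled: z.boolean(),
  schedule: botScheduleSchema,
  max_length: z.number().int().positive(),
  bio: z.string(),
});

export type BotFrontmatter = z.infer<typeof botFrontmatterSchema>;

// At load time, a malformed definition becomes a hard failure
// before any generation work starts.
export function parseFrontmatter(raw: unknown): BotFrontmatter {
  const result = botFrontmatterSchema.safeParse(raw);
  if (!result.success) {
    throw new Error(`Invalid bot definition: ${result.error.message}`);
  }
  return result.data;
}
```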
Probabilistic scheduling
Every run, a bot either posts or it does not — decided by a single random draw against a probability threshold.
```ts
export function shouldRunBot(schedule: BotSchedule): SchedulerDecision {
  const roll = Math.random();
  const shouldRun = roll < schedule.probability;
  return {
    shouldRun,
    reason: shouldRun
      ? `Random check passed (${roll.toFixed(3)} < ${schedule.probability})`
      : `Random check failed (${roll.toFixed(3)} >= ${schedule.probability})`,
    probability: schedule.probability,
  };
}
```

`probability: 0.25` means a bot posts roughly one day in four. The timeline gets a different mix of faces on different days, which is much closer to how real users behave than the mechanical every-bot-every-day pattern our first prototype used.
The cost of this approach is unpredictability: on any single day, some bots will be silent. That has been fine in practice because we run enough bots that the timeline stays populated even when several skip. If we ever need tighter control we can layer day-of-week weights on top of the probability draw, but we have not needed to yet.
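For context, this is roughly how the scheduler feeds the rest of the pipeline. The `loadBotDefinitions` helper here is a stand-in for whatever loads and validates the Markdown files, not the actual function name:

```ts
// Sketch: pick today's posters from the enabled bot definitions.
const bots = await loadBotDefinitions();

const selected = bots
  .filter((bot) => bot.enabled)
  .filter((bot) => shouldRunBot(bot.schedule).shouldRun);

// The get-bots job serializes this selection as the Actions matrix payload.
console.log(JSON.stringify({ include: selected.map((bot) => ({ slug: bot.slug })) }));
```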
Converting Markdown to Lexical
Storyie stores all diary content as Lexical's SerializedEditorState. We previously wrote about how the serialization round trip works — the same node set has to be registered on both sides, or unknown types are silently dropped on parse. That means bot posts have to go through the same conversion as anything a real user writes.
The interesting part was hashtag handling. A prompt might produce #food or #日常 as plain text. We need those to become HashtagNode entries so they show up in hashtag search. A custom TextMatchTransformer handles it:
```ts
import type { TextMatchTransformer } from "@lexical/markdown";
import { $createHashtagNode, HashtagNode } from "@lexical/hashtag";

const HASHTAG_TRANSFORMER: TextMatchTransformer = {
  type: "text-match",
  dependencies: [HashtagNode],
  // Matches #tags in Latin, hiragana, katakana, and CJK text.
  importRegExp: /#([a-zA-Zぁ-ゟ゠-ヿ一-鿿\w]+)/,
  replace: (textNode, match) => {
    const hashtagNode = $createHashtagNode(`#${match[1]}`);
    textNode.replace(hashtagNode);
  },
};
```

The Unicode ranges cover hiragana, katakana, and CJK characters, so Japanese-language bot posts get their hashtags converted correctly. After this step a bot entry and a user entry are identical at the storage layer — no special rendering path, no feature flags.
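To show how the transformer slots in, here is a sketch of a headless conversion step. The function name and registered node list are illustrative; the real node list has to match whatever the app registers:

```ts
import { createHeadlessEditor } from "@lexical/headless";
import { $convertFromMarkdownString, TRANSFORMERS } from "@lexical/markdown";
import { HashtagNode } from "@lexical/hashtag";
import { HeadingNode, QuoteNode } from "@lexical/rich-text";
import { ListItemNode, ListNode } from "@lexical/list";
import { LinkNode } from "@lexical/link";
import { CodeNode } from "@lexical/code";

// Sketch: turn the model's Markdown output into a SerializedEditorState.
// The node list must match what the app registers, or unknown types are
// dropped when the state is parsed on the client.
export function markdownToEditorState(markdown: string) {
  const editor = createHeadlessEditor({
    namespace: "bot-generation",
    nodes: [HeadingNode, QuoteNode, ListNode, ListItemNode, LinkNode, CodeNode, HashtagNode],
    onError: (error) => {
      throw error;
    },
  });

  // discrete: true applies the update synchronously so the state is ready to read.
  editor.update(
    () => {
      $convertFromMarkdownString(markdown, [...TRANSFORMERS, HASHTAG_TRANSFORMER]);
    },
    { discrete: true },
  );

  return editor.getEditorState().toJSON();
}
```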
GitHub Actions pipeline
The generation workflow has two jobs:
- get-bots: Loads all enabled bots, runs the probability scheduler, and outputs the selected bots as a matrix.
- generate: Runs in parallel over the matrix — one job per selected bot. Each job assembles the prompt, calls Claude, converts the output to Lexical, checks for a duplicate entry from that bot today, and writes to Supabase.
```yaml
strategy:
  matrix: ${{ fromJSON(needs.get-bots.outputs.matrix) }}
```

The matrix strategy means adding more bots increases parallelism automatically. The workflow structure itself never changes. We run this on a self-hosted runner, so the API cost for Claude is the only scaling cost as the bot count grows.
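Wired together, the two jobs look roughly like this. Job names, script paths, and secret names here are placeholders, not the exact workflow we run:

```yaml
# Sketch of the two-job layout; script paths and secret names are illustrative.
jobs:
  get-bots:
    runs-on: self-hosted
    outputs:
      matrix: ${{ steps.select.outputs.matrix }}
    steps:
      - uses: actions/checkout@v4
      - id: select
        # Runs the probability scheduler and emits {"include":[{"slug":"akari"}, ...]}
        run: echo "matrix=$(node scripts/select-bots.mjs)" >> "$GITHUB_OUTPUT"

  generate:
    needs: get-bots
    runs-on: self-hosted
    strategy:
      matrix: ${{ fromJSON(needs.get-bots.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v4
      - run: node scripts/generate-entry.mjs --bot "${{ matrix.slug }}"
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          SUPABASE_SERVICE_ROLE_KEY: ${{ secrets.SUPABASE_SERVICE_ROLE_KEY }}
```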
What production taught us
Prompt variation is the real work
The first version of our prompts produced bots that all wrote the same kind of entry regardless of their persona. The fix was to make each prompt specify what we call variation elements — the model must include at least two of: a specific sensory detail, a small failure and what it taught, a fragment of overheard dialogue, a small experiment and its outcome. That structural constraint improved output diversity dramatically compared to any amount of persona description alone.
Prompts are the thing worth iterating on. Git history for every prompt change turns out to be genuinely useful for understanding which edit caused output quality to improve or regress.
Fail-open on duplicate checks
If the duplicate-check query fails at runtime, we allow the post rather than blocking it. The reasoning: a bot appearing twice in one day is a minor oddity. An empty timeline because every bot silently errored out is a meaningful user-experience problem. We default to the failure mode that hurts users less.
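In code, the fail-open behavior is just a guarded query. The table and column names below, and the `supabase` client in scope, are assumptions for illustration:

```ts
// Sketch: allow posting if the duplicate check itself fails (fail-open).
async function alreadyPostedToday(botUserId: string): Promise<boolean> {
  const startOfDay = new Date();
  startOfDay.setUTCHours(0, 0, 0, 0);

  const { data, error } = await supabase
    .from("diary_entries")
    .select("id")
    .eq("user_id", botUserId)
    .gte("created_at", startOfDay.toISOString())
    .limit(1);

  if (error) {
    // A failed check should not silence the bot; log it and let the post through.
    console.warn("duplicate check failed, allowing post", error.message);
    return false;
  }
  return (data?.length ?? 0) > 0;
}
```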
Service role key discipline
Bots write to Supabase with the service role key so they can bypass RLS. That key lives only in GitHub Actions secrets and is never referenced from any code that ends up in the client bundle. Normal user operations use the anon key with RLS enforced. The two paths are separate and never intersect.
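As a sketch of how the two paths stay separate — the environment variable names are ours for illustration:

```ts
import { createClient } from "@supabase/supabase-js";

// Pipeline side (runs only in GitHub Actions): service role key, bypasses RLS.
// The key is injected from Actions secrets and never appears in client code.
export const botWriter = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_ROLE_KEY!,
);

// App side: anon key, with RLS enforced for every normal user operation.
export const appClient = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!,
);
```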
Related Posts
- Cross-platform Lexical with `use dom`: monorepo gains and the bridges you still own — the serialization architecture that makes bot posts and user posts structurally identical
- Building a Monorepo with pnpm and TypeScript — workspace conventions the bots package lives within
Try Storyie
The bots are live on storyie.com — the timeline you see when you first open the app is their work. If you write your own entry and post it publicly, it shows up alongside them. Available on the web and as an iOS app.