Why does Lexical not include an image node out of the box?

Lexical deliberately ships only headless-safe primitives: text, paragraph, heading, list, link, code. An image node requires a renderer, and renderers are platform-specific (a React DOM `<img>` on the web, a WebView-compatible DOM element on mobile). Rather than picking one and making it canonical, Lexical leaves image rendering to the application. The serialization contract is yours to design, and so is the rendering. This is the right call: you can build exactly what your platform needs instead of working around an opinionated default.

Why did you use DecoratorNode on web and ElementNode on mobile?

`DecoratorNode` lets `decorate()` return a React element, which plugs cleanly into the web editor's React tree. We get `useState`, `useCallback`, and the rest of the React model for free, so loading skeletons, progress bars, and error overlays all compose naturally. On mobile, Lexical runs inside a `"use dom"` WebView, which provides a DOM but does not run the React reconciler that drives `decorate()` in the main Lexical tree. So we fall back to `ElementNode`, which lets us build the image container imperatively via `createDOM()` and update it via `updateDOM()`: pure DOM manipulation, no framework dependency. It is more verbose than the React path, but it runs reliably in any WebView.

How do you keep the serialized JSON identical across platforms?

The `SerializedImageNode` type lives in `packages/lexical-common` and is imported by both the web `ImageNode` and the mobile `ImageNode`. Both sides implement `exportJSON()` to emit that shape and `importJSON()` to read it. Neither side can drift without a type error. The node class and renderer are allowed to diverge (and do); the wire format is not. A diary written on web opens correctly on mobile and vice versa because the JSON they read and write is the same contract, enforced by TypeScript.

What is the tempId pattern and why is it necessary?

When a user selects an image, we immediately insert an image node with a local URI and an "uploading" status — the preview appears before the upload finishes. But when the upload completes and we need to update that specific node with the CDN URL, we have a problem: Lexical node keys are assigned internally and are not predictable from outside the editor. `tempId` is a client-generated string (`temp_${Date.now()}_${randomSuffix}`) stored on the node at insert time. The upload function holds a reference to that `tempId` and, on completion, walks the editor state to find the matching node and dispatches `UPDATE_IMAGE_COMMAND` with the final URL. Without `tempId`, concurrent uploads would have no reliable way to route "done" events to the right node.

Why does the Markdown export parse JSON directly instead of using a headless Lexical editor?

A headless Lexical editor needs every custom node registered before it can parse editor state. `ImageNode` is platform-specific — the web version extends `DecoratorNode` and the mobile version extends `ElementNode`. Registering either in a server-side headless environment pulls in platform dependencies (React DOM on the web side, or DOM globals on the mobile side). Our Markdown export runs in Node without a DOM, so we skip the headless editor entirely and parse the serialized JSON directly. It is less elegant than the "proper" Lexical route, but it has no dependencies and works anywhere — including inside AI prompt construction pipelines that never touch the browser.

What does the upload lifecycle look like end-to-end?

On both platforms the flow is: (1) user selects an image; (2) `INSERT_IMAGE_COMMAND` fires immediately with the local URI and status "uploading" — the preview appears at once; (3) the upload runs in the background; (4) progress events fire `UPDATE_IMAGE_COMMAND` to increment the progress bar on the node; (5) on success, `UPDATE_IMAGE_COMMAND` replaces the local URI with the CDN URL and sets status to "completed"; (6) on failure, status becomes "failed" and an error overlay with a retry button appears. The `RETRY_IMAGE_COMMAND` re-triggers step 3 using the same `tempId`, so the node is updated in place. The key UX insight is step 2: showing the local preview immediately makes image insertion feel instant regardless of upload speed.

How we built a cross-platform ImageNode for Lexical — design decisions and trade-offs

Storyie uses Lexical for rich-text editing on both web (Next.js) and mobile (Expo). When we moved from text-only diary entries to photo-supported ones, we had to design a custom ImageNode from scratch. Lexical does not ship one.

This post covers the three decisions that shaped the design: why we split the node class by platform, why the serialization format is the one thing we never split, and how we modeled the upload lifecycle inside the editor state.

TL;DR

Lexical has no built-in image node. You build the whole thing.
Web uses DecoratorNode (returns React JSX from decorate()). Mobile uses ElementNode (DOM manipulation via createDOM() / updateDOM()). The split is forced by the WebView runtime on mobile.
The JSON serialization shape is defined once in lexical-common and imported by both platforms. Neither side can drift without a type error.
Upload state lives on the node: status (uploading / completed / failed) and uploadProgress. A tempId field lets the upload callback find the right node to update.
Markdown export skips the headless Lexical editor entirely and parses JSON directly — no platform imports, no DOM globals required.

| Layer | Package | What it contains |
| --- | --- | --- |
| Shared types | `packages/lexical-common` | `SerializedImageNode`, commands, Markdown conversion |
| Web node | `packages/lexical-editor` | `ImageNode` extends `DecoratorNode`, `ImageRenderer` component |
| Mobile node | `apps/expo/…/dom/nodes/ImageNode.dom.ts` | `ImageNode` extends `ElementNode`, DOM-based rendering |

Why Lexical does not have an image node

Lexical's built-in node set is intentionally headless-safe: text, paragraph, heading, list, link, code block. These nodes can render to a DOM or serialize to JSON without pulling in any rendering framework.

An image node cannot be headless in the same sense — you have to decide how to render it, and the right answer is different on every platform. Lexical leaves the decision to you, which is the correct call.

There are two base classes to choose from:

DecoratorNode — lets decorate() return a React element (or any framework component) that Lexical inserts into its rendering tree.
ElementNode — renders by constructing DOM nodes directly in createDOM() and updating them in updateDOM().

Which one fits depends on what your runtime gives you.

Web: DecoratorNode

On the web, the Lexical editor runs inside a React tree, so DecoratorNode is the natural fit. decorate() returns a React element; Lexical mounts it into the editor. We get the full React model — hooks, state, context — for the image renderer.

class ImageNode extends DecoratorNode<React.ReactElement> {
  decorate(): React.ReactElement {
    return React.createElement(ImageRenderer, {
      src: this.__src,
      uploadStatus: this.__uploadStatus,
      uploadProgress: this.__uploadProgress,
      nodeKey: this.__key,
    });
  }
}

The ImageRenderer component handles everything visual: a skeleton placeholder while the image loads (using aspect-ratio so the layout does not jump), reduced opacity while uploading, and a red error badge on failure.

<img
  style={{
    aspectRatio: width && height ? `${width / height}` : "auto",
    opacity: uploadStatus === "uploading" ? 0.6 : 1,
    maxHeight: "600px",
    objectFit: "contain",
  }}
/>

The maxHeight: 600px cap is a diary-specific UX call — we do not want a portrait photo from a phone filling the entire viewport. objectFit: "contain" preserves the aspect ratio within that cap.
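The style rules above are easy to pull out into a pure function so they can be unit-tested without rendering anything. The following is a minimal sketch under that assumption; `imageStyle`, `ImageStyleInput`, and `ImageStyle` are hypothetical names for illustration, not Storyie's actual code.

```typescript
type ImageUploadStatus = "uploading" | "completed" | "failed";

// Hypothetical input shape: only the fields the style depends on.
interface ImageStyleInput {
  width?: number;
  height?: number;
  uploadStatus?: ImageUploadStatus;
}

interface ImageStyle {
  aspectRatio: string;
  opacity: number;
  maxHeight: string;
  objectFit: "contain";
}

// Derives the inline style for the <img> element.
function imageStyle({ width, height, uploadStatus }: ImageStyleInput): ImageStyle {
  // Reserve layout space only when both dimensions are known and positive.
  const hasSize =
    width !== undefined && height !== undefined && width > 0 && height > 0;
  return {
    aspectRatio: hasSize ? `${width} / ${height}` : "auto",
    // Dim the local preview while the upload is still running.
    opacity: uploadStatus === "uploading" ? 0.6 : 1,
    maxHeight: "600px",
    objectFit: "contain",
  };
}
```

The renderer would then spread the result into the `style` prop. Falling back to `"auto"` when a dimension is missing avoids emitting an invalid ratio such as `NaN`.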

Mobile: ElementNode

Mobile is where it gets more constrained. Storyie's Expo editor runs Lexical inside a "use dom" WebView. The WebView provides a DOM, but it does not run the same React reconciler that drives DecoratorNode.decorate() in the main Lexical tree.

The solution is ElementNode. Instead of returning React JSX, we build the image container imperatively:

class ImageNode extends ElementNode {
  createDOM(_config: EditorConfig): HTMLElement {
    const container = document.createElement("div");
    const img = document.createElement("img");
    img.src = this.__src;
    // apply styles
    container.appendChild(img);

    if (this.__uploadStatus === "uploading") {
      container.appendChild(this.createProgressBar());
    }
    return container;
  }
}

Progress bars, error overlays, and retry buttons are all built by hand as DOM nodes. It is more verbose than the React path and updateDOM() requires manual diffing to avoid unnecessary DOM mutations — but it runs reliably in any WebView environment.
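One way to keep the manual diffing in `updateDOM()` manageable is to compute a list of patches from the previous and next node state as a pure function, and apply them to the DOM in a separate step. The sketch below illustrates that idea only; `ImageViewState`, `DomPatch`, and `diffImageState` are hypothetical names, and the real mobile node may be organized differently.

```typescript
type ImageUploadStatus = "uploading" | "completed" | "failed";

// Hypothetical snapshot of the fields that affect rendering.
interface ImageViewState {
  src: string;
  uploadStatus: ImageUploadStatus;
  uploadProgress: number; // 0..1
}

type DomPatch =
  | { kind: "setSrc"; src: string }
  | { kind: "setProgress"; percent: number }
  | { kind: "showProgressBar" }
  | { kind: "hideProgressBar" }
  | { kind: "showError" }
  | { kind: "hideError" };

// Returns only the mutations needed to move the DOM from prev to next.
function diffImageState(prev: ImageViewState, next: ImageViewState): DomPatch[] {
  const patches: DomPatch[] = [];

  if (prev.src !== next.src) {
    patches.push({ kind: "setSrc", src: next.src });
  }

  if (prev.uploadStatus !== next.uploadStatus) {
    // Tear down the UI of the old status, then build the UI of the new one.
    if (prev.uploadStatus === "uploading") patches.push({ kind: "hideProgressBar" });
    if (prev.uploadStatus === "failed") patches.push({ kind: "hideError" });
    if (next.uploadStatus === "uploading") patches.push({ kind: "showProgressBar" });
    if (next.uploadStatus === "failed") patches.push({ kind: "showError" });
  }

  // Refresh the bar when progress changed or the bar was just (re)created.
  const barIsNew = prev.uploadStatus !== "uploading";
  if (
    next.uploadStatus === "uploading" &&
    (barIsNew || prev.uploadProgress !== next.uploadProgress)
  ) {
    patches.push({ kind: "setProgress", percent: Math.round(next.uploadProgress * 100) });
  }

  return patches;
}
```

Inside `updateDOM()`, the node would apply each patch to the existing container and return `false`, which tells Lexical to keep the element rather than recreate it through `createDOM()`.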

Shared serialization

The node classes differ. The JSON format does not.

// packages/lexical-common/src/types.ts
interface SerializedImageNode {
  type: "image";
  version: 1;
  src: string;           // local URI while uploading, CDN URL after
  alt?: string;
  width?: number;
  height?: number;
  caption?: string;
  uploadStatus?: ImageUploadStatus;
  uploadProgress?: number;
  tempId?: string;       // upload correlation ID
}

Both the web ImageNode and the mobile ImageNode import this type and implement exportJSON() / importJSON() against it. TypeScript enforces that they stay in sync. A diary serialized on web opens correctly on mobile, and vice versa, because the wire format is a shared contract — not a convention.

The package layout that makes this possible:

packages/
  lexical-common/        # platform-agnostic
    src/types.ts         # SerializedImageNode, ImageInsertPayload, ImageUpdatePayload
    src/commands/        # INSERT_IMAGE_COMMAND, UPDATE_IMAGE_COMMAND, DELETE_IMAGE_COMMAND, RETRY_IMAGE_COMMAND
    src/markdown.ts      # Lexical JSON → Markdown (image-aware)
  lexical-editor/        # web only
    src/nodes/ImageNode.ts
    src/components/ImageRenderer.web.tsx
    src/plugins/ImagePlugin.tsx
apps/
  expo/
    components/lexical/dom/editor/
      nodes/ImageNode.dom.ts
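
The shared type catches drift at compile time, but stored JSON arrives at runtime from a database or a sync layer. A runtime guard next to the type is one way to cover that side as well. This is a sketch under that assumption: `isSerializedImageNode` is a hypothetical helper, and the `ImageUploadStatus` union is inferred from the three statuses this post describes.

```typescript
type ImageUploadStatus = "uploading" | "completed" | "failed";

interface SerializedImageNode {
  type: "image";
  version: 1;
  src: string;
  alt?: string;
  width?: number;
  height?: number;
  caption?: string;
  uploadStatus?: ImageUploadStatus;
  uploadProgress?: number;
  tempId?: string;
}

const UPLOAD_STATUSES: string[] = ["uploading", "completed", "failed"];

// Hypothetical runtime check mirroring the compile-time contract.
function isSerializedImageNode(value: unknown): value is SerializedImageNode {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;

  // Optional fields may be absent, but must have the right type if present.
  const optional = (key: string, type: "string" | "number"): boolean =>
    v[key] === undefined || typeof v[key] === type;

  return (
    v.type === "image" &&
    v.version === 1 &&
    typeof v.src === "string" &&
    optional("alt", "string") &&
    optional("width", "number") &&
    optional("height", "number") &&
    optional("caption", "string") &&
    optional("uploadProgress", "number") &&
    optional("tempId", "string") &&
    (v.uploadStatus === undefined ||
      (typeof v.uploadStatus === "string" && UPLOAD_STATUSES.includes(v.uploadStatus)))
  );
}
```

Either platform's `importJSON()` could call such a guard before constructing a node, so a malformed payload fails loudly instead of rendering a broken image.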

Commands as the abstraction layer

Image operations are dispatched through Lexical's command system, defined in lexical-common:

export const INSERT_IMAGE_COMMAND = createCommand("INSERT_IMAGE");
export const UPDATE_IMAGE_COMMAND = createCommand("UPDATE_IMAGE");
export const DELETE_IMAGE_COMMAND = createCommand("DELETE_IMAGE");
export const RETRY_IMAGE_COMMAND = createCommand("RETRY_IMAGE");

Any code that inserts an image — a toolbar button, a drag-and-drop handler, the upload callback — calls editor.dispatchCommand(INSERT_IMAGE_COMMAND, payload) without knowing anything about the node implementation. The node's command handler is registered at editor setup time and is platform-specific. The dispatch site is not.

Upload lifecycle

The image node carries its upload state directly. The lifecycle looks like this:

User selects image
  → INSERT_IMAGE_COMMAND (src: file://, status: "uploading", tempId)
  → local preview appears immediately + progress bar renders
  → upload runs in background
  → UPDATE_IMAGE_COMMAND (uploadProgress: 0.6)
  → UPDATE_IMAGE_COMMAND (src: https://…, status: "completed")
  → final image renders

Upload fails
  → UPDATE_IMAGE_COMMAND (status: "failed")
  → error overlay + retry button appear
  → RETRY_IMAGE_COMMAND → back to "uploading" state

The key UX decision is showing the local preview immediately at insert time. The user sees their image in the editor the moment they pick it, regardless of network conditions. The upload happens behind that preview.
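
The lifecycle above is a small state machine, and writing it as a reducer makes the allowed transitions explicit. The sketch below is illustrative only: `reduceUpload`, `ImageUploadState`, and the `UploadEvent` names are assumptions, not the actual command handlers.

```typescript
type ImageUploadStatus = "uploading" | "completed" | "failed";

interface ImageUploadState {
  src: string;
  uploadStatus: ImageUploadStatus;
  uploadProgress: number; // 0..1
}

// Hypothetical events, loosely mirroring UPDATE_IMAGE_COMMAND and RETRY_IMAGE_COMMAND.
type UploadEvent =
  | { type: "progress"; uploadProgress: number }
  | { type: "success"; src: string }
  | { type: "failure" }
  | { type: "retry" };

function reduceUpload(state: ImageUploadState, event: UploadEvent): ImageUploadState {
  switch (event.type) {
    case "progress":
      // Ignore stale progress events that arrive after the upload settled.
      if (state.uploadStatus !== "uploading") return state;
      return {
        ...state,
        // Progress never moves backwards and never exceeds 1.
        uploadProgress: Math.min(1, Math.max(state.uploadProgress, event.uploadProgress)),
      };
    case "success":
      if (state.uploadStatus !== "uploading") return state;
      // Swap the local URI for the final URL.
      return { src: event.src, uploadStatus: "completed", uploadProgress: 1 };
    case "failure":
      if (state.uploadStatus !== "uploading") return state;
      return { ...state, uploadStatus: "failed" };
    case "retry":
      // Only a failed upload can be retried; the local preview src is kept.
      if (state.uploadStatus !== "failed") return state;
      return { ...state, uploadStatus: "uploading", uploadProgress: 0 };
  }
}
```

Guarding every transition on the current status means a late progress event cannot resurrect a progress bar on a node that already completed or failed.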

The tempId pattern

Lexical assigns node keys internally — you cannot specify one from outside. That creates a problem for the upload callback: when the upload finishes, how does it know which node to update?

tempId is the answer. At insert time, the caller generates a correlation ID:

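import type { LexicalEditor } from "lexical";
// INSERT_IMAGE_COMMAND and UPDATE_IMAGE_COMMAND come from lexical-common.

async function insertImageWithUpload(
  editor: LexicalEditor,
  asset: { uri: string },
  uploadFn: (asset: { uri: string }) => Promise<string>,
): Promise<void> {
  // Correlation ID: timestamp plus a short random base-36 suffix.
  const tempId = `temp_${Date.now()}_${Math.random().toString(36).slice(2, 9)}`;

  // Insert right away so the local preview appears before the upload finishes.
  editor.dispatchCommand(INSERT_IMAGE_COMMAND, {
    src: asset.uri,
    uploadStatus: "uploading",
    tempId,
  });

  try {
    const finalUrl = await uploadFn(asset);
    // The node where __tempId === tempId is located by walking the editor state;
    // the payload carries tempId for that lookup.
    editor.dispatchCommand(UPDATE_IMAGE_COMMAND, { tempId, src: finalUrl, status: "completed" });
  } catch {
    // Same lookup by tempId; the node switches to its failed state.
    editor.dispatchCommand(UPDATE_IMAGE_COMMAND, { tempId, status: "failed" });
  }
}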

The tempId travels with the node and is serialized to JSON (useful if the user saves mid-upload and reopens). The upload function holds a closure over tempId and uses it to route the result to the correct node. Without this, concurrent uploads — common when pasting multiple images — have no reliable way to update the right node.
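
The lookup by `tempId` can be illustrated on the serialized tree, without a live editor. The sketch below is a simplified stand-in: in a running editor the same walk would happen over live nodes inside an update, and `SerializedNode`, `ImagePatch`, and `updateByTempId` are hypothetical names.

```typescript
type ImageUploadStatus = "uploading" | "completed" | "failed";

// Simplified node shape: just enough structure to walk the tree.
interface SerializedNode {
  type: string;
  children?: SerializedNode[];
  // Image-specific fields, present only when type === "image".
  src?: string;
  uploadStatus?: ImageUploadStatus;
  uploadProgress?: number;
  tempId?: string;
}

type ImagePatch = Pick<SerializedNode, "src" | "uploadStatus" | "uploadProgress">;

// Returns a copy of the tree with the patch applied to the matching image node.
function updateByTempId(
  node: SerializedNode,
  tempId: string,
  patch: ImagePatch,
): SerializedNode {
  if (node.type === "image" && node.tempId === tempId) {
    return { ...node, ...patch };
  }
  if (!node.children) return node;
  return {
    ...node,
    children: node.children.map((child) => updateByTempId(child, tempId, patch)),
  };
}

// Two uploads in flight; only the one tagged "temp_b" completes.
const exampleRoot: SerializedNode = {
  type: "root",
  children: [
    { type: "image", src: "file:///a.jpg", uploadStatus: "uploading", tempId: "temp_a" },
    { type: "paragraph", children: [{ type: "text" }] },
    { type: "image", src: "file:///b.jpg", uploadStatus: "uploading", tempId: "temp_b" },
  ],
};

const exampleUpdated = updateByTempId(exampleRoot, "temp_b", {
  src: "https://cdn.example.com/b.jpg",
  uploadStatus: "completed",
});
```

After the call, the first image in `exampleUpdated` is still uploading from its local URI, while the third child carries the CDN URL. That is the routing guarantee concurrent uploads depend on.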

importDOM / exportDOM

Both ImageNode implementations include importDOM and exportDOM. These handle clipboard interop: copying from a web page and pasting into the editor, or copying from the editor and pasting into another app as HTML.

static importDOM(): DOMConversionMap | null {
  return {
    img: (_node: Node) => ({
      conversion: convertImageElement,
      priority: 0,
    }),
  };
}

When Lexical encounters an <img> element during a paste, convertImageElement constructs an ImageNode from it. That is the entirety of "paste images from a web page" — the DOM conversion handles the rest.
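
The attribute-reading half of a conversion function like `convertImageElement` could look roughly like the following. This is a sketch, not the actual implementation: `readImageAttributes`, `ImgLike`, and `PastedImage` are hypothetical, and the step that wraps the result in an `ImageNode` is omitted because it depends on the platform's node class.

```typescript
// Structural stand-in for HTMLImageElement so the logic runs without a DOM.
interface ImgLike {
  getAttribute(name: string): string | null;
}

// Hypothetical payload extracted from a pasted <img>.
interface PastedImage {
  src: string;
  alt?: string;
  width?: number;
  height?: number;
}

// Parses a width/height attribute; rejects missing, non-numeric, and non-positive values.
function parsePositiveInt(raw: string | null): number | undefined {
  if (raw === null) return undefined;
  const n = parseInt(raw, 10);
  return n > 0 ? n : undefined; // NaN > 0 is false, so NaN is rejected too
}

// Returns null when the element has no usable src, so the paste can skip it.
function readImageAttributes(img: ImgLike): PastedImage | null {
  const src = img.getAttribute("src");
  if (!src) return null;

  const result: PastedImage = { src };
  const alt = img.getAttribute("alt");
  if (alt) result.alt = alt;

  const width = parsePositiveInt(img.getAttribute("width"));
  const height = parsePositiveInt(img.getAttribute("height"));
  if (width !== undefined) result.width = width;
  if (height !== undefined) result.height = height;
  return result;
}
```

Because `ImgLike` only requires `getAttribute`, a real `HTMLImageElement` satisfies it in the browser, and a plain object can stand in for it in tests.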

Markdown export without the headless editor

Storyie can export diary entries as Markdown. The conversion from Lexical JSON to Markdown runs server-side (and sometimes inside AI pipelines), so it has to work without a DOM.

The naive approach would be to spin up a headless Lexical editor, register all nodes, and call $convertToMarkdownString from @lexical/markdown. The problem: ImageNode is platform-specific. Registering the web version server-side pulls in React DOM. Registering neither leaves the headless editor unable to handle image nodes, because their type is unknown to it.

We sidestep this by parsing the serialized JSON directly rather than going through a headless editor:

// Receives the raw serialized JSON object, not a live Lexical node.
function convertImage(node: SerializedImageNode): string {
  const alt = node.alt ?? "";
  const src = node.src ?? "";
  return `![${alt}](${src})`;
}

No node registration, no DOM globals, no framework imports. The conversion function receives the raw JSON object and returns a Markdown string. It works in Node, in a serverless function, in a Bun script — anywhere.
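
To show how the image case fits into a full document walk, here is a minimal recursive converter over the serialized JSON. It is a sketch covering only root, heading, paragraph, text, and image nodes. `JsonNode` and `toMarkdown` are hypothetical names, and the real `markdown.ts` in lexical-common presumably handles more node types and text formatting.

```typescript
// Loose shape for serialized nodes; only the fields this sketch reads.
interface JsonNode {
  type: string;
  children?: JsonNode[];
  text?: string; // text nodes
  tag?: string;  // heading nodes: "h1".."h6"
  src?: string;  // image nodes
  alt?: string;  // image nodes
}

function toMarkdown(node: JsonNode): string {
  // Inline content: concatenate the children without separators.
  const inline = (): string => (node.children ?? []).map(toMarkdown).join("");

  switch (node.type) {
    case "root":
      // Block-level children are separated by a blank line.
      return (node.children ?? [])
        .map(toMarkdown)
        .filter((block) => block !== "")
        .join("\n\n");
    case "heading": {
      // "h2" -> 2; fall back to level 1 if the tag is missing or malformed.
      const level = Number((node.tag ?? "h1").slice(1)) || 1;
      return `${"#".repeat(level)} ${inline()}`;
    }
    case "paragraph":
      return inline();
    case "text":
      return node.text ?? "";
    case "image":
      return `![${node.alt ?? ""}](${node.src ?? ""})`;
    default:
      // Unknown node types degrade to their text content instead of failing.
      return inline();
  }
}

const exampleDoc: JsonNode = {
  type: "root",
  children: [
    { type: "heading", tag: "h2", children: [{ type: "text", text: "Trip" }] },
    { type: "paragraph", children: [{ type: "text", text: "Day one" }] },
    { type: "image", alt: "Beach", src: "https://cdn.example.com/beach.jpg" },
  ],
};

const exampleMarkdown = toMarkdown(exampleDoc);
// exampleMarkdown === "## Trip\n\nDay one\n\n![Beach](https://cdn.example.com/beach.jpg)"
```

The `default` branch is a deliberate choice in this sketch: a node type the converter does not know about still contributes its text rather than aborting the export.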

What we would do differently

What worked well

Locking the serialization format first. The web and mobile implementations can evolve independently as long as they read and write the same JSON. We defined SerializedImageNode in lexical-common before writing a single line of node code. Getting this wrong after the fact would have required a migration.

Commands as the insert/update API. Callers dispatch a command and do not care how the node handles it. This kept the upload logic, the toolbar, and the node implementation decoupled. Swapping the node implementation on one platform (we changed the mobile renderer once) required zero changes to the upload code.

tempId. It is a minor addition to the serialized shape, but it is load-bearing. Without it, concurrent uploads become non-trivially hard to route correctly.

What was harder than expected

DOM-based UI in the mobile ElementNode. Implementing a progress bar, an error overlay, and a retry button as raw DOM operations is tedious. Writing updateDOM() to diff between the old and new node state — without a virtual DOM — is especially error-prone. The React path on web is significantly more comfortable.

ImageNode staying out of lexicalCommonNodes. Because ImageNode is platform-specific, it is not in the shared node list exported from lexical-common. That means every place that needs to handle images — Markdown export, the read-only viewer, the AI prompt builder — has to account for image nodes as a special case rather than treating them as first-class registered nodes. The JSON-parse approach handles this, but it is an extra convention to maintain.

Takeaways

The core design principle: split what has to be split, share what must not drift.

Types, commands, serialization format → lexical-common. One source of truth, imported by both platforms.
Node class and renderer → per platform. DecoratorNode + React on web; ElementNode + DOM on mobile.
Markdown export → JSON directly. Headless Lexical would require node registration that drags in platform dependencies.

Cross-platform rich-text with images is tractable when you are clear about which layer each concern belongs to. The wire format is the one thing that cannot diverge. Everything else is implementation detail.

Related Posts

Cross-platform Lexical with use dom: monorepo gains and the bridges you still own — how we structure the full Lexical monorepo, including the "use dom" bridge design for the image upload flow
Building a Monorepo with pnpm and TypeScript — workspace conventions and cross-package dependency rules
Building a Cross-Platform Mobile App with Expo — Expo DOM Components and the broader mobile editor context

Try Storyie

If you want to see the result from the user side: write a diary with photos on the web at storyie.com and open it on the iOS app. Same content, same images, same formatting. The platform split is invisible from the outside, which is exactly what it should be.

Related posts

One color system for web, mobile, and email: how we built @storyie/theme

Image uploads with Cloudflare R2 and presigned URLs: one architecture, two platforms

What to share across platforms (and what to keep separate): UI component design in a Next.js + Expo monorepo