
Streaming JSON Patching Architecture
Surgical JSON patching with real-time streaming for AI-powered document editing.
This article explains the streaming JSON patching pipeline used by the resume chatbot project. It focuses on how the system converts freeform editing intents into surgical RFC 6902 patches, how those patches stream back to the client token by token, and how the UI stabilizes partially-emitted paths before applying changes.
Why this matters, at a glance
- Sending full regenerated documents for every edit is expensive and noisy. It forces the client to reconcile entire blobs, triggers wide UI re-renders, and makes fine-grained undo difficult.
- A structured, typed document model lets the AI reason about locations explicitly. Instead of asking the model to “update the skills section”, we give it paths like /skills/2/keywords/0 and a schema to validate against.
- Streaming patches let the UI show edits as they are generated, provide immediate feedback, and reduce bandwidth by sending only the minimal edit operations.
Why structured data
Traditional AI-first editors often treat documents as plain text or markdown. That works for single, monolithic content, but it breaks down when content has typed sections, arrays, and nested objects.
JSON Resume style schemas give us explicit types and paths. The model can add a skill with a single RFC 6902 “add” operation, or replace a company description with “replace”. The client receives a sequence of precise operations instead of guessing boundaries.
Example: structured resume snippet
```json
{
  "basics": {
    "name": "Alex Example",
    "summary": "Full-stack developer focused on small teams and high quality code."
  },
  "skills": [
    { "name": "JavaScript", "keywords": ["ESNext", "Node.js", "React"] }
  ]
}
```

When the model needs to add a keyword it can emit a patch with path /skills/0/keywords/- and op add. That path is unambiguous. The client can apply it immediately using fast-json-patch and re-render only the affected component.
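To make the precision of such a path concrete, here is a minimal hand-rolled sketch. The project itself uses fast-json-patch; this applyAdd function is invented for illustration and covers only the "add" op with just enough RFC 6901 handling for the example.

```typescript
// Minimal hand-rolled illustration of why an RFC 6902 path is unambiguous.
// Illustration only: production code uses fast-json-patch, not this sketch.
type AddOp = { op: "add"; path: string; value: unknown };

function applyAdd(doc: any, patch: AddOp): void {
  // Split the pointer into segments, unescaping ~1 ("/") and ~0 ("~").
  const segments = patch.path
    .split("/")
    .slice(1)
    .map((s) => s.replace(/~1/g, "/").replace(/~0/g, "~"));
  const last = segments.pop()!;
  const parent = segments.reduce((node, key) => node[key], doc);
  if (Array.isArray(parent)) {
    if (last === "-") parent.push(patch.value);       // "-" appends
    else parent.splice(Number(last), 0, patch.value); // index inserts
  } else {
    parent[last] = patch.value;
  }
}

const resume = {
  skills: [{ name: "JavaScript", keywords: ["ESNext", "Node.js", "React"] }],
};
applyAdd(resume, { op: "add", path: "/skills/0/keywords/-", value: "TypeScript" });
// resume.skills[0].keywords → ["ESNext", "Node.js", "React", "TypeScript"]
```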
Three-layer architecture overview
The runtime separates responsibilities into three layers. Each layer is intentionally small to keep reasoning and trust boundaries clear.
Diagram: patch pipeline
Streaming patch pipeline from chat intent through stabilization and validated apply.
Layer 1, conversation agent responsibilities
The conversation agent is the chat surface you interact with. It streams text to the client with streamText() and uses a user-selectable model for conversational quality and cost control. Its job is intent capture. When a change requires document modification it decides which tool to call and with what parameters.
Responsibilities
- Interpret user intent from chat messages
- Validate intent and decide if a structured tool should run
- Call patchResume or other tools when modification is required
Layer 2, patch tool responsibilities
The patchResume tool owns document modifications. It fetches the current document state, streams the snapshot to the client, and then initiates a nested LLM call that generates RFC 6902 patches. The tool is the only code path that applies patches to persisted documents.
A representative implementation
```typescript
export const patchResume = ({ session, dataStream }) =>
  tool({
    description: "Apply surgical JSON patches to an existing resume...",
    inputSchema: z.object({
      id: z.string(),
      changes: z.array(z.object({ description: z.string() })),
    }),
    execute: async ({ id, changes }) => {
      // 1. Fetch current resume from database
      const document = await getDocumentById({ id });
      let currentContent = JSON.parse(document.content);

      // 2. Send current content to client immediately
      dataStream.write({
        type: "data-resumeDelta",
        data: document.content,
        transient: true,
      });

      // 3. Initiate nested LLM call for patch generation
      const patchResult = streamObject({
        model: getArtifactModel(), // gpt-4.1-mini
        schema: patchSchema,
        prompt: `Generate patches for: ${changes.map((c) => c.description).join(", ")}`,
      });

      // 4. Stream patches as they arrive
      for await (const partial of patchResult.partialObjectStream) {
        if (partial.patches?.length) {
          // Apply all patches in the partial object at once for efficiency.
          // applyPatch returns a result object; the updated document is on
          // its newDocument field.
          currentContent = applyPatch(currentContent, partial.patches).newDocument;
          dataStream.write({
            type: "data-resumeDelta",
            data: JSON.stringify(currentContent),
            transient: true,
          });
        }
      }
    },
  });
```

Notes on the example
- The tool writes a transient snapshot immediately so the client has the latest state while the patch model starts working.
- streamObject is the Vercel AI SDK helper that returns a partial object stream while the model emits tokens.
- applyPatch is from fast-json-patch. It applies a list of RFC 6902 ops and returns a result whose newDocument field holds the updated document.
Layer 3, structured data model
The patch generator runs with a fixed model choice tuned for predictable structured output. In this project we use gpt-4.1-mini for patch generation and we validate emitted objects with Zod. The generator’s output is a sequence of RFC 6902 operations grouped as patch bundles.
Why a separate patch model
- The chat model focuses on natural language quality and user experience.
- The patch model focuses on strict, schema-conformant JSON output. It is cheaper and more deterministic when constrained by a schema.
Patch bundles
A patch bundle is a small collection of RFC 6902 operations that represent one coherent change. Bundles arrive as partial objects while the model streams. Each bundle may contain 1 to N operations. The client applies bundles as they are stabilized.
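As a concrete illustration, a bundle whose shape matches the patchBundleSchema shown in the appendix might look like this (the values are invented):

```typescript
// Illustrative patch bundle: one coherent change expressed as two RFC 6902
// ops. The shape follows the appendix's Zod schema; values are invented.
const bundle = {
  patches: [
    { op: "add", path: "/skills/0/keywords/-", value: "TypeScript" },
    { op: "replace", path: "/skills/0/name", value: "JavaScript / TypeScript" },
  ],
  meta: { source: "patchResume" }, // optional provenance field
};
```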
Path stabilization and buffering rules
Partial streams create a practical problem: JSON patch paths arrive as the model emits tokens. A single path like /skills/0/keywords/- can be emitted in fragments. If the client tries to apply a patch before the path is complete it can target the wrong node.
The stabilization filter
We designed a small stabilization filter that waits for two conditions before accepting a patch:
- The path field parses as a complete JSON Pointer.
- The operation matches the Zod schema for the patch object.
This filter is intentionally conservative. It rejects partial paths until the parser confirms completion. When the model emits a path in pieces the filter buffers tokens until the pointer is syntactically complete.
Example failure mode
- Model emits /ski then pauses. If the client applied a patch to /ski it would be wrong.
- Once the pointer completes to /skills/0/keywords/- the filter releases the patch to the application step.
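A conservative pointer check can be sketched as follows. Assumptions: the heuristic here is RFC 6901 well-formedness plus resolvability against the live document, and isWellFormedPointer and pointerResolves are invented names for illustration.

```typescript
// Sketch of the "complete pointer" test applied to a streamed path.
function isWellFormedPointer(path: string): boolean {
  if (path === "") return true;           // "" is the whole-document pointer
  if (!path.startsWith("/")) return false;
  // A "~" not followed by 0 or 1 means an escape is still streaming in.
  return !/~(?![01])/.test(path);
}

function pointerResolves(doc: unknown, path: string): boolean {
  if (!isWellFormedPointer(path)) return false;
  let node: any = doc;
  for (const raw of path.split("/").slice(1)) {
    const key = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (key === "-" && Array.isArray(node)) return true; // array-append target
    if (node === null || typeof node !== "object" || !(key in node)) return false;
    node = node[key];
  }
  return true;
}

const doc = { skills: [{ keywords: ["ESNext"] }] };
pointerResolves(doc, "/ski");                 // → false: partial path, hold it
pointerResolves(doc, "/skills/0/keywords/-"); // → true: release the patch
```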
Implementation sketch
```typescript
async function* stabilizePatchStream(partialObjectStream) {
  const buffer = [];
  for await (const partial of partialObjectStream) {
    buffer.push(partial);
    const candidate = mergeBuffer(buffer);
    if (isValidPatchBundle(candidate)) {
      yield candidate;
      buffer.length = 0;
    }
  }
}
```

The intent is simple, but the edge cases matter. The stabilizer must not wait forever, so it enforces a short timeout and emits what it has when the model pauses beyond a threshold. UI consumers treat those emissions as tentative until the next bundle confirms them.
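The timeout rule can be sketched as a small wrapper. Assumptions: the 25 ms threshold in the usage below is invented (the real value is tuned per model), and the onStall callback supplies whatever tentative bundle the stabilizer has buffered.

```typescript
// Sketch of the stall-timeout rule: race each model frame against a pause
// threshold and release tentative output when the model stalls.
async function* withStallTimeout<T>(
  source: AsyncIterable<T>,
  thresholdMs: number,
  onStall: () => T | undefined,
): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]();
  let pending = it.next();
  while (true) {
    // Race the next frame against the pause threshold.
    const timer = new Promise<"stall">((resolve) =>
      setTimeout(() => resolve("stall"), thresholdMs),
    );
    const winner = await Promise.race([pending, timer]);
    if (winner === "stall") {
      // Model paused: emit what we have as tentative, keep waiting.
      const tentative = onStall();
      if (tentative !== undefined) yield tentative;
      continue;
    }
    if (winner.done) return;
    yield winner.value;
    pending = it.next();
  }
}
```

Consumers treat onStall emissions as tentative; the next confirmed bundle replaces them. (Timers are not cleared here for brevity.)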
Semantic text operations and rules
RFC 6902 gives us add, remove, replace and so on. For rich text fields we added a small set of semantic operations that keep intent explicit while remaining easy to validate and apply.
Common semantic ops
- appendSentence adds text to the current paragraph with a single space separator.
- prependSentence inserts text at the paragraph start.
- appendParagraph inserts a new paragraph separated by two newlines.
- prependParagraph inserts a paragraph before the current one.
These ops are only allowed on fields that we mark as rich text in the schema. That avoids accidental use on raw string fields that should not contain newlines.
Example semantic op
```json
{
  "op": "appendSentence",
  "path": "/basics/summary",
  "value": "Strong focus on code quality."
}
```

Client application rules and examples
- When a semantic op arrives the client converts it into a small patch sequence that manipulates the string while preserving surrounding markup.
- Paragraph ops are only allowed where fieldType === 'richtext' in the Zod schema.
- Semantic ops are audited in the server logs so automated tests can assert that only expected fields receive them.
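The conversion itself can be sketched as a plain string edit. Assumption: applySemanticText is an invented name, and the real client also preserves surrounding markup, which is omitted here; the separator rules follow the list above.

```typescript
// Sketch of converting a semantic op into a plain string edit.
type SemanticOp = {
  op: "appendSentence" | "prependSentence" | "appendParagraph" | "prependParagraph";
  path: string;
  value: string;
};

function applySemanticText(current: string, op: SemanticOp): string {
  switch (op.op) {
    case "appendSentence":
      return current === "" ? op.value : `${current} ${op.value}`;
    case "prependSentence":
      return current === "" ? op.value : `${op.value} ${current}`;
    case "appendParagraph":
      return current === "" ? op.value : `${current}\n\n${op.value}`;
    case "prependParagraph":
      return current === "" ? op.value : `${op.value}\n\n${current}`;
  }
  return current; // unreachable: all op variants handled above
}

applySemanticText("Full-stack developer.", {
  op: "appendSentence",
  path: "/basics/summary",
  value: "Strong focus on code quality.",
});
// → "Full-stack developer. Strong focus on code quality."
```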
Tool governance and ownership rules
In a multi-tool system overlapping responsibilities create ambiguity. We resolve that by assigning ownership at the code level and by restricting active tools per conversation turn.
Tool ownership
- createDocument owns document initialization. It writes the first version and sets metadata.
- patchResume owns all modifications after create. It is the only tool allowed to write patches to the document store.
Active tool limits and enforcement
We expose experimental_activeTools at the conversation level. The conversation agent sets this list for each turn. If a tool is not listed the agent cannot call it. This keeps the space of available capabilities small and prevents the model from calling tools it should not use.
How conflicts are prevented
- Tools declare explicit input schemas via Zod. The agent must satisfy those schemas to call a tool.
- The server enforces tool ownership. Calls from other tools are rejected.
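A toy sketch of per-turn tool gating follows. The tool names come from this article, but the ownership table and gating logic are invented for illustration; the real project sets experimental_activeTools on the conversation agent.

```typescript
// Toy sketch of deriving the per-turn active-tool list from ownership.
const TOOL_OWNERS: Record<string, "create" | "modify"> = {
  createDocument: "create",
  patchResume: "modify",
};

function allowedTools(documentExists: boolean): string[] {
  // Expose only the owner of the current lifecycle phase to the model.
  const phase = documentExists ? "modify" : "create";
  return Object.entries(TOOL_OWNERS)
    .filter(([, role]) => role === phase)
    .map(([name]) => name);
}

allowedTools(false); // → ["createDocument"]
allowedTools(true);  // → ["patchResume"]
```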
What this enables
- Minimal UI updates. The client only re-renders components that correspond to mutated paths.
- Scroll-to-edit behaviors that jump to the changed section and highlight new text.
- Simple, incremental version history. Each successful patch bundle creates a new revision entry. The composite primary key uses id + createdAt so a single document can have many versions without a costly copy-on-write strategy.
Testing and determinism
Streaming behavior is inherently non-deterministic, so we build deterministic tests by mocking the patch model. The mock provider emits pre-canned partial object streams that simulate token-by-token output. This lets end-to-end tests verify path stabilization, semantic ops, and UI application logic.
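Such a mock can be sketched as an async generator. The frame contents below are invented; real fixtures replay recorded model output frame by frame.

```typescript
// Sketch of a deterministic mock partial-object stream for tests.
type PatchOp = { op: string; path: string; value?: unknown };

async function* mockPartialObjectStream(): AsyncGenerator<{ patches?: PatchOp[] }> {
  yield {};                                         // model still "thinking"
  yield { patches: [{ op: "add", path: "/ski" }] }; // path fragment mid-stream
  yield { patches: [{ op: "add", path: "/skills/0/keywords/-", value: "Go" }] };
}

const frames: { patches?: PatchOp[] }[] = [];
for await (const f of mockPartialObjectStream()) frames.push(f);
// frames[1] carries a partial path; the stabilization filter must hold it back.
```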
See the testing article for the details.
Links and references
- Parent project overview: Resume Chatbot Overview
- Testing deep dive: Deterministic Testing for AI Streaming
External references
- Vercel AI SDK streamObject: https://ai.sdk.dev/docs/reference/ai-sdk-core/stream-object
- fast-json-patch: https://github.com/Starcounter-Jack/JSON-Patch
- RFC 6902: JSON Patch: https://datatracker.ietf.org/doc/html/rfc6902
- JSON Resume Schema: https://jsonresume.org/schema
- Zod: https://zod.dev
Appendix: smaller code snippets
Zod schema for a patch bundle
```typescript
export const patchOpSchema = z.object({
  op: z.string(),
  path: z.string(),
  value: z.any().optional(),
});

export const patchBundleSchema = z.object({
  patches: z.array(patchOpSchema),
  meta: z.object({ source: z.string() }).optional(),
});
```

Applying a patch safely on the server
```typescript
import { applyPatch } from 'fast-json-patch';

function applyBundleSafely(currentContent, bundle) {
  try {
    for (const op of bundle.patches) {
      // validate = true rejects malformed ops; mutateDocument = true
      // applies the op in place on currentContent
      applyPatch(currentContent, [op], /* validate */ true, /* mutateDocument */ true);
    }
    return currentContent;
  } catch (err) {
    logger.error('Patch application failed', { err, bundle });
    throw err;
  }
}
```

Operational notes
- Keep the patch model’s temperature low. Higher randomness yields more path fragmentation and spurious ops.
- Validate incoming bundles with Zod before attempting to apply them.
- Record transient writes separately from durable writes so the client can receive progress without creating intermediate revisions.
Security and malicious input
Patch operations act only on paths we control. Even if an attacker manages to influence the patch model, the server rejects operations that target protected paths like /metadata or /access.
Implementation enforcements
- Patch tool filters incoming ops against an allowlist of mutable paths.
- Zod schemas define which fields are editable and the allowed primitive types.
- Server-side logging captures the model prompt and the emitted bundle for postmortem inspection.
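The allowlist filter from the first point can be sketched as follows. The prefix lists here are illustrative; in the real tool they are derived from the Zod schema's editable fields.

```typescript
// Sketch of the allowlist filter for incoming patch ops.
const PROTECTED_PREFIXES = ["/metadata", "/access"];
const MUTABLE_PREFIXES = ["/basics", "/skills", "/work"];

function isAllowedPath(path: string): boolean {
  const hits = (p: string) => path === p || path.startsWith(p + "/");
  if (PROTECTED_PREFIXES.some(hits)) return false; // deny wins over allow
  return MUTABLE_PREFIXES.some(hits);
}

function filterOps(ops: { op: string; path: string }[]) {
  return ops.filter((o) => isAllowedPath(o.path));
}

isAllowedPath("/metadata/owner");      // → false
isAllowedPath("/skills/0/keywords/-"); // → true
```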
Limitations and tradeoffs
- Partial streaming requires stabilization heuristics. That adds complexity and introduces short, tentative states on the client.
- Semantic ops simplify rich text edits, but they are still higher level than a true rich-text diff format like OT or CRDTs. This design favors simplicity and auditability over complex concurrent editing.
Closing notes
The streaming JSON patching approach gives us fine-grained, auditable, and efficient updates for structured documents. It moves intent handling into the conversation layer and makes the patch tool the single source of truth for document modifications.