Streaming Chat Agent
Real-time commerce chat with token-by-token streaming. Next.js App Router + Vercel AI SDK. Tool calls appear in the UI as they happen — the canonical customer-facing chat pattern.
Prerequisites
```bash
npm install @codespar/sdk @codespar/vercel ai @ai-sdk/openai
```

Backend route
The API route creates a session per request, gets tools, and returns a streaming response. The Vercel adapter handles tool execution automatically — no manual loop.
```ts
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { CodeSpar } from "@codespar/sdk";
import { getTools } from "@codespar/vercel";

const codespar = new CodeSpar({ apiKey: process.env.CODESPAR_API_KEY! });

export async function POST(req: Request) {
  const { messages } = await req.json();

  // One session per request, scoped to the servers this route needs.
  const session = await codespar.sessions.create({
    servers: ["stripe", "asaas", "correios"],
  });

  const tools = await getTools(session);

  const result = streamText({
    model: openai("gpt-4o"),
    tools,
    maxSteps: 10,
    system: `Commerce assistant for a Brazilian store. Be concise.`,
    messages,
    onFinish: async () => { await session.close(); },
  });

  return result.toDataStreamResponse();
}
```

The Vercel adapter's `getTools` returns tools with built-in `execute` methods, so tool calls run automatically as the LLM requests them.
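Conceptually, an adapter like `getTools` wraps each remote tool in the shape the AI SDK expects, attaching an `execute` that forwards arguments to the session. The sketch below illustrates that wiring only; the `Session` and `BridgedTool` types are hypothetical stand-ins, not the real SDK interfaces.

```ts
// Hypothetical stand-ins for the real SDK types.
interface Session {
  execute(toolName: string, args: Record<string, unknown>): Promise<unknown>;
}

interface BridgedTool {
  description: string;
  execute(args: Record<string, unknown>): Promise<unknown>;
}

// Sketch of what a tool adapter does: wrap each remote tool definition so
// the AI SDK can invoke it directly, forwarding arguments to the session.
function bridgeTools(
  session: Session,
  toolDefs: { name: string; description: string }[]
): Record<string, BridgedTool> {
  const tools: Record<string, BridgedTool> = {};
  for (const def of toolDefs) {
    tools[def.name] = {
      description: def.description,
      execute: (args) => session.execute(def.name, args),
    };
  }
  return tools;
}
```

Because `execute` is attached up front, the model runtime never needs to know where the tool actually runs — it just awaits the call like any local function.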
Frontend component
```tsx
"use client";

import { useChat } from "@ai-sdk/react";

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } =
    useChat({ api: "/api/chat" });

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>{m.content}</div>
      ))}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} />
      </form>
    </div>
  );
}
```

How it works
- The user types a message; `useChat` sends it to `/api/chat`
- The backend creates a CodeSpar session with the servers you want available
- `streamText` starts streaming the LLM response token by token
- When the LLM decides to call a tool, the Vercel AI SDK executes it via `session.execute()` automatically
- Tool results feed back into the stream and the LLM continues generating
- The session closes when the response finishes, via `onFinish`
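The steps above amount to a bounded tool loop, which is what `maxSteps` limits. The toy sketch below shows the control flow only — the real `streamText` streams tokens and handles provider details, and every name here is illustrative:

```ts
// Toy model turn: the model either requests a tool call or emits final text.
type Turn =
  | { type: "tool_call"; name: string; args: Record<string, unknown> }
  | { type: "text"; content: string };

// Illustrative agent loop (not the SDK's implementation): alternate model
// turns and tool executions until the model answers or maxSteps is hit.
async function runAgent(
  model: (history: unknown[]) => Promise<Turn>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<unknown>>,
  maxSteps: number
): Promise<string> {
  const history: unknown[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = await model(history);
    if (turn.type === "text") return turn.content; // model is done
    // Execute the requested tool and feed the result back into the history.
    const result = await tools[turn.name](turn.args);
    history.push({ call: turn, result });
  }
  return "(stopped: maxSteps reached)";
}
```

The `maxSteps: 10` in the route caps this loop, so a confused model cannot call tools indefinitely.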
Adding tool call indicators
Show the user when tools are being called — dramatically improves perceived latency and trust.
```tsx
{messages.map((m) => (
  <div key={m.id}>
    {m.content && <p>{m.content}</p>}
    {m.toolInvocations?.map((tool) => (
      <div key={tool.toolCallId}>
        {tool.state === "result"
          ? `✓ ${tool.toolName} completed`
          : `⏳ Calling ${tool.toolName}...`}
      </div>
    ))}
  </div>
))}
```

Production considerations
- **Session per request.** Each `/api/chat` call creates a new session. For multi-turn conversations where you want to reuse a session, store the session ID client-side and reopen it.
- **Error handling.** Call `session.close()` inside `onFinish` — it runs on both success and error paths.
- **Rate limiting.** Add rate limiting to the API route to prevent abuse.
- **Authentication.** Use Clerk or NextAuth to authenticate users before creating sessions. Pass `user_id` to `sessions.create()` for billing attribution.
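For the rate-limiting point, a minimal in-memory fixed-window limiter is often enough for a single-instance deployment (use a shared store such as Redis once you run multiple instances). This sketch is generic and not part of the CodeSpar SDK; the class and parameter names are our own:

```ts
// Minimal fixed-window rate limiter keyed by user ID or IP.
// In-memory only, so it is suitable for a single server process.
class RateLimiter {
  private hits = new Map<string, { count: number; windowStart: number }>();

  constructor(
    private limit: number,    // max requests per window
    private windowMs: number  // window length in milliseconds
  ) {}

  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.hits.get(key);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request in a fresh window for this key.
      this.hits.set(key, { count: 1, windowStart: now });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count++;
    return true;
  }
}

// Usage sketch inside the route handler:
// const limiter = new RateLimiter(20, 60_000); // 20 requests per minute
// if (!limiter.allow(userId)) {
//   return new Response("Too Many Requests", { status: 429 });
// }
```

Rejecting before `sessions.create()` matters: it keeps abusive traffic from consuming sessions and model tokens at all.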
Next steps
Multi-Provider Agent
Route payments across Stripe, Asaas, and Mercado Pago automatically. The agent calls codespar_pay — CodeSpar picks the best provider for each transaction.
Webhook Listener
React to payment webhooks with a deterministic loop. On payment.confirmed, automatically issue an NF-e, create shipping, and send WhatsApp — no agent, no LLM, no surprises.