When we started building the CodeSpar SDK, we had 57 MCP servers with 453 tools published on npm. Developers could install any of them and wire them into their agents manually. But the developer experience was painful. You had to discover which servers existed, figure out which tools to use, manage authentication for each provider, and handle the orchestration yourself.
We needed an SDK that made all of this invisible. One import, one session, one line to send a message to a commerce agent.
Five lines. That is the target DX. Here is how we got there.
Sessions as context containers
The core abstraction is the Session. When you call codespar.create(userId, options), the SDK creates an immutable context container that bundles together:
- A specific user (scoped by userId, isolated per organization)
- A commerce preset (which MCP servers to connect)
- Auth credentials for each provider (managed, auto-refreshed)
- Policy rules (budget limits, rate limits, approval gates, time windows)
- An MCP endpoint for framework-agnostic access
- Billing context (every tool call is tracked as a billing unit)
Once created, a session does not change. Need different servers? Create a new session. This immutability simplifies debugging, makes sessions safe to pass between functions, and eliminates an entire class of state management bugs.
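To make the immutability idea concrete, here is a minimal sketch of a frozen session object. It is not the SDK's implementation: SessionSketch, createSessionSketch, and withPreset are hypothetical names, and the real session carries more fields (auth, policy, billing context) than are shown here.

```typescript
// Hypothetical shape for illustration; the real SDK session has more fields.
interface SessionSketch {
  readonly id: string;
  readonly userId: string;
  readonly preset: string;
  readonly servers: readonly string[];
}

let sessionCounter = 0;

// Build a session and freeze it (including the server list) so no caller
// can mutate it after creation.
function createSessionSketch(
  userId: string,
  preset: string,
  servers: string[],
): SessionSketch {
  sessionCounter += 1;
  return Object.freeze({
    id: `sess_${sessionCounter}`,
    userId,
    preset,
    servers: Object.freeze([...servers]),
  });
}

// "Changing" the preset means creating a new session; the original is untouched.
function withPreset(
  session: SessionSketch,
  preset: string,
  servers: string[],
): SessionSketch {
  return createSessionSketch(session.userId, preset, servers);
}
```

Because the object is frozen, a function that receives a session can rely on the preset and server list staying the same for as long as it holds the reference.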
Sessions are also the billing scope. Every tool call within a session is logged to the session_tool_calls table with input/output payloads, duration, and status. These records are the input for metered billing via Stripe. This means you can give each customer their own session and bill them precisely for what they used.
6 meta tools, not 99
The Brazilian preset connects to 6 MCP servers with 99 raw tools between them. Passing all 99 tool definitions to an LLM would consume a significant portion of the context window and confuse the model with too many choices. We measured this: Claude Sonnet with 99 tools spends 40% more tokens reasoning about which tool to call and makes incorrect selections 3x more often than with 6 well-scoped meta tools.
Instead, we expose 6 meta tools that route dynamically to the correct underlying server:
When an agent calls codespar_pay, the MetaToolExecutor resolves which MCP server handles payments for the current preset, finds the best matching tool on that server using keyword scoring against the input parameters, and executes it. The agent never sees the 99 raw tools. It sees 6 clear categories.
The right number of tools for an agent is the smallest number that covers all the use cases.
Each meta tool definition uses input_schema (not inputSchema). This snake_case name is the field Anthropic's Messages API expects for tool definitions; the MCP specification itself spells it inputSchema. The schema is intentionally loose (a description string and a flexible params object) so the LLM can express intent naturally and the MetaToolExecutor handles the mapping to the specific API parameters the underlying server expects.
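The routing step can be sketched roughly as follows. This is an illustration under assumptions, not the MetaToolExecutor's actual code: RawToolSketch, scoreTool, and pickTool are hypothetical names, the raw tool names in the usage note are made up, and the scoring here is a simple count of overlapping words between the agent's stated intent and each tool's name and description.

```typescript
// Hypothetical record for a raw tool on an MCP server.
interface RawToolSketch {
  server: string;
  name: string;
  description: string;
}

// A loose meta tool definition: a description string plus a flexible params object.
const codesparPaySketch = {
  name: "codespar_pay",
  description: "Create, capture, or refund a payment",
  input_schema: {
    type: "object",
    properties: {
      description: { type: "string" },
      params: { type: "object" },
    },
    required: ["description"],
  },
};

// Lowercase, split on non-alphanumerics, and drop very short words.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 2);
}

// Score = number of intent words that also appear in the tool name or description.
function scoreTool(tool: RawToolSketch, intent: string): number {
  const haystack = new Set(tokenize(`${tool.name} ${tool.description}`));
  return tokenize(intent).filter((word) => haystack.has(word)).length;
}

// Pick the highest-scoring tool on the given server, or undefined if nothing matches.
function pickTool(
  tools: RawToolSketch[],
  server: string,
  intent: string,
): RawToolSketch | undefined {
  let best: RawToolSketch | undefined;
  let bestScore = 0;
  for (const tool of tools) {
    if (tool.server !== server) continue;
    const score = scoreTool(tool, intent);
    if (score > bestScore) {
      best = tool;
      bestScore = score;
    }
  }
  return best;
}
```

Given two hypothetical payment tools, create_pix_charge and refund_payment, an intent such as "refund the payment for order 123" scores higher against refund_payment, so that is the tool the sketch selects. A production router would likely also weigh the params object, which this sketch ignores.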
Managed auth
Each MCP server requires different authentication. Zoop uses OAuth2. Nuvem Fiscal uses API keys. Melhor Envio uses OAuth2 with refresh tokens. Managing credentials for 6+ providers is a burden developers should not have to carry.
The SDK's AuthManager handles this:
Tokens are stored per-user and auto-refreshed when they expire. The AuthStore interface is pluggable, so you can use the default in-memory store for development or swap in PostgreSQL, Redis, or any other backend for production.
The key design decision: auth is per-user, not per-session. When a user authorizes Zoop once, every subsequent session for that user inherits the credential. This means the OAuth flow happens once during onboarding, and every session after that just works.
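A pluggable, per-user store with refresh-on-expiry might look roughly like this. The names AuthStoreSketch, InMemoryAuthStore, StoredToken, and getValidToken are assumptions for illustration, not the SDK's real interface. The sketch is synchronous to stay short; a PostgreSQL or Redis backend would naturally return promises.

```typescript
// Hypothetical token record; expiresAt is in epoch milliseconds.
interface StoredToken {
  accessToken: string;
  expiresAt: number;
}

// Hypothetical store contract, keyed by user and provider (not by session).
interface AuthStoreSketch {
  get(userId: string, provider: string): StoredToken | undefined;
  set(userId: string, provider: string, token: StoredToken): void;
}

// Development-only store backed by a Map.
class InMemoryAuthStore implements AuthStoreSketch {
  private tokens = new Map<string, StoredToken>();

  get(userId: string, provider: string): StoredToken | undefined {
    return this.tokens.get(`${userId}:${provider}`);
  }

  set(userId: string, provider: string, token: StoredToken): void {
    this.tokens.set(`${userId}:${provider}`, token);
  }
}

type RefreshFn = (userId: string, provider: string) => StoredToken;

// Return a usable token, refreshing and persisting it if the stored one expired.
function getValidToken(
  store: AuthStoreSketch,
  userId: string,
  provider: string,
  refresh: RefreshFn,
  now: number = Date.now(),
): StoredToken {
  const current = store.get(userId, provider);
  if (!current) {
    throw new Error(`${userId} has not connected ${provider}`);
  }
  if (current.expiresAt > now) return current;
  const fresh = refresh(userId, provider);
  store.set(userId, provider, fresh);
  return fresh;
}
```

Since the key is the user and provider pair, any session created for that user later can look up the same credential without another OAuth round trip.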
Framework adapters
The SDK works with any AI framework through adapter packages. Each package converts session tools into the format the framework expects. The adapters are thin: they depend only on @codespar/sdk for types and the session interface. No heavy dependencies.
The await getTools(session) call is async because it fetches the current tool definitions from the session's connected MCP servers. Tools can change between sessions (different presets, different servers), so the adapter always fetches fresh definitions rather than caching stale ones.
Note: @codespar/claude is for the Anthropic API directly. @codespar/vercel is for the Vercel AI SDK (which can use Anthropic, OpenAI, or any provider underneath). @codespar/mcp generates MCP configuration for IDE-based clients.
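What "thin" means in practice is mostly shape conversion. The sketch below maps one tool definition into the tool formats used by the Anthropic Messages API and by OpenAI function calling. SessionToolSketch, toAnthropicTool, and toOpenAITool are hypothetical names; the real adapters also fetch definitions from the session and handle tool results, which is left out here.

```typescript
// Hypothetical tool definition as a session might expose it.
interface SessionToolSketch {
  name: string;
  description: string;
  input_schema: Record<string, unknown>;
}

// Anthropic Messages API: tools are { name, description, input_schema }.
function toAnthropicTool(tool: SessionToolSketch) {
  return {
    name: tool.name,
    description: tool.description,
    input_schema: tool.input_schema,
  };
}

// OpenAI function calling: tools are wrapped as
// { type: "function", function: { name, description, parameters } }.
function toOpenAITool(tool: SessionToolSketch) {
  return {
    type: "function" as const,
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.input_schema,
    },
  };
}
```

Neither function needs a framework SDK at runtime, which is consistent with the adapters depending only on shared types.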
Real-time streaming with session.sendStream()
For user-facing applications, waiting 8-15 seconds for a Complete Loop to finish before showing any UI update is unacceptable. session.sendStream() returns a Server-Sent Events stream that emits events as each tool call starts, progresses, and completes:
The stream protocol is compatible with the standard SSE format, so it works with any HTTP client. The CodeSpar dashboard sandbox uses this exact API to show real-time tool execution progress to users.
One important detail: sendStream() handles the full agentic loop internally. The model receives the tools, reasons about which to call, executes them via MCP, reads the results, and continues until the task is complete. You do not need to implement a tool execution loop on the client side -- the SDK manages it server-side and streams events back.
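On the client side, consuming the stream comes down to parsing standard SSE frames. The following is a minimal sketch of an incremental parser that accepts text chunks as they arrive from any HTTP client. SseParser and ParsedSseEvent are hypothetical names, and the event names in the usage note (tool_start, tool_result) are assumptions, since the actual event vocabulary is not listed here.

```typescript
interface ParsedSseEvent {
  event: string;
  data: string;
}

// Incremental parser for Server-Sent Events: feed it chunks, get complete events back.
class SseParser {
  private buffer = "";

  push(chunk: string): ParsedSseEvent[] {
    // Normalize CRLF so a blank line is always "\n\n".
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: ParsedSseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const block = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);
      let event = "message"; // default event type per the SSE format
      const dataLines: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith("event:")) {
          event = line.slice(6).trim();
        } else if (line.startsWith("data:")) {
          dataLines.push(line.slice(5).replace(/^ /, ""));
        }
      }
      if (dataLines.length > 0) {
        events.push({ event, data: dataLines.join("\n") });
      }
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

A frame such as `event: tool_start` followed by `data: {"tool":"codespar_pay"}` and a blank line yields one parsed event. A frame split across two network chunks is held in the buffer until its terminating blank line arrives.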
Policy enforcement built in
Every session has a PolicyBridge that wraps the PolicyEngine. You can set budget limits, rate limits, time windows, and approval gates at session creation:
Policy checks happen before every tool execution. Budget usage is tracked per session. If a policy denies an action, the SDK throws a typed PolicyDeniedError or BudgetExceededError that your application can handle gracefully.
The approval gate is particularly important for high-value commerce operations. An AI agent processing a R$50,000 wholesale order should pause and wait for human confirmation. The gate sends a notification (email, Slack, or webhook), waits for approval, and then resumes execution. If the timeout expires without approval, the tool call is denied and the agent receives a clear error explaining why.
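A pre-execution check with typed errors could be sketched like this. The error class names come from the text above, but their constructors, the PolicySketch fields, and the PolicyGate class are assumptions for illustration. The real approval gate waits on a notification and a timeout, which is reduced here to an approved flag.

```typescript
class PolicyDeniedError extends Error {
  constructor(reason: string) {
    super(reason);
    this.name = "PolicyDeniedError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

class BudgetExceededError extends Error {
  constructor(reason: string) {
    super(reason);
    this.name = "BudgetExceededError";
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical policy: a per-session budget and an amount that requires approval.
interface PolicySketch {
  budgetLimit: number;
  approvalThreshold: number;
}

class PolicyGate {
  private spent = 0;
  private readonly policy: PolicySketch;

  constructor(policy: PolicySketch) {
    this.policy = policy;
  }

  // Run before a tool executes; throws if the call is not allowed.
  check(amount: number, approved: boolean = false): void {
    if (this.spent + amount > this.policy.budgetLimit) {
      throw new BudgetExceededError(
        `budget of ${this.policy.budgetLimit} would be exceeded`,
      );
    }
    if (amount >= this.policy.approvalThreshold && !approved) {
      throw new PolicyDeniedError("human approval required for this amount");
    }
    this.spent += amount; // only count spend for calls that pass every check
  }

  get totalSpent(): number {
    return this.spent;
  }
}
```

Application code can then branch with instanceof on the two error types, for example showing an "awaiting approval" state for PolicyDeniedError and an upgrade prompt for BudgetExceededError.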
Commerce events as triggers
The TriggerManager emits typed events when commerce actions are detected in agent responses:
Events are detected by analyzing tool calls and response content using regex patterns that match both English and Portuguese terms. This means a payment confirmation triggers payment.completed whether the agent says "Payment confirmed" or "Pagamento confirmado."
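As an illustration of bilingual detection, here is a small sketch that maps response text to event names with one regex per event. The event names appear in this post, but the specific patterns, EVENT_PATTERNS, and detectEvents are assumptions; the TriggerManager's real patterns also inspect tool calls, not just text.

```typescript
type CommerceEventName = "payment.completed" | "shipping.label_created";

// One case-insensitive pattern per event, covering English and Portuguese phrasing.
const EVENT_PATTERNS: ReadonlyArray<{ event: CommerceEventName; pattern: RegExp }> = [
  {
    event: "payment.completed",
    pattern: /\b(payment (confirmed|completed)|pagamento (confirmado|conclu[ií]do))\b/i,
  },
  {
    event: "shipping.label_created",
    pattern: /\b(shipping label created|etiqueta (de envio )?(gerada|criada))\b/i,
  },
];

// Return every event whose pattern matches the text, in declaration order.
function detectEvents(text: string): CommerceEventName[] {
  return EVENT_PATTERNS
    .filter(({ pattern }) => pattern.test(text))
    .map(({ event }) => event);
}
```

The patterns deliberately omit the global flag, so calling test() repeatedly does not carry state between responses.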
Triggers can also fire server-side. You can configure triggers in the dashboard that run automatically when specific events occur -- for example, auto-issuing an NF-e whenever a payment completes, or sending a WhatsApp notification whenever a shipping label is created. This turns the SDK from a tool execution layer into a full commerce automation engine.
Billing integration
Every tool call is a billing unit. This is a deliberate design choice: billing at the tool call level gives you precise cost attribution per user, per session, per order. There is no ambiguity about what you are paying for.
The billing model maps directly to Stripe metered billing. At the end of each billing cycle, the session_tool_calls table is aggregated per organization and reported as metered usage. Overages are billed at the per-call rate for your plan tier.
Why not bill per session or per API call? Because sessions vary wildly in complexity. A simple "check my balance" query is one tool call. A Complete Loop is six. A customer support agent handling a return might make twelve. Per-tool-call billing ensures fair pricing regardless of how your agents use the SDK.
| Plan | Tool calls/mo | Price | Per-call |
|---|---|---|---|
| Hobby | 20,000 | $0 | Free |
| Starter | 200,000 | $29/mo | $0.000145 |
| Growth | 2,000,000 | $229/mo | $0.000115 |
| Enterprise | Custom | Custom | Custom |
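To see how the table turns into a monthly charge, here is a small worked sketch. It assumes the "Tool calls/mo" column is the included quota, that overage is charged on top of the base price at the plan's per-call rate, and that totals are rounded to cents. PlanSketch, PAID_PLANS, and monthlyCharge are hypothetical names, and the Hobby and Enterprise tiers are left out because their overage terms are not stated.

```typescript
interface PlanSketch {
  included: number;       // tool calls included per month
  monthlyPrice: number;   // base price in USD
  overagePerCall: number; // USD per call beyond the included quota
}

// Figures copied from the pricing table above.
const PAID_PLANS: Record<"starter" | "growth", PlanSketch> = {
  starter: { included: 200000, monthlyPrice: 29, overagePerCall: 0.000145 },
  growth: { included: 2000000, monthlyPrice: 229, overagePerCall: 0.000115 },
};

// Base price plus overage for calls beyond the quota, rounded to cents.
function monthlyCharge(plan: PlanSketch, toolCalls: number): number {
  const overageCalls = Math.max(0, toolCalls - plan.included);
  const total = plan.monthlyPrice + overageCalls * plan.overagePerCall;
  return Math.round(total * 100) / 100;
}
```

Under these assumptions, a Starter account that makes 250,000 calls in a month pays the $29 base plus 50,000 overage calls at $0.000145 each, roughly $36.25 in total.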
What we shipped
Five packages. All typed. All building via Turborepo. Published on npm at v0.2.0.
| Package | Purpose |
|---|---|
| @codespar/sdk | Core: sessions, tools, execute, send, sendStream, loop, managed auth |
| @codespar/claude | Claude API adapter (getTools, handleToolUse, toToolResultBlock) |
| @codespar/openai | OpenAI function calling adapter |
| @codespar/vercel | Vercel AI SDK adapter with execute() functions |
| @codespar/mcp | MCP config generator for IDE integration |
The architecture decision that made shipping fast: all adapters depend on @codespar/sdk for types and the session interface, but they do not depend on each other. Installing @codespar/openai does not pull in Vercel AI SDK. Installing @codespar/vercel does not pull in the Anthropic SDK. You only pay for the dependencies you actually use.
What is next
The SDK at v0.2.0 covers the core surface: sessions, tools, send, sendStream, execute, loop, managed auth, and billing tracking. Here is what is coming next, in order of priority:
- Stripe billing integration (Marco 3). Wire metered billing from session_tool_calls to Stripe. Quota enforcement in the execute and send endpoints. Self-serve plan upgrades from the dashboard. This is the monetization milestone.
- Real MCP routing. Currently, the meta tools execute against mock servers. The next step connects them to the 57 live MCP servers via Streamable HTTP transport. Tool calls will hit real Zoop, Nuvem Fiscal, Melhor Envio, and Omie APIs.
- Managed connection UI. An OAuth connection flow in the dashboard where users click "Connect Zoop" and complete the authorization without writing code. Tokens stored server-side, auto-refreshed, never exposed to the client.
- Python SDK. Same API surface, same session model, targeting LangChain and CrewAI adapters. Python is essential for reaching the ML/data science agent community.
- Server-side triggers. Define automation rules in the dashboard: "When payment.completed, auto-issue NF-e" or "When shipping.label_created, notify customer on WhatsApp." This turns CodeSpar from a tool execution layer into a commerce automation engine that runs with no code at all.
- Mexico and Colombia presets. The MCP server catalog already covers 4 countries. Adding presets for Mexico (SAT/CFDI, Conekta, Envia) and Colombia (DIAN, Wompi, Envia) unlocks the rest of LatAm.
The SDK is the product. Everything else -- dashboard, servers, docs -- exists to make the SDK easier to adopt.
The SDK is open source. The code is at github.com/codespar/codespar-core. Install it with npm install @codespar/sdk and tell us what breaks.
This post covers the public SDK (@codespar/sdk@0.2.0). For a hands-on tutorial, see the Complete Loop tutorial. The enterprise packages (PolicyEngine, MandateGenerator, PaymentGateway) are covered in the thesis post.