
codespar_discover

Semantic + lexical tool search. pgvector text-embedding-3-small + pg_trgm fallback, with connection bias toward providers the session can already reach.



codespar_discover is the meta-tool an agent calls to find the right tool when it does not yet have one loaded into the session. Search is powered by OpenAI text-embedding-3-small (1536-dim) over the mcp_tools.embedding column, with a pg_trgm lexical fallback. Ranked results are biased toward providers the session has already connected: at the margin, a connected match outranks a higher-scoring unconnected match.

Covered by the session.discover(query) typed wrapper.

Typed wrapper

TypeScript:

const result = await session.discover("emit a Pix QR code for a buyer in Brazil");

for (const match of result.matches) {
  console.log(match.tool_name, match.server_id, match.score);
}

Python:

result = session.discover("emit a Pix QR code for a buyer in Brazil")

for match in result["matches"]:
    print(match["tool_name"], match["server_id"], match["score"])

Direct execute

const result = await session.execute("codespar_discover", {
  query: "send WhatsApp template for shipping update",
  limit: 5,
});

Args shape

| Field | Type   | Required | Description |
| ----- | ------ | -------- | ----------- |
| query | string | Yes      | Natural-language search — describe the intent, not the tool name |
| limit | number | No       | Maximum matches to return (default 10) |

Result shape

type DiscoverResult = {
  matches: Array<{
    tool_name: string;       // e.g. "codespar_charge" or a server-specific tool
    server_id: string;       // e.g. "asaas", "mercadopago"
    score: number;           // 0..1, higher is better
    description: string;     // tool description from the catalog
    connected: boolean;      // is this server reachable in the current session?
  }>;
};

A connected match scoring 0.85 outranks an unconnected match scoring 0.92 — the bias is intentional. The agent should prefer tools it can actually call right now; if the best match is unconnected, pair codespar_discover with codespar_manage_connections to start a connection flow.
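The connection bias can be pictured as a flat score boost applied before sorting. A minimal sketch, assuming a boost value of 0.1 (the real CodeSpar ranking constant is not documented here, so `CONNECTED_BOOST` and `rerank` are illustrative names, not SDK APIs):

```typescript
type Match = {
  tool_name: string;
  server_id: string;
  score: number;
  description: string;
  connected: boolean;
};

// Assumption: the margin by which connected matches are favored.
const CONNECTED_BOOST = 0.1;

function rerank(matches: Match[]): Match[] {
  // Sort by score plus a flat boost for servers the session can already
  // reach, so a connected 0.85 outranks an unconnected 0.92.
  return [...matches].sort(
    (a, b) =>
      b.score + (b.connected ? CONNECTED_BOOST : 0) -
      (a.score + (a.connected ? CONNECTED_BOOST : 0))
  );
}
```

With this boost, a connected match at 0.85 sorts as 0.95 and lands ahead of an unconnected 0.92, matching the behavior described above.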

Two-stage retrieval

  1. Semantic — embed the query with text-embedding-3-small, cosine-rank against mcp_tools.embedding.
  2. Lexical fallback — if semantic returns thin results (low max score), pg_trgm runs over tool_name + description as a backup.
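The fallback decision can be sketched as a small dispatcher over two injected search functions. The threshold value and function names here are assumptions for illustration; the actual cutoff lives in the CodeSpar backend:

```typescript
type Hit = { tool_name: string; score: number };

// Assumption: a max semantic score below this floor counts as "thin results".
const SEMANTIC_FLOOR = 0.5;

function twoStageSearch(
  query: string,
  semantic: (q: string) => Hit[], // cosine-rank against mcp_tools.embedding
  lexical: (q: string) => Hit[]   // pg_trgm over tool_name + description
): Hit[] {
  const hits = semantic(query);
  const maxScore = hits.length ? Math.max(...hits.map((h) => h.score)) : 0;
  // Thin semantic results trigger the lexical backup; otherwise keep them.
  return maxScore < SEMANTIC_FLOOR ? lexical(query) : hits;
}
```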

Operators populate the embedding column via the embed-mcp-tools.ts seed script, which is idempotent: it only re-embeds rows whose description_hash has changed since the last run.
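The idempotency check amounts to comparing a stored hash against a fresh hash of the description. A sketch of that check, assuming SHA-256 (the hash algorithm used by the real seed script is not specified here, and `needsReembed` is an illustrative helper):

```typescript
import { createHash } from "node:crypto";

function hashDescription(description: string): string {
  // Assumption: SHA-256 hex digest of the tool description.
  return createHash("sha256").update(description).digest("hex");
}

function needsReembed(row: {
  description: string;
  description_hash: string | null;
}): boolean {
  // A null stored hash means the row has never been embedded;
  // a mismatch means the description changed since the last run.
  return row.description_hash !== hashDescription(row.description);
}
```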

Operator setup

No connection needed by the operator — codespar_discover runs against the catalog index in the CodeSpar backend, not against an external provider. The OpenAI embeddings call is on CodeSpar's side; tenants do not stamp an OpenAI key for this.

The operator runbook at codespar-enterprise/docs/operations/meta-tool-runbook.md covers re-embedding when the catalog changes.

Cost note

codespar_discover counts as a tool call because it queries the live catalog and the embedding index. Cache the response for the duration of the conversation to avoid redundant calls — a typical agent only needs to discover once per intent.
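A minimal per-conversation cache sketch (a hypothetical helper, not part of the CodeSpar SDK) that guarantees one live discover call per distinct intent string:

```typescript
const discoverCache = new Map<string, unknown>();

function discoverOnce<T>(query: string, discover: (q: string) => T): T {
  if (!discoverCache.has(query)) {
    // First call for this intent hits the live catalog; later calls
    // in the same conversation reuse the stored result.
    discoverCache.set(query, discover(query));
  }
  return discoverCache.get(query) as T;
}
```

Because the result (sync value or Promise) is stored as-is, this works unchanged when `discover` is the async `session.discover` wrapper.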

