How-to

Host the AI in your app.

chat() is the all-in-one turn: PAMbase reads the user's memory, composes a memory-grounded prompt, runs a model, returns the reply, and records anything durable it learns — all behind one call. This is the fastest way to ship a conversational feature without running your own model or retrieval.

Bring your own persona (read this first)

The hosted assistant is memory-grounded and neutral on purpose — PAMbase never authors a personality, mood, or voice, because users don't author personas either. You shape the character. Pass your persona, voice, and live framing through appContext (described below), or host the model yourself with your own system prompt (see the BYOK pattern in Request context). PAMbase brings the person; your app brings the persona.

Prerequisites

You hold a connection token with ai:host:chat (or ai:host:companion for an embedded companion). A scope is one permission. Examples use Margin, a reading companion that recommends what to read next.

One call: Margin's recommendation chat

Give it an intent (a label for the moment), the user's message, and optional appContext — live state plus your persona framing. You get back the reply, tone tags, the memories recorded this turn, and token usage.

typescript

const res = await pambase.chat({
  intent: "app.companion.turn",
  userMessage: "What should I read next?",
  appContext: {
    persona: "Margin, a warm, well-read companion who recommends concisely.",
    shelf: ["Designing Data-Intensive Applications"], // live state the model can't infer
  },
});

console.log(res.reply);            // e.g. "Given your essays on rate limiting, try 'Release It!'…"
console.log(res.toneTags);         // e.g. ["warm", "encouraging"]
console.log(res.memoryCandidates); // memories captured this turn (see auto-record below)
console.log(res.usage);            // { tokensIn, tokensOut, model }

Expected result: a recommendation grounded in the user's saved highlights and observed taste, in Margin's voice — and any new fact (“wants something shorter next”) is recorded automatically.

What happens behind the scenes

PAMbase builds a context bundle for your intent (the same one getContext() returns).
Composes the system prompt from the light identity + relevant memories + your appContext framing.
Exposes a record_memory tool the model can call when it learns something durable (this is the one bit of tool-calling involved).
Runs the model and returns the reply.
Persists any new memories with your write scopes applied — out-of-scope candidates are dropped.
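
The steps above can be pictured with a small local model of the turn. This is a sketch, not PAMbase's implementation: every name here (ContextBundle, MemoryCandidate, composeSystemPrompt, filterByWriteScopes) and the scope strings are hypothetical, chosen only to illustrate how a prompt is composed and how out-of-scope candidates are dropped.

```typescript
// Hypothetical shapes: none of these names come from the PAMbase SDK.
interface MemoryCandidate {
  scope: string; // write scope the memory would be stored under (assumed naming)
  text: string;
}

interface ContextBundle {
  identity: string;   // the "light identity" line
  memories: string[]; // memories relevant to the intent
}

// Step 2: compose the system prompt from identity + memories + appContext framing.
function composeSystemPrompt(
  bundle: ContextBundle,
  appContext: Record<string, unknown>,
): string {
  return [
    bundle.identity,
    ...bundle.memories.map((m) => `Memory: ${m}`),
    `App context: ${JSON.stringify(appContext)}`,
  ].join("\n");
}

// Step 5: keep only candidates whose scope the app may write; drop the rest.
function filterByWriteScopes(
  candidates: MemoryCandidate[],
  writeScopes: ReadonlySet<string>,
): MemoryCandidate[] {
  return candidates.filter((c) => writeScopes.has(c.scope));
}

const prompt = composeSystemPrompt(
  { identity: "User: a reader", memories: ["Highlights essays on rate limiting"] },
  { persona: "Margin" },
);

const persisted = filterByWriteScopes(
  [
    { scope: "reading.preferences", text: "wants something shorter next" },
    { scope: "health.notes", text: "outside this app's write scopes" },
  ],
  new Set(["reading.preferences"]),
);
```

In this model the second candidate never reaches storage, which mirrors the "out-of-scope candidates are dropped" rule.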

Auto-record (and how to turn it off)

By default, chat() auto-records durable memories it learns during the turn — they appear in res.memoryCandidates and persist under your write scopes. For turns you don't want remembered (debug, tutorials, throwaway prompts), pass ephemeral: true and nothing is recorded.

typescript

// A throwaway preview turn — answer, but don't remember anything.
const res = await pambase.chat({
  intent: "app.companion.turn",
  userMessage: "Just testing — say hi.",
  ephemeral: true,        // nothing persists; res.memoryCandidates is empty
});
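
If your app has several kinds of throwaway turns, one option is to decide the flag in a single place. The sketch below assumes a hypothetical app-side TurnKind label and a ChatRequest shape limited to the fields shown on this page; neither is an SDK type.

```typescript
// Hypothetical app-side label for why a turn is being sent.
type TurnKind = "normal" | "debug" | "tutorial" | "preview";

// Only the request fields used on this page; not the SDK's own type.
interface ChatRequest {
  intent: string;
  userMessage: string;
  ephemeral?: boolean;
}

// Normal turns keep the default (auto-record); everything else is ephemeral.
function buildRequest(kind: TurnKind, userMessage: string): ChatRequest {
  const base: ChatRequest = { intent: "app.companion.turn", userMessage };
  return kind === "normal" ? base : { ...base, ephemeral: true };
}
```

You would then pass the result straight to chat(), for example pambase.chat(buildRequest("preview", "Just testing")).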

Streaming for a typing effect

For a live typing UI, use chatStream(). It returns an async generator of events; iterate with for await. Pass an AbortSignal to cancel mid-stream.

margin/stream.ts

const controller = new AbortController();

const stream = pambase.chatStream(
  { intent: "app.companion.turn", userMessage: "What should I read next?" },
  controller.signal,
);

for await (const ev of stream) {
  switch (ev.type) {
    case "delta":
      process.stdout.write(ev.text);           // append token text to the UI
      break;
    case "memory_recorded":
      console.log("learned:", ev.candidate);   // a memory was persisted mid-turn
      break;
    case "done":
      console.log("\nusage:", ev.usage);       // { tokensIn, tokensOut, model }
      break;
    case "error":
      console.error(ev.message);               // terminal — the stream ends
      break;
  }
}

The event union:

Event	Shape	Meaning
`delta`	`{ type: "delta", text }`	A chunk of reply text — concatenate in order.
`memory_recorded`	`{ type: "memory_recorded", candidate }`	A durable memory was persisted this turn (skipped when ephemeral).
`done`	`{ type: "done", usage }`	Stream finished; final token usage.
`error`	`{ type: "error", message }`	Terminal error; the stream ends.
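
The table maps naturally onto a TypeScript discriminated union. The sketch below is inferred from the shapes above and is not the SDK's own declaration: the field types (string text, an opaque candidate, a Usage object) are assumptions, and applyEvent and collectTurn are hypothetical helpers for folding a stream into one final state.

```typescript
// Field types are assumptions inferred from the table, not SDK declarations.
interface Usage {
  tokensIn: number;
  tokensOut: number;
  model: string;
}

type ChatStreamEvent =
  | { type: "delta"; text: string }
  | { type: "memory_recorded"; candidate: unknown }
  | { type: "done"; usage: Usage }
  | { type: "error"; message: string };

interface TurnState {
  reply: string;
  memories: unknown[];
  usage?: Usage;
  error?: string;
}

// Pure reducer: fold one event into the accumulated turn state.
function applyEvent(state: TurnState, ev: ChatStreamEvent): TurnState {
  switch (ev.type) {
    case "delta":
      return { ...state, reply: state.reply + ev.text };
    case "memory_recorded":
      return { ...state, memories: [...state.memories, ev.candidate] };
    case "done":
      return { ...state, usage: ev.usage };
    case "error":
      return { ...state, error: ev.message };
  }
}

// Drain any async iterable of events (such as a chatStream() result).
async function collectTurn(
  events: AsyncIterable<ChatStreamEvent>,
): Promise<TurnState> {
  let state: TurnState = { reply: "", memories: [] };
  for await (const ev of events) {
    state = applyEvent(state, ev);
  }
  return state;
}
```

Keeping the reducer pure means the same function can drive a UI store and be unit-tested with a plain array of events.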

Error handling

The SDK throws typed errors. Around hosting calls, handle at least UnauthorizedError and ScopeDeniedError; the example also handles RateLimitError:

typescript

import { UnauthorizedError, ScopeDeniedError, RateLimitError } from "@pambase/sdk";

// beginReconnect, disableChatFeature, and delay stand in for your app's own helpers.
try {
  const res = await pambase.chat({ intent: "app.companion.turn", userMessage });
} catch (err) {
  if (err instanceof UnauthorizedError) {
    // 401 — token expired or revoked. Clear it and re-run the connect flow.
    await beginReconnect();
  } else if (err instanceof ScopeDeniedError) {
    // 403 — you lack ai:host:chat / ai:host:companion (err.scope names it).
    disableChatFeature(err.scope);
  } else if (err instanceof RateLimitError) {
    await delay(err.retryAfterMs); // retry guidance from the platform
  } else {
    throw err;
  }
}

There is no refresh token

On UnauthorizedError (401), the only recovery is re-running the connect flow. Don't retry the same call. Network errors, 429s, and 5xx are retried automatically by the SDK.
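
The recovery rules on this page can be summarized as a small lookup. This is a sketch only: recoveryFor and the Recovery labels are hypothetical, and the mapping restates what this page says about each status code, not any SDK behavior beyond that.

```typescript
// Hypothetical labels for what the app should do next.
type Recovery = "reconnect" | "disable-feature" | "sdk-retries" | "rethrow";

// Map an HTTP status to the recovery described on this page.
function recoveryFor(status: number): Recovery {
  if (status === 401) return "reconnect";        // no refresh token: re-run the connect flow
  if (status === 403) return "disable-feature";  // missing ai:host:chat / ai:host:companion
  if (status === 429 || (status >= 500 && status <= 599)) {
    return "sdk-retries";                        // retried automatically by the SDK
  }
  return "rethrow";                              // anything else is not covered here
}
```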

Gotchas & best practices

History is bounded — don't resend it. The gateway clips conversation history to a token budget server-side. Send the new turn, not the full transcript; pass at most the last few turns if you need continuity.
Put live state in appContext, not userMessage. The current shelf, open document, or persona framing go in appContext; the user's words go in userMessage.
Use ephemeral: true for turns you don't want recorded.
Don't embed system instructions in userMessage. It breaks grounding — frame via appContext or host the model yourself.
Watch res.usage to track token spend per turn.
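
Two of these practices are easy to wire up on the app side. The sketch below assumes the usage shape shown earlier ({ tokensIn, tokensOut, model }); Turn, lastTurns, and UsageMeter are hypothetical app-side helpers. How prior turns are passed to chat() is not shown on this page, so the sketch stops at trimming.

```typescript
// Shapes are assumptions based on this page; Turn is a hypothetical app-side type.
interface Usage {
  tokensIn: number;
  tokensOut: number;
  model: string;
}

interface Turn {
  role: "user" | "assistant";
  text: string;
}

// Keep only the most recent `max` turns. slice(-0) would return everything,
// so a non-positive max is handled explicitly.
function lastTurns(history: readonly Turn[], max: number): Turn[] {
  return max > 0 ? history.slice(-max) : [];
}

interface Totals {
  tokensIn: number;
  tokensOut: number;
  turns: number;
}

// Accumulates res.usage values per model so token spend can be reported.
class UsageMeter {
  private readonly totals = new Map<string, Totals>();

  record(u: Usage): void {
    const t = this.totals.get(u.model) ?? { tokensIn: 0, tokensOut: 0, turns: 0 };
    this.totals.set(u.model, {
      tokensIn: t.tokensIn + u.tokensIn,
      tokensOut: t.tokensOut + u.tokensOut,
      turns: t.turns + 1,
    });
  }

  totalsFor(model: string): Totals | undefined {
    return this.totals.get(model);
  }
}
```

After each turn you would call meter.record(res.usage) and read totalsFor(model) wherever you report spend.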

Required scope

Either ai:host:chat (a full chat surface) or ai:host:companion (an embedded companion). A missing grant surfaces as ScopeDeniedError. Declare it in your manifest.

What to expect next

Gateway — how prompts are composed, models run, and history is bounded.
Request context — the BYOK alternative when you host the model.
Remember & recall — write memory explicitly instead of via auto-record.