Host the AI in your app.
chat() is the all-in-one turn: PAMbase reads the user's memory, composes a memory-grounded prompt, runs a model, returns the reply, and records anything durable it learns — all behind one call. This is the fastest way to ship a conversational feature without running your own model or retrieval.
Frame your persona via appContext (described below), or host the model yourself with your own system prompt (see the BYOK pattern in Request context). PAMbase brings the person; your app brings the persona.
Required scope: ai:host:chat (or ai:host:companion for an embedded companion). A scope is one permission. Examples use Margin, a reading companion that recommends what to read next.
One call: Margin's recommendation chat
Give it an intent (a label for the moment), the user's message, and optional appContext — live state plus your persona framing. You get back the reply, tone tags, the memories recorded this turn, and token usage.
```typescript
const res = await pambase.chat({
  intent: "app.companion.turn",
  userMessage: "What should I read next?",
  appContext: {
    persona: "Margin, a warm, well-read companion who recommends concisely.",
    shelf: ["Designing Data-Intensive Applications"], // live state the model can't infer
  },
});

console.log(res.reply); // e.g. "Given your essays on rate limiting, try 'Release It!'…"
console.log(res.toneTags); // e.g. ["warm", "encouraging"]
console.log(res.memoryCandidates); // memories captured this turn (see auto-record below)
console.log(res.usage); // { tokensIn, tokensOut, model }
```
Expected result: a recommendation grounded in the user's saved highlights and observed taste, in Margin's voice — and any new fact (“wants something shorter next”) is recorded automatically.
What happens behind the scenes
- PAMbase builds a context bundle for your intent (the same one getContext() returns).
- Composes the system prompt from the light identity + relevant memories + your appContext framing.
- Exposes a record_memory tool the model can call when it learns something durable (this is the one bit of tool-calling involved).
- Runs the model and returns the reply.
- Persists any new memories with your write scopes applied — out-of-scope candidates are dropped.
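The steps above can be sketched as a small in-memory pipeline. This is not PAMbase's implementation, only a simplified model of the order of operations. Every name here (`Memory`, `Bundle`, `Model`, `composeSystemPrompt`, `filterByWriteScopes`, `runTurn`) and the scope strings are hypothetical, and the model is a stub.

```typescript
// Hypothetical types: none of these names come from the PAMbase SDK.
interface Memory { text: string; scope: string }
interface Bundle { identity: string; memories: string[] }
type Model = (
  systemPrompt: string,
  userMessage: string,
  recordMemory: (m: Memory) => void,
) => string;

// Compose the system prompt from identity + relevant memories + appContext framing.
function composeSystemPrompt(bundle: Bundle, appContext: Record<string, unknown>): string {
  return [
    `Identity: ${bundle.identity}`,
    `Memories: ${bundle.memories.join("; ")}`,
    `App context: ${JSON.stringify(appContext)}`,
  ].join("\n");
}

// Keep only candidates whose scope the app may write; out-of-scope ones are dropped.
function filterByWriteScopes(candidates: Memory[], writeScopes: string[]): Memory[] {
  return candidates.filter((c) => writeScopes.includes(c.scope));
}

function runTurn(
  bundle: Bundle,
  appContext: Record<string, unknown>,
  userMessage: string,
  writeScopes: string[],
  model: Model,
): { reply: string; memoryCandidates: Memory[] } {
  const systemPrompt = composeSystemPrompt(bundle, appContext);
  const candidates: Memory[] = [];
  // The record_memory tool: the model calls it when it learns something durable.
  const reply = model(systemPrompt, userMessage, (m) => { candidates.push(m); });
  return { reply, memoryCandidates: filterByWriteScopes(candidates, writeScopes) };
}

// Stub model that "learns" two facts, one of them outside the app's write scopes.
const stubModel: Model = (_prompt, userMessage, recordMemory) => {
  recordMemory({ text: "wants something shorter next", scope: "reading:preferences" });
  recordMemory({ text: "unrelated detail", scope: "other:notes" });
  return `You asked: ${userMessage}`;
};

const result = runTurn(
  { identity: "reader", memories: ["highlights essays on rate limiting"] },
  { persona: "Margin" },
  "What should I read next?",
  ["reading:preferences"],
  stubModel,
);
console.log(result.memoryCandidates.length); // 1 (the out-of-scope candidate was dropped)
```

The point of the sketch is the last step: scope filtering happens after the model runs, so a model can propose a memory that never persists.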
Auto-record (and how to turn it off)
By default, chat() auto-records durable memories it learns during the turn — they appear in res.memoryCandidates and persist under your write scopes. For turns you don't want remembered (debug, tutorials, throwaway prompts), pass ephemeral: true and nothing is recorded.
```typescript
// A throwaway preview turn: answer, but don't remember anything.
const res = await pambase.chat({
  intent: "app.companion.turn",
  userMessage: "Just testing, say hi.",
  ephemeral: true, // nothing persists; res.memoryCandidates is empty
});
```
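One way to keep this decision in a single place is a small request builder that sets `ephemeral` from the kind of turn. This is a sketch under assumptions: the `TurnKind` labels, the `ChatRequest` interface name, and the `buildChatRequest` helper are hypothetical. Only the `intent`, `userMessage`, and `ephemeral` fields are taken from the examples above.

```typescript
// Hypothetical labels for the kinds of turns an app might send.
type TurnKind = "user" | "debug" | "tutorial" | "preview";

// Fields mirrored from the chat() examples; the interface name is an assumption.
interface ChatRequest {
  intent: string;
  userMessage: string;
  ephemeral?: boolean;
}

// Turn kinds that should never leave a memory behind.
const THROWAWAY: ReadonlySet<TurnKind> = new Set<TurnKind>(["debug", "tutorial", "preview"]);

function buildChatRequest(kind: TurnKind, userMessage: string): ChatRequest {
  const base: ChatRequest = { intent: "app.companion.turn", userMessage };
  // Only set the flag when needed, so ordinary turns keep the default (auto-record on).
  return THROWAWAY.has(kind) ? { ...base, ephemeral: true } : base;
}

console.log(buildChatRequest("debug", "Just testing."));
console.log(buildChatRequest("user", "What should I read next?"));
```

The returned object could then be passed to `pambase.chat(...)`, assuming the real request type accepts these three fields as the examples suggest.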
Streaming for a typing effect
For a live typing UI, use chatStream(). It returns an async generator of events; iterate with for await. Pass an AbortSignal to cancel mid-stream.
```typescript
const controller = new AbortController();
const stream = pambase.chatStream(
  { intent: "app.companion.turn", userMessage: "What should I read next?" },
  controller.signal,
);

for await (const ev of stream) {
  switch (ev.type) {
    case "delta":
      process.stdout.write(ev.text); // append token text to the UI
      break;
    case "memory_recorded":
      console.log("learned:", ev.candidate); // a memory was persisted mid-turn
      break;
    case "done":
      console.log("\nusage:", ev.usage); // { tokensIn, tokensOut, model }
      break;
    case "error":
      console.error(ev.message); // terminal: the stream ends
      break;
  }
}
```
The event union:
| Event | Shape | Meaning |
|---|---|---|
| delta | { type: "delta", text } | A chunk of reply text; concatenate in order. |
| memory_recorded | { type: "memory_recorded", candidate } | A durable memory was persisted this turn (skipped when ephemeral). |
| done | { type: "done", usage } | Stream finished; final token usage. |
| error | { type: "error", message } | Terminal error; the stream ends. |
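To show how these events might drive UI state, here is a sketch of a pure reducer that folds the event union into a view model. The `ChatStreamEvent` type is transcribed from the table above, and `Usage` follows the `{ tokensIn, tokensOut, model }` shape from the examples. The type of `candidate` is not documented here, so it is left as `unknown`. `StreamState`, `initialState`, and `applyEvent` are hypothetical names, and the sample events use made-up values.

```typescript
interface Usage { tokensIn: number; tokensOut: number; model: string }

// The event union from the table; candidate's shape is not specified, hence unknown.
type ChatStreamEvent =
  | { type: "delta"; text: string }
  | { type: "memory_recorded"; candidate: unknown }
  | { type: "done"; usage: Usage }
  | { type: "error"; message: string };

// Hypothetical view model for a typing UI.
interface StreamState {
  reply: string;
  learned: unknown[];
  usage?: Usage;
  error?: string;
  finished: boolean;
}

const initialState: StreamState = { reply: "", learned: [], finished: false };

// Pure reducer: returns a new state for each event and never mutates the old one.
function applyEvent(state: StreamState, ev: ChatStreamEvent): StreamState {
  switch (ev.type) {
    case "delta":
      return { ...state, reply: state.reply + ev.text }; // concatenate in order
    case "memory_recorded":
      return { ...state, learned: [...state.learned, ev.candidate] };
    case "done":
      return { ...state, usage: ev.usage, finished: true };
    case "error":
      return { ...state, error: ev.message, finished: true }; // terminal
  }
}

// Made-up sample events standing in for a real stream.
const sampleEvents: ChatStreamEvent[] = [
  { type: "delta", text: "Try " },
  { type: "delta", text: "Release It!" },
  { type: "memory_recorded", candidate: { text: "wants something shorter next" } },
  { type: "done", usage: { tokensIn: 120, tokensOut: 8, model: "example-model" } },
];

const finalState = sampleEvents.reduce(applyEvent, initialState);
console.log(finalState.reply); // Try Release It!
```

Inside a `for await` loop over `chatStream()`, the same function can be applied per event (`state = applyEvent(state, ev)`), which keeps rendering logic separate from stream handling and makes it testable without a network call.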
Error handling
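The SDK throws typed errors. The two you must handle around hosting are UnauthorizedError and ScopeDeniedError; the example also catches RateLimitError, although the SDK already retries 429s automatically: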
```typescript
import { UnauthorizedError, ScopeDeniedError, RateLimitError } from "@pambase/sdk";

// beginReconnect, disableChatFeature, and delay are your app's own helpers, not SDK exports.
try {
  const res = await pambase.chat({ intent: "app.companion.turn", userMessage });
} catch (err) {
  if (err instanceof UnauthorizedError) {
    // 401: token expired or revoked. Clear it and re-run the connect flow.
    await beginReconnect();
  } else if (err instanceof ScopeDeniedError) {
    // 403: you lack ai:host:chat / ai:host:companion (err.scope names it).
    disableChatFeature(err.scope);
  } else if (err instanceof RateLimitError) {
    await delay(err.retryAfterMs); // retry guidance from the platform
  } else {
    throw err;
  }
}
```
On UnauthorizedError (401), the only recovery is re-running the connect flow. Don't retry the same call. Network errors, 429s, and 5xx are retried automatically by the SDK.
Gotchas & best practices
- History is bounded — don't resend it. The gateway clips conversation history to a token budget server-side. Send the new turn, not the full transcript; pass at most the last few turns if you need continuity.
- Put live state in appContext, not userMessage. The current shelf, open document, or persona framing go in appContext; the user's words go in userMessage.
- Use ephemeral: true for turns you don't want recorded.
- Don't embed system instructions in userMessage. It breaks grounding; frame via appContext or host the model yourself.
- Watch res.usage to track token spend per turn.
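For the last point, a running total per model is often enough to spot expensive turns. The sketch below assumes `res.usage` has the `{ tokensIn, tokensOut, model }` shape shown in the examples. `UsageTracker` and its methods are hypothetical names, not part of the SDK, and the numbers are made up.

```typescript
// Shape assumed from the { tokensIn, tokensOut, model } comments in the examples.
interface Usage { tokensIn: number; tokensOut: number; model: string }

interface Totals { tokensIn: number; tokensOut: number; turns: number }

// Hypothetical helper that accumulates token spend per model across turns.
class UsageTracker {
  private totals = new Map<string, Totals>();

  // Call once per turn with res.usage.
  record(usage: Usage): void {
    const t = this.totals.get(usage.model) ?? { tokensIn: 0, tokensOut: 0, turns: 0 };
    this.totals.set(usage.model, {
      tokensIn: t.tokensIn + usage.tokensIn,
      tokensOut: t.tokensOut + usage.tokensOut,
      turns: t.turns + 1,
    });
  }

  // Total tokens (in + out) spent on a model so far; 0 if it was never used.
  totalTokens(model: string): number {
    const t = this.totals.get(model);
    return t ? t.tokensIn + t.tokensOut : 0;
  }

  // Mean tokens per turn for a model; 0 if it was never used.
  averagePerTurn(model: string): number {
    const t = this.totals.get(model);
    return t ? (t.tokensIn + t.tokensOut) / t.turns : 0;
  }
}

const tracker = new UsageTracker();
tracker.record({ tokensIn: 120, tokensOut: 30, model: "example-model" });
tracker.record({ tokensIn: 80, tokensOut: 20, model: "example-model" });
console.log(tracker.totalTokens("example-model")); // 250
console.log(tracker.averagePerTurn("example-model")); // 125
```

In an app, `tracker.record(res.usage)` would run after each `chat()` call, or on the `done` event when streaming.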
Required scope
Either ai:host:chat (a full chat surface) or ai:host:companion (an embedded companion). A missing grant surfaces as ScopeDeniedError. Declare it in your manifest.
What to expect next
- Gateway — how prompts are composed, models run, and history is bounded.
- Request context — the BYOK alternative when you host the model.
- Remember & recall — write memory explicitly instead of via auto-record.