AI gateway.

The gateway is PAMbase running a memory-grounded language-model turn on your behalf. You send a user message; PAMbase pulls the relevant slice of the user's memory, runs the model, and hands you back a reply. It exists for apps that want a smart, personalized assistant but don't bring their own model.

Why it matters

Without the gateway, adding an assistant means wiring up a model provider, building your own retrieval over the user's memory, and writing the new memories back yourself. The gateway collapses all of that into a single call. For Margin, it's how the built-in “what should I read next?” chat works without Margin ever touching a model API.

Hosted vs. your own model

There are two ways to power an assistant; pick based on whether you already run a model:

  • Hosted (the gateway) — best when you don't have a model. PAMbase pays the model provider and you pay PAMbase, so billing is clean and there's a single SLA. Requires the scope ai:host:chat (or ai:host:companion for an embedded companion surface).
  • Bring your own key (BYOK) — if you already have a model provider, supply your own key. PAMbase still grounds the turn in memory and records new memories, but charges only for the memory and retrieval work, not for hosted tokens.
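As a rough illustration of the hosted scope requirement, an app might check its granted scopes before attempting a hosted turn. The `Surface` type and the `hostScopeFor` and `canHostTurn` helpers are hypothetical names invented for this sketch, and `memory:write:reading` is a made-up example grant; only the two `ai:host:*` scope strings come from this page. It also assumes a companion surface needs `ai:host:companion` specifically.

```typescript
// Hypothetical helpers for illustration; not part of the PAMbase SDK.
type Surface = "chat" | "companion";

// Map a surface to the hosted-AI scope named on this page.
function hostScopeFor(surface: Surface): string {
  return surface === "companion" ? "ai:host:companion" : "ai:host:chat";
}

// True when the app's granted scopes include the one the surface needs.
function canHostTurn(grantedScopes: readonly string[], surface: Surface): boolean {
  return grantedScopes.includes(hostScopeFor(surface));
}

// "memory:write:reading" is a made-up example grant.
const granted = ["ai:host:chat", "memory:write:reading"];
console.log(canHostTurn(granted, "chat"));      // true
console.log(canHostTurn(granted, "companion")); // false
```

A check like this only saves a failed round trip; the authoritative decision is still made by PAMbase when the call arrives.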

If your app has its own model and its own prompting stack already, you may not need the gateway at all — pull a brief or raw context and run your own turn. The gateway is the shortcut, not the only path.

What happens on each call

A gateway call does three things in one round trip. First it builds context — it gathers the memories most relevant to the moment, using semantic search, graph links, recency, and importance (the mechanics live in Vector + graph), all filtered to the scopes you hold. Then it runs the model with that context plus the user's tone note. Finally it records any durable memories the turn produced, filtered by your memory:write:<scope> scopes — unless you mark the turn ephemeral: true, in which case nothing is written.
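The three steps can be modelled locally to make the scope rules concrete. This is a sketch under stated assumptions, not the gateway's implementation: every type and function below is hypothetical, relevance ranking and the tone note are left out, and read access is represented as a plain set of scope names because this page does not say how read scopes are spelled. Only the `memory:write:<scope>` pattern and the `ephemeral` flag come from the text above.

```typescript
// Illustrative model of one gateway turn. All names here are hypothetical;
// the real gateway runs these steps server-side.
interface Memory {
  text: string;
  scope: string; // e.g. "reading" (a made-up scope name)
}

interface ModelOutput {
  reply: string;
  candidates: Memory[]; // durable memories the turn proposes
}

type Model = (context: Memory[], userMessage: string) => ModelOutput;

// Pull <scope> out of grants shaped like "memory:write:<scope>".
function writableScopes(granted: readonly string[]): Set<string> {
  const prefix = "memory:write:";
  return new Set(
    granted.filter((g) => g.startsWith(prefix)).map((g) => g.slice(prefix.length)),
  );
}

function runTurn(
  userMessage: string,
  store: readonly Memory[],
  readableScopes: ReadonlySet<string>, // assumption: how read access is represented
  granted: readonly string[],
  model: Model,
  ephemeral = false,
): { reply: string; recorded: Memory[] } {
  // 1. Build context. Ranking is omitted; only the scope filter is shown.
  const context = store.filter((m) => readableScopes.has(m.scope));
  // 2. Run the model with that context.
  const output = model(context, userMessage);
  // 3. Record durable memories the app may write, unless the turn is ephemeral.
  const writable = writableScopes(granted);
  const recorded = ephemeral
    ? []
    : output.candidates.filter((c) => writable.has(c.scope));
  return { reply: output.reply, recorded };
}

// Usage with a stub model standing in for the hosted one.
const store: Memory[] = [
  { text: "Enjoys systems papers", scope: "reading" },
  { text: "Allergic to peanuts", scope: "health" },
];
const stubModel: Model = (context, msg) => ({
  reply: `Saw ${context.length} memory; you asked: ${msg}`,
  candidates: [
    { text: "Asked for reading suggestions", scope: "reading" },
    { text: "Mentioned a doctor visit", scope: "health" },
  ],
});
const turn = runTurn(
  "What should I read next?",
  store,
  new Set(["reading"]),
  ["memory:write:reading"],
  stubModel,
);
console.log(turn.recorded.length); // 1 (the "health" candidate is filtered out)
```

The point of the sketch is the asymmetry: the model may propose a memory in any scope, but only candidates in scopes the app can write are kept, and an ephemeral turn keeps none.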

The hosted assistant is neutral
PAMbase does not impose a personality. The hosted model is a neutral assistant grounded in the user's memory; it honors the user's tone note if one is set. Shape its voice with your own appContext and framing — your app brings the persona.

Margin's recommendation turn

Here is Margin's recommendation turn. appContext passes app-side state into the turn (here, what the user is currently reading) so the recommendation can react to the moment:

margin/recommend.ts
const reply = await pambase.chat({
  intent: "app.companion.turn",
  userMessage: "What should I read next?",
  appContext: { currentArticle: "A primer on backpressure" },
});

reply.reply;            // → assistant text, e.g. a ranked list of suggestions
reply.toneTags;         // → tone hints from the broker
reply.memoryCandidates; // → memories recorded from this turn (if any)
reply.usage;            // → token + model accounting

Auto-record and tool-calling
By default chat() auto-records the memories it surfaces in memoryCandidates, so the user's memory keeps improving as they talk — pass ephemeral: true to opt a turn out. The gateway also handles tool-calling internally where supported, so a turn can look things up before replying.
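One way an app might use the opt-out is to decide per turn whether to set ephemeral. The sketch below assumes a hypothetical private-mode flag in the app, and the `ChatRequest` type and `buildTurn` helper are invented for illustration. Only the field names (intent, userMessage, appContext, ephemeral) and the intent string come from this page; their exact types are assumptions.

```typescript
// Hypothetical request shape: field names from this page, types assumed.
interface ChatRequest {
  intent: string;
  userMessage: string;
  appContext?: Record<string, unknown>;
  ephemeral?: boolean;
}

// Build a turn request, opting out of auto-record when the user is in a
// (hypothetical) private mode.
function buildTurn(
  userMessage: string,
  privateMode: boolean,
  appContext?: Record<string, unknown>,
): ChatRequest {
  const req: ChatRequest = { intent: "app.companion.turn", userMessage };
  if (appContext) req.appContext = appContext;
  if (privateMode) req.ephemeral = true; // ask that nothing be written for this turn
  return req;
}

const privateTurn = buildTurn("What should I read next?", true);
console.log(privateTurn.ephemeral); // true

const normalTurn = buildTurn("What should I read next?", false, {
  currentArticle: "A primer on backpressure",
});
console.log("ephemeral" in normalTurn); // false (the default auto-record applies)
```

The resulting object would then be passed to chat(); leaving ephemeral unset rather than setting it to false keeps the request on the documented default.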

Need tokens as they stream? Use chatStream(). The full call signature, streaming, and response shape are in Host the AI.

Next

  • Host the AI — the complete chat and streaming reference.
  • The user's memory — what grounds every gateway turn, and where the tone note comes from.
  • Permissions — the ai:host:* and memory:write scopes a turn relies on.