model is a "provider/model" string.
The provider registry resolves each string to one runtime:
- Veryfront Cloud
- a direct vendor such as OpenAI, Anthropic, or Google
- an OpenAI-compatible service such as OpenRouter
- a local model
Omit model in most agents and let runtime conventions pick the backend.
Prerequisites
- At least one agent defined under agents/ (see Agents).
- One of the following:
  - A Veryfront Cloud token (VERYFRONT_API_TOKEN plus VERYFRONT_PROJECT_SLUG),
  - An API key for a direct provider (OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY), or
  - A local inference target if you want to run without external providers.
Runtime conventions (recommended)
For most projects, omit model entirely and let runtime defaults choose the
right backend:
useChat() also exposes inferenceMode so you can confirm
whether the response used cloud, server-local, or browser inference.
By convention:
- local development without cloud bootstrap uses local inference or explicit provider env vars
- Veryfront Cloud is selected automatically when VERYFRONT_API_TOKEN and project context such as VERYFRONT_PROJECT_SLUG are available
- VERYFRONT_DEFAULT_MODEL, VERYFRONT_DEFAULT_EMBEDDING_MODEL, and VERYFRONT_RAG_BACKEND are escape hatches, not required config
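The conventions above can be read as a small decision function. The sketch below is only an illustration: pickBackend and the Backend labels are hypothetical names, and the precedence of Veryfront Cloud over direct provider keys is an assumption, not something the framework documents here.

```typescript
// Hypothetical sketch of the conventions; the real selection logic lives
// inside the framework and may differ.
type Env = Record<string, string | undefined>;
type Backend = "veryfront-cloud" | "direct-provider" | "local";

const PROVIDER_KEYS = ["OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"];

function pickBackend(env: Env): Backend {
  // Cloud needs both the token and project context.
  if (env["VERYFRONT_API_TOKEN"] && env["VERYFRONT_PROJECT_SLUG"]) {
    return "veryfront-cloud";
  }
  // Assumption: any explicit provider key wins over local inference.
  if (PROVIDER_KEYS.some((key) => Boolean(env[key]))) {
    return "direct-provider";
  }
  // Nothing configured: fall back to local inference.
  return "local";
}

console.log(pickBackend({ VERYFRONT_API_TOKEN: "t", VERYFRONT_PROJECT_SLUG: "demo" }));
console.log(pickBackend({ ANTHROPIC_API_KEY: "k" }));
console.log(pickBackend({}));
```

Passing the environment in as a plain object keeps the function testable without touching the real process environment.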
Set provider environment variables
Set only the variables for the provider you use:
- OPENAI_API_KEY for OpenAI.
- ANTHROPIC_API_KEY for Anthropic.
- GOOGLE_API_KEY for Google.
- OPENAI_BASE_URL for OpenAI-compatible services.
Zero-config local AI
Chat works out of the box with no API keys. When no cloud provider key is set, the framework automatically falls back through a three-tier inference chain:
- Server-local: runs SmolLM2-135M with @huggingface/transformers and ONNX Runtime. The model is downloaded and cached on first use (~100MB).
- Browser fallback: when the server can’t load ONNX (e.g. compiled binaries), the chat handler returns a 503 with NO_AI_AVAILABLE. The useChat hook detects this and loads the same model in a Web Worker via CDN.
useChat exposes inferenceMode ("cloud", "server-local", or "browser") so your UI can adapt.
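As a rough illustration of the fallback signal described above, the sketch below maps a chat response to an inference mode. It is not the useChat implementation: ChatResponseInfo, its errorCode and servedBy fields, and resolveInferenceMode are hypothetical names chosen for this example. Only the 503 status, the NO_AI_AVAILABLE code, and the three mode strings come from the text.

```typescript
// Hypothetical sketch of the client-side decision; not framework code.
type InferenceMode = "cloud" | "server-local" | "browser";

interface ChatResponseInfo {
  status: number;
  errorCode?: string; // assumed location of the error code
  servedBy?: "cloud" | "server-local"; // assumed hint from the server
}

function resolveInferenceMode(res: ChatResponseInfo): InferenceMode {
  // 503 + NO_AI_AVAILABLE: the server could not load a model,
  // so the client should run the model in the browser instead.
  if (res.status === 503 && res.errorCode === "NO_AI_AVAILABLE") {
    return "browser";
  }
  if (res.status >= 400) {
    throw new Error(`Chat request failed with status ${res.status}`);
  }
  if (!res.servedBy) {
    throw new Error("Successful response did not report its backend");
  }
  return res.servedBy;
}

// Example of adapting the UI to the reported mode.
function modeLabel(mode: InferenceMode): string {
  const labels: Record<InferenceMode, string> = {
    "cloud": "Cloud model",
    "server-local": "Local model (server)",
    "browser": "Local model (in browser)",
  };
  return labels[mode];
}

console.log(modeLabel(resolveInferenceMode({ status: 503, errorCode: "NO_AI_AVAILABLE" })));
```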
To explicitly use a local model:
Model strings
Agents reference models as "provider/model". The framework splits on the first /, so nested model IDs work:
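A minimal sketch of the first-slash split, assuming nothing about the framework's internals: parseModelString is a hypothetical helper written for this example, and the model strings are sample values rather than a list of supported models.

```typescript
// Illustrative only: parseModelString is not a framework export.
interface ParsedModel {
  provider: string;
  modelId: string;
}

function parseModelString(model: string): ParsedModel {
  // Split on the FIRST slash only, so the model ID may contain slashes.
  const slash = model.indexOf("/");
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`Expected "provider/model", got "${model}"`);
  }
  return {
    provider: model.slice(0, slash),
    modelId: model.slice(slash + 1),
  };
}

console.log(parseModelString("openai/gpt-4o"));
console.log(parseModelString("openrouter/meta-llama/llama-3.1-8b-instruct"));
```

Using indexOf rather than split("/") is what keeps a nested ID such as meta-llama/llama-3.1-8b-instruct intact after the provider prefix is removed.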
OpenAI-compatible services
Override the base URL to route through OpenRouter, Azure OpenAI, Ollama, or any OpenAI-compatible API:
apiKey and baseURL are resolved per-request, so each project in a multi-tenant setup can have its own configuration.
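To make per-request resolution concrete, the sketch below looks up apiKey and baseURL from a per-project environment each time it is called. resolveConfig, chatCompletionsUrl, and the tenant objects are hypothetical, and falling back to https://api.openai.com/v1 when OPENAI_BASE_URL is unset is an assumption about the default.

```typescript
// Hypothetical multi-tenant lookup; names are illustrative, not framework API.
interface OpenAICompatibleConfig {
  apiKey: string;
  baseURL: string;
}

type ProjectEnv = Record<string, string | undefined>;

// Assumed default when no override is configured.
const DEFAULT_BASE_URL = "https://api.openai.com/v1";

function resolveConfig(env: ProjectEnv): OpenAICompatibleConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is not set for this project");
  }
  // Strip trailing slashes so URL joining stays predictable.
  const baseURL = (env["OPENAI_BASE_URL"] ?? DEFAULT_BASE_URL).replace(/\/+$/, "");
  return { apiKey, baseURL };
}

function chatCompletionsUrl(config: OpenAICompatibleConfig): string {
  return `${config.baseURL}/chat/completions`;
}

// Two projects with different backends, resolved independently per call.
const projectA: ProjectEnv = {
  OPENAI_API_KEY: "key-a",
  OPENAI_BASE_URL: "https://openrouter.ai/api/v1",
};
const projectB: ProjectEnv = {
  OPENAI_API_KEY: "key-b",
  OPENAI_BASE_URL: "http://localhost:11434/v1/",
};

console.log(chatCompletionsUrl(resolveConfig(projectA)));
console.log(chatCompletionsUrl(resolveConfig(projectB)));
```

Because nothing is cached at module level, changing one project's settings does not affect requests made for another.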
Custom provider registration
For providers not covered by env vars, use registerModelProvider():
The registered provider returns a model object that implements doGenerate() and doStream().
Direct model resolution
For cases outside the agent system:
Verify it worked
Call your agent’s AG-UI route once provider env vars are set:
The inferenceMode field on useChat reports
whether the call used cloud, server-local, or browser inference.