Providers

Provider registry. Maps “provider/model” strings to framework-compatible model runtimes.

Prerequisites

  • At least one agent defined under agents/ (see Agents).
  • One of the following:
    • A Veryfront Cloud token (VERYFRONT_API_TOKEN plus VERYFRONT_PROJECT_SLUG),
    • An API key for a direct provider (OPENAI_API_KEY, ANTHROPIC_API_KEY, or GOOGLE_API_KEY), or
    • A local inference target if you want to run without external providers.
For most projects, omit model entirely and let runtime defaults choose the right backend:
import { agent } from "veryfront/agent";

export default agent({
  system: "You are a helpful assistant.",
});
Verify provider resolution through any AG-UI route that uses this agent:
curl -N http://localhost:3000/api/ag-ui \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"id":"1","role":"user","parts":[{"type":"text","text":"Reply with the active inference mode if available."}]}]}'
In a client UI, useChat({ api: "/api/ag-ui" }) also exposes inferenceMode, so you can confirm whether the response used cloud, server-local, or browser inference. By convention:
  • Local development without cloud bootstrap uses local inference or explicit provider env vars.
  • Veryfront Cloud is selected automatically when VERYFRONT_API_TOKEN and project context such as VERYFRONT_PROJECT_SLUG are available.
  • VERYFRONT_DEFAULT_MODEL, VERYFRONT_DEFAULT_EMBEDDING_MODEL, and VERYFRONT_RAG_BACKEND are escape hatches, not required configuration.

Explicit provider environment variables

Variable             Provider
OPENAI_API_KEY       OpenAI
ANTHROPIC_API_KEY    Anthropic
GOOGLE_API_KEY       Google
OPENAI_BASE_URL      Custom OpenAI-compatible endpoint
Explicit provider env vars still work when you want to pin a provider directly:
import { agent } from "veryfront/agent";

export default agent({
  model: "openai/gpt-5.2", // OpenAI
  // model: "anthropic/claude-sonnet-4-6", // Anthropic
  // model: "google/gemini-2.5-flash",     // Google
  system: "You are a helpful assistant.",
});

Zero-config local AI

Chat works out of the box with no API keys. Inference resolves through a three-tier chain, and each tier falls back to the next when it is unavailable:
Cloud provider (API key set)
    ↓ fallback (no key)
Server-local (SmolLM2-135M via ONNX Runtime)
    ↓ fallback (ONNX unavailable, e.g. compiled binary)
Browser Worker (transformers.js from CDN)
  • Server-local: runs SmolLM2-135M with @huggingface/transformers and ONNX Runtime. The model is downloaded and cached on first use (~100MB).
  • Browser fallback: when the server can’t load ONNX (e.g. compiled binaries), the chat handler returns a 503 with NO_AI_AVAILABLE. The useChat hook detects this and loads the same model in a Web Worker via CDN.
The fallback is transparent: useChat exposes inferenceMode ("cloud", "server-local", or "browser") so your UI can adapt. To explicitly use a local model:
agent({ model: "local/smollm2-135m" });
// Also available: "local/smollm2-360m", "local/smollm2-1.7b"
To disable server-side local AI (e.g. in tests):
VERYFRONT_DISABLE_LOCAL_AI=1
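The three-tier chain in this section can be summarized as a small decision function. This is a minimal sketch, not the framework's implementation: pickInferenceMode and InferenceContext are hypothetical names invented for illustration, and in practice the browser tier is triggered client-side by the 503 NO_AI_AVAILABLE response rather than by a single server-side check.

```typescript
// Illustrative only: mirrors the documented fallback order.
// These names are hypothetical and are not veryfront exports.
type InferenceMode = "cloud" | "server-local" | "browser";

interface InferenceContext {
  // True when a cloud provider key (or Veryfront Cloud token) is configured.
  hasCloudProviderKey: boolean;
  // False when the server cannot load ONNX Runtime, e.g. in compiled binaries.
  serverOnnxAvailable: boolean;
}

function pickInferenceMode(ctx: InferenceContext): InferenceMode {
  if (ctx.hasCloudProviderKey) return "cloud";
  if (ctx.serverOnnxAvailable) return "server-local";
  return "browser";
}

// Example of adapting UI copy to the mode reported by useChat's inferenceMode.
function describeMode(mode: InferenceMode): string {
  switch (mode) {
    case "cloud":
      return "Answered by a cloud provider";
    case "server-local":
      return "Answered by the local model on the server";
    case "browser":
      return "Answered by the local model in your browser";
  }
}
```

A UI could pass the inferenceMode value from useChat to a function like describeMode to show users where the response was generated.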

Model strings

Agents reference models as "provider/model". The framework splits on the first /, so nested model IDs work:
// Veryfront Cloud explicit override
agent({ model: "veryfront-cloud/openai/gpt-5.2" });

// Direct provider override
agent({ model: "openai/gpt-5.2" });

// Nested model ID (e.g. an OpenRouter model routed through the openai provider)
agent({ model: "openai/meta-llama/llama-3.1-405b" });
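
The split-on-first-slash rule can be illustrated with a short sketch. splitModelString is a hypothetical helper written for this example, not a veryfront export, and the error handling for malformed strings is an assumption; the framework's own parser may behave differently.

```typescript
// Hypothetical helper illustrating the "split on the first /" rule.
// Everything after the first slash is kept intact as the model ID.
interface ParsedModel {
  provider: string;
  modelId: string;
}

function splitModelString(model: string): ParsedModel {
  const slash = model.indexOf("/");
  // Assumption: reject strings with no provider or no model part.
  if (slash <= 0 || slash === model.length - 1) {
    throw new Error(`Expected "provider/model", got "${model}"`);
  }
  return {
    provider: model.slice(0, slash),
    modelId: model.slice(slash + 1),
  };
}

console.log(splitModelString("openai/gpt-5.2"));
console.log(splitModelString("veryfront-cloud/openai/gpt-5.2"));
console.log(splitModelString("openai/meta-llama/llama-3.1-405b"));
```

Under this rule, "veryfront-cloud/openai/gpt-5.2" yields the provider "veryfront-cloud" with the model ID "openai/gpt-5.2", and "openai/meta-llama/llama-3.1-405b" yields the provider "openai" with the model ID "meta-llama/llama-3.1-405b".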

OpenAI-compatible services

Override the base URL to route through OpenRouter, Azure OpenAI, Ollama, or any OpenAI-compatible API:
OPENAI_API_KEY=<API_KEY>
OPENAI_BASE_URL=https://openrouter.ai/api/v1
Both apiKey and baseURL are resolved per-request, so each project in a multi-tenant setup can have its own configuration.

Custom provider registration

For providers not covered by env vars, use registerModelProvider():
import { registerModelProvider } from "veryfront/provider";

registerModelProvider("ollama", (id) => {
  // Return a framework-compatible model runtime for this model ID.
  // Prefer built-in providers when possible; custom registration is an
  // advanced interop surface for non-standard backends. The runtime must
  // implement the framework's generation hooks, including doGenerate()
  // and doStream().
  // createOllamaRuntime() is a placeholder for your own implementation.
  return createOllamaRuntime(id);
});

// Then use it
agent({ model: "ollama/llama3.2" });
The factory receives the model ID and must return a framework-compatible model runtime with the generation surface the framework expects, including doGenerate() and doStream().
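
While developing a custom factory, a quick shape check can catch a runtime that is missing one of the required hooks. The sketch below is an assumption-heavy illustration: only the method names doGenerate and doStream come from this page, their parameters and return types are not specified here and are typed loosely, and hasGenerationSurface and assertGenerationSurface are hypothetical helpers, not veryfront exports.

```typescript
// Assumption: only the hook names are known; signatures are left loose.
interface GenerationSurface {
  doGenerate: (...args: unknown[]) => unknown;
  doStream: (...args: unknown[]) => unknown;
}

// Type guard: true when the value exposes both hooks as functions.
function hasGenerationSurface(value: unknown): value is GenerationSurface {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.doGenerate === "function" &&
    typeof candidate.doStream === "function"
  );
}

// Throws a descriptive error instead of failing later during generation.
function assertGenerationSurface(
  value: unknown,
  providerName: string,
): GenerationSurface {
  if (!hasGenerationSurface(value)) {
    throw new Error(
      `Runtime for "${providerName}" must implement doGenerate() and doStream()`,
    );
  }
  return value;
}
```

A factory could wrap its return value in assertGenerationSurface(runtime, "ollama") during development so that an incomplete runtime fails at registration time rather than on the first request.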

Direct model resolution

To resolve a model string outside the agent system, call resolveModel() directly:
import { resolveModel } from "veryfront/provider";

const model = resolveModel("openai/gpt-5.2");
const cloudModel = resolveModel("veryfront-cloud/openai/gpt-5.2");

Verify it worked

Call your agent’s AG-UI route once provider env vars are set:
curl -N http://localhost:3000/api/ag-ui \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"id":"1","role":"user","parts":[{"type":"text","text":"Reply with the active inference mode if available."}]}]}'
A token stream that ends without an authentication error means the provider resolved. In a chat UI, the inferenceMode field on useChat reports whether the call used cloud, server-local, or browser inference.
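
The same check can be scripted instead of run through curl. The sketch below assumes a runtime with a global fetch (for example Node 18 or later) and reuses the request payload shape from the curl command above; buildAgUiBody and probeProvider are hypothetical helper names, and the status code is only a rough signal of whether a provider resolved.

```typescript
// Message shape copied from the curl payload in this section.
interface TextPart {
  type: "text";
  text: string;
}

interface UserMessage {
  id: string;
  role: "user";
  parts: TextPart[];
}

// Builds the JSON body for a single user message.
function buildAgUiBody(text: string, id = "1"): string {
  const message: UserMessage = {
    id,
    role: "user",
    parts: [{ type: "text", text }],
  };
  return JSON.stringify({ messages: [message] });
}

// Sends one request and returns the HTTP status without reading the stream.
// Per this page, a 503 indicates that no server-side AI was available.
async function probeProvider(baseUrl: string): Promise<number> {
  const res = await fetch(`${baseUrl}/api/ag-ui`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildAgUiBody("Reply with the active inference mode if available."),
  });
  // Release the streaming body, since only the status is needed here.
  await res.body?.cancel();
  return res.status;
}
```

Calling probeProvider("http://localhost:3000") against a running dev server returns the status code of the AG-UI route.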

Next

  • Middleware: add CORS, rate limiting, and logging
  • Agents: agents use providers for AI models