At a glance

Credentials

Set these per environment. See Connect an integration.
| Variable | Required | Description |
| --- | --- | --- |
| REPLICATE_API_TOKEN | Yes | Replicate API token (starts with r8_) |

Setup

  1. Create a Replicate account: Go to https://replicate.com and sign up (GitHub sign-in supported). New accounts get a small amount of free usage before billing is required.
  2. Create an API token: Open https://replicate.com/account/api-tokens, give the token a descriptive name (e.g. ‘Veryfront Integration’), and create it.
  3. Store the token: Copy the token and add it to your .env file as REPLICATE_API_TOKEN=r8_…
  4. Verify access: Run the List Models tool to confirm the token works. A 401 means the token is wrong or revoked.
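Steps 3 and 4 can also be checked outside the integration. The following is a minimal sketch, assuming the token is exposed as the REPLICATE_API_TOKEN environment variable and that the models listing lives at https://api.replicate.com/v1/models with Bearer authentication, as described in Replicate's HTTP reference. The helper names (load_token, build_list_models_request, token_is_valid) are hypothetical and are not part of the integration.

```python
import os
import urllib.error
import urllib.request

# Assumed base URL, taken from Replicate's public HTTP API reference.
API_BASE = "https://api.replicate.com/v1"


def load_token(env=os.environ):
    """Read REPLICATE_API_TOKEN and sanity-check its r8_ prefix."""
    token = env.get("REPLICATE_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError("REPLICATE_API_TOKEN is not set")
    if not token.startswith("r8_"):
        raise RuntimeError("REPLICATE_API_TOKEN should start with r8_")
    return token


def build_list_models_request(token):
    """Build (but do not send) an authenticated GET /models request."""
    return urllib.request.Request(
        f"{API_BASE}/models",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def token_is_valid(token):
    """Return True on success and False on HTTP 401; re-raise anything else."""
    try:
        with urllib.request.urlopen(build_list_models_request(token), timeout=30):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 401:
            # 401 means the token is wrong or revoked.
            return False
        raise
```

Calling token_is_valid(load_token()) performs one network request. The prefix check only catches copy-paste mistakes, and a correctly shaped token can still be revoked, so the 401 check remains the real test.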
  • Predictions are billed per second of compute; costs vary widely by model and hardware.
  • Create Prediction sends the header Prefer: wait=60 so the result is returned synchronously when possible. Long-running models still come back with status ‘starting’ or ‘processing’; poll with Get Prediction until the prediction finishes.
  • When creating predictions, use the version ID from the latest_version.id field of the Get Model response.
Provider API reference: https://replicate.com/docs/reference/http
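The create-then-poll flow described in the notes above could be sketched as follows. This is an illustration under stated assumptions, not the integration's actual implementation: the endpoint path /predictions, the request body shape, and the terminal statuses (succeeded, failed, canceled) are taken from Replicate's HTTP reference, while the function names and the injected fetch callback are hypothetical.

```python
import json
import time
import urllib.request

# Assumed base URL and terminal statuses, per Replicate's HTTP API reference.
API_BASE = "https://api.replicate.com/v1"
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def latest_version_id(model):
    """Pull latest_version.id out of a Get Model response dict."""
    return model["latest_version"]["id"]


def build_create_request(token, version_id, model_input, wait_seconds=60):
    """Build a POST /predictions request that asks for a synchronous reply."""
    body = json.dumps({"version": version_id, "input": model_input}).encode()
    return urllib.request.Request(
        f"{API_BASE}/predictions",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Prefer": f"wait={wait_seconds}",
        },
        method="POST",
    )


def wait_for_prediction(prediction, fetch, interval=2.0, max_polls=150,
                        sleep=time.sleep):
    """Poll until the prediction reaches a terminal status.

    fetch(prediction_id) is a caller-supplied function (hypothetical here)
    that returns the latest prediction dict, e.g. via Get Prediction.
    """
    polls = 0
    while prediction["status"] not in TERMINAL_STATUSES:
        if polls >= max_polls:
            raise TimeoutError(f"prediction {prediction['id']} still running")
        sleep(interval)
        prediction = fetch(prediction["id"])
        polls += 1
    return prediction
```

Passing fetch and sleep as parameters keeps the polling loop independent of any HTTP client, so it can be exercised with canned responses. If the synchronous reply already carries a terminal status, wait_for_prediction returns it without polling.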

Tools

| Tool | Access | Description |
| --- | --- | --- |
| List Models | Read | List public models available on Replicate |
| Get Model | Read | Get details about a model, including its latest version ID |
| Create Prediction | Write | Run a model by creating a prediction from a version ID and input object |
| Get Prediction | Read | Get the status and output of a prediction |
| Cancel Prediction | Write | Cancel a running prediction |
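For orientation, the tools above plausibly correspond to the following endpoints in Replicate's HTTP API. The mapping is an assumption based on the provider reference, since this page does not state which endpoint each tool calls. TOOL_ENDPOINTS and endpoint_for are hypothetical names used only for this sketch.

```python
# Assumed base URL, taken from Replicate's public HTTP API reference.
API_BASE = "https://api.replicate.com/v1"

# Assumed tool-to-endpoint mapping: (HTTP method, path template).
TOOL_ENDPOINTS = {
    "List Models": ("GET", "/models"),
    "Get Model": ("GET", "/models/{owner}/{name}"),
    "Create Prediction": ("POST", "/predictions"),
    "Get Prediction": ("GET", "/predictions/{prediction_id}"),
    "Cancel Prediction": ("POST", "/predictions/{prediction_id}/cancel"),
}


def endpoint_for(tool, **params):
    """Return (method, full URL) for a tool, filling in path parameters."""
    method, path = TOOL_ENDPOINTS[tool]
    return method, API_BASE + path.format(**params)
```

For example, endpoint_for("Get Model", owner="some-owner", name="some-model") yields a GET against the model's detail URL, whose response carries the latest_version.id needed by Create Prediction.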

Example prompts

  • Find a Replicate model for a task I describe and show me its latest version ID and inputs.
  • Run a Replicate model with inputs I provide and report the output when it finishes.