# LLM Providers
Zup wraps the Vercel AI SDK so you can use any supported LLM provider with the same interface. Supported providers include Anthropic, OpenAI, Google Gemini, Mistral, Groq, xAI, Cohere, Perplexity, Together AI, DeepInfra, Cerebras, OpenRouter, Azure OpenAI, Amazon Bedrock, Google Vertex AI, and any OpenAI-compatible endpoint.
LLM configuration is optional. Many plugins (like `http-monitor`) work without an LLM. Plugins that require LLM access (like `investigation-orienter`) check for `ctx.llm` at runtime.
## Configuration

Set the `llm` field in your agent options:
### Anthropic

```ts
import { createAgent } from 'zupdev';

const agent = await createAgent({
  name: 'my-agent',
  llm: {
    provider: 'anthropic',
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: 'claude-sonnet-4-6',
  },
  plugins: [...],
});
```

| Field | Type | Required | Description |
|---|---|---|---|
| `provider` | `'anthropic'` | Yes | Selects the Anthropic provider. |
| `apiKey` | `string` | Yes | Anthropic API key. |
| `model` | `string` | Yes | Model name (e.g., `'claude-sonnet-4-6'`, `'claude-haiku-4-20250514'`). |
| `baseURL` | `string` | No | Custom API endpoint. Useful for proxies or API gateways. |
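The non-null assertion (`!`) on `process.env.ANTHROPIC_API_KEY` only silences the type checker; if the variable is unset, the provider still receives `undefined` and fails later as an authentication error. One defensive pattern is to resolve keys up front. `requireEnv` below is a hypothetical helper, not part of the zupdev API:

```typescript
// Hypothetical helper -- not part of zupdev. Resolves an environment
// variable or throws immediately, so a missing key fails at startup
// with a clear message instead of as a later authentication error.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in an agent config: apiKey: requireEnv('ANTHROPIC_API_KEY')
```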
### OpenAI

```ts
const agent = await createAgent({
  name: 'my-agent',
  llm: {
    provider: 'openai',
    apiKey: process.env.OPENAI_API_KEY!,
    model: 'gpt-4o',
  },
  plugins: [...],
});
```

| Field | Type | Required | Description |
|---|---|---|---|
| `provider` | `'openai'` | Yes | Selects the OpenAI provider. |
| `apiKey` | `string` | Yes | OpenAI API key. |
| `model` | `string` | Yes | Model name (e.g., `'gpt-4o'`, `'gpt-4o-mini'`). |
| `baseURL` | `string` | No | Custom API endpoint. |
| `organization` | `string` | No | OpenAI organization ID. |
### Google Gemini

```ts
llm: {
  provider: 'google',
  apiKey: process.env.GOOGLE_API_KEY!,
  model: 'gemini-2.0-flash',
}
```

### Mistral

```ts
llm: {
  provider: 'mistral',
  apiKey: process.env.MISTRAL_API_KEY!,
  model: 'mistral-large-latest',
}
```

### Groq

```ts
llm: {
  provider: 'groq',
  apiKey: process.env.GROQ_API_KEY!,
  model: 'llama-3.3-70b-versatile',
}
```
### xAI (Grok)

```ts
llm: {
  provider: 'xai',
  apiKey: process.env.XAI_API_KEY!,
  model: 'grok-2',
}
```
### OpenRouter

```ts
llm: {
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY!,
  model: 'anthropic/claude-sonnet-4',
}
```
### Azure OpenAI

```ts
llm: {
  provider: 'azure',
  apiKey: process.env.AZURE_API_KEY!,
  model: 'gpt-4o',
  resourceName: 'my-resource',
  apiVersion: '2024-12-01-preview', // optional
}
```
### Amazon Bedrock

```ts
llm: {
  provider: 'amazon-bedrock',
  model: 'anthropic.claude-sonnet-4-6-v1:0',
  region: 'us-east-1',
  accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
  secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
}
```
### Google Vertex AI

```ts
llm: {
  provider: 'google-vertex',
  model: 'gemini-2.0-flash',
  project: 'my-gcp-project',
  location: 'us-central1',
}
```
### Other providers

Cohere, Perplexity, Together AI, DeepInfra, and Cerebras all follow the same simple pattern:

```ts
llm: {
  provider: 'cohere', // or 'perplexity', 'togetherai', 'deepinfra', 'cerebras'
  apiKey: process.env.COHERE_API_KEY!,
  model: 'command-a-08-2025',
}
```
### OpenAI-compatible

For any provider that exposes an OpenAI-compatible API (Ollama, vLLM, LiteLLM, etc.):

```ts
llm: {
  provider: 'openai-compatible',
  baseURL: 'http://localhost:11434/v1',
  apiKey: 'ollama',
  model: 'llama3.1',
}
```
## LLM capability

When an LLM is configured, `ctx.llm` is populated with an `LLMCapability` object that provides four methods:

```ts
type LLMCapability = {
  provider: LLMProvider;
  config: LLMConfig;

  generateText(prompt: string, options?: GenerateOptions): Promise<TextResult>;
  generateStructured<T>(prompt: string, schema: ZodSchema<T>, options?: GenerateOptions): Promise<T>;
  streamText(prompt: string, options?: GenerateOptions): AsyncIterable<TextChunk>;
  chat(messages: ChatMessage[], options?: ChatOptions): Promise<ChatResult>;
};
```
## Usage patterns

### Generate text

The simplest usage: send a prompt, get text back.
```ts
const result = await ctx.llm.generateText(
  'Summarize the current system health based on these metrics: ...',
  {
    temperature: 0.3,
    maxTokens: 500,
    system: 'You are an SRE agent analyzing system health.',
  }
);

console.log(result.text);  // The generated text
console.log(result.usage); // { promptTokens, completionTokens, totalTokens }
console.log(result.model); // The actual model that responded
```

`TextResult`:

```ts
type TextResult = {
  text: string;
  usage?: TokenUsage;
  finishReason?: 'stop' | 'length' | 'content_filter' | 'tool_calls';
  model?: string;
};
```
### Generate structured output

Use a Zod schema to get validated, typed output from the LLM. The AI SDK uses native structured output where the provider supports it (OpenAI, Google) and tool-based extraction as a fallback, with automatic Zod schema validation.
```ts
import { z } from 'zod';

const HealthSummary = z.object({
  status: z.enum(['healthy', 'degraded', 'down']),
  affectedServices: z.array(z.string()),
  severity: z.enum(['low', 'medium', 'high', 'critical']),
  recommendation: z.string(),
});

type HealthSummary = z.infer<typeof HealthSummary>;

const summary: HealthSummary = await ctx.llm.generateStructured(
  'Analyze these observations and determine system health: ...',
  HealthSummary,
  {
    temperature: 0.1, // Lower temperature for more deterministic structured output
    system: 'You are an SRE agent. Respond with a structured health assessment.',
  }
);

// summary is fully typed as HealthSummary
console.log(summary.status);           // 'degraded'
console.log(summary.affectedServices); // ['api-gateway', 'auth-service']
```

If the LLM returns invalid output or the response fails Zod validation, `generateStructured` throws an error.
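Because `generateStructured` throws on validation failure, callers often retry once or twice before giving up, since a fresh sample frequently passes the schema. A minimal retry sketch (`withRetries` is illustrative, not part of the zupdev API):

```typescript
// Illustrative retry wrapper -- not part of zupdev. `generate` stands in
// for a thunk such as:
//   () => ctx.llm.generateStructured(prompt, HealthSummary, options)
async function withRetries<T>(
  generate: () => Promise<T>,
  attempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await generate();
    } catch (err) {
      lastError = err; // e.g. schema validation failed; try again
    }
  }
  throw lastError;
}
```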
### Stream text

For long-running generation or real-time output, use `streamText` to get an async iterable of text chunks:
```ts
const stream = ctx.llm.streamText(
  'Explain the root cause of this outage in detail: ...',
  {
    maxTokens: 2000,
    system: 'You are an SRE agent performing post-incident analysis.',
  }
);

for await (const chunk of stream) {
  process.stdout.write(chunk.text);

  if (chunk.done) {
    console.log('\n--- Generation complete ---');
  }
}
```

`TextChunk`:

```ts
type TextChunk = {
  text: string;
  done: boolean; // true on the final chunk
};
```
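When you also need the full text after streaming finishes (for logging or storage), a small collector over the chunk iterable works. This is a sketch, not a zupdev built-in; it restates the `TextChunk` shape so the snippet is self-contained:

```typescript
// Sketch of a stream collector -- not a zupdev built-in. Concatenates
// every chunk's text into the final string.
type TextChunk = { text: string; done: boolean };

async function collectStream(stream: AsyncIterable<TextChunk>): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    full += chunk.text;
  }
  return full;
}

// Usage sketch: const fullText = await collectStream(ctx.llm.streamText(prompt));
```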
### Chat with tool calling

The `chat` method supports multi-turn conversations and LLM tool calling. This is the foundation for the investigation system.
```ts
const tools = [
  {
    name: 'query_logs',
    description: 'Search application logs',
    inputSchema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Log search query' },
        timeRange: { type: 'string', description: 'Time range (e.g., "1h", "30m")' },
      },
      required: ['query'],
    },
  },
  {
    name: 'get_metrics',
    description: 'Fetch system metrics',
    inputSchema: {
      type: 'object',
      properties: {
        metric: { type: 'string', description: 'Metric name' },
        period: { type: 'string', description: 'Time period' },
      },
      required: ['metric'],
    },
  },
];

const messages: ChatMessage[] = [
  { role: 'user', content: 'Investigate why the API latency spiked at 14:30 UTC.' },
];

const result = await ctx.llm.chat(messages, {
  tools,
  system: 'You are an SRE agent. Use the available tools to investigate.',
  maxTokens: 4096,
});

// Check if the LLM wants to call tools
if (result.stopReason === 'tool_use') {
  for (const toolCall of result.toolCalls) {
    console.log(`Tool call: ${toolCall.name}(${JSON.stringify(toolCall.input)})`);
    // Execute the tool and feed results back...
  }
}
```

`ChatMessage` types:

```ts
type ChatMessage =
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls?: ToolCall[] }
  | { role: 'tool'; toolCallId: string; content: string };
```

`ChatResult`:

```ts
type ChatResult = {
  content: string;
  toolCalls: ToolCall[];
  stopReason: 'end_turn' | 'tool_use' | 'max_tokens' | 'stop_sequence';
  usage?: TokenUsage;
  model?: string;
};
```

`ToolDefinition`:

```ts
type ToolDefinition = {
  name: string;
  description: string;
  inputSchema: Record<string, unknown>; // JSON Schema object
};
```
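To close the "execute the tool and feed results back" loop, each tool call is answered with a `role: 'tool'` message before calling `chat` again. A sketch of that dispatch step, assuming each `ToolCall` carries an `id` that matches `toolCallId` and using a hypothetical handler map (neither is a zupdev API):

```typescript
// Illustrative dispatch step -- not part of zupdev. Maps each tool call
// to a local handler and wraps the result as a 'tool' message that can
// be appended to `messages` before the next chat() round.
type ToolCall = { id: string; name: string; input: Record<string, unknown> };
type ToolMessage = { role: 'tool'; toolCallId: string; content: string };
type ToolHandler = (input: Record<string, unknown>) => Promise<string>;

async function dispatchToolCalls(
  toolCalls: ToolCall[],
  handlers: Record<string, ToolHandler>
): Promise<ToolMessage[]> {
  const results: ToolMessage[] = [];
  for (const call of toolCalls) {
    const handler = handlers[call.name];
    const content = handler
      ? await handler(call.input)
      : JSON.stringify({ error: `Unknown tool: ${call.name}` });
    results.push({ role: 'tool', toolCallId: call.id, content });
  }
  return results;
}
```

Appending the assistant turn plus these tool messages to the conversation and calling `chat` again repeats until the stop reason is no longer `'tool_use'`.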
## GenerateOptions reference

All generation methods accept an optional `GenerateOptions` object:

| Field | Type | Default | Description |
|---|---|---|---|
| `maxTokens` | `number` | `4096` | Maximum tokens to generate. |
| `temperature` | `number` | Provider default | Sampling temperature (0-2). Lower values produce more deterministic output. |
| `topP` | `number` | Provider default | Top-P / nucleus sampling. |
| `stop` | `string[]` | — | Stop sequences that halt generation. |
| `timeout` | `number` | — | Request timeout in milliseconds. |
| `system` | `string` | — | System prompt prepended to the conversation. |
`ChatOptions` extends `GenerateOptions` with an additional `tools` field:

| Field | Type | Description |
|---|---|---|
| `tools` | `ToolDefinition[]` | Tool definitions the LLM can call. |
## Using LLM in plugins

Plugins access the LLM through `ctx.llm`. Always check for its existence first, since LLM configuration is optional:
```ts
import { definePlugin, createOrienter } from 'zupdev';

export const myPlugin = () =>
  definePlugin({
    id: 'my-plugin',

    orienters: {
      analyze: createOrienter({
        name: 'llm-analysis',
        description: 'Use LLM to analyze observations',
        orient: async (observations, ctx) => {
          if (!ctx.llm) {
            return {
              source: 'my-plugin/analyze',
              findings: ['LLM not configured -- skipping analysis'],
              confidence: 0.3,
            };
          }

          const result = await ctx.llm.generateText(
            `Analyze these observations: ${JSON.stringify(observations)}`,
            { temperature: 0.2 }
          );

          return {
            source: 'my-plugin/analyze',
            findings: [result.text],
            confidence: 0.8,
          };
        },
      }),
    },
  });
```
## Investigation orienter

The `investigation-orienter` plugin is a production example of LLM-powered orientation. It uses `ctx.llm.chat` with tool calling to run a multi-turn investigation loop within the Orient phase:
```ts
import { investigationOrienter } from 'zupdev/plugins/investigation-orienter';

const agent = await createAgent({
  llm: {
    provider: 'anthropic',
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: 'claude-sonnet-4-6',
  },
  plugins: [
    investigationOrienter({
      triggerSeverity: 'warning', // Only investigate warning+ observations
      maxTurns: 15,               // Max tool-calling rounds
      tools: [
        {
          name: 'query_logs',
          description: 'Search logs',
          parameters: z.object({ query: z.string() }),
          execute: async (params) => {
            // Query your logging system
            return JSON.stringify(results);
          },
        },
      ],
    }),
  ],
});
```

The investigation orienter checks whether any observation meets the `triggerSeverity` threshold. If so, it builds a prompt from the observations and runs a tool-calling loop. The LLM's final response is parsed into a `SituationAssessment` with extracted findings, contributing factors, and impact assessment.
## Creating an LLM provider directly

If you need LLM access outside of an agent context, you can create a provider directly:
```ts
import { createLLMProvider, createLLMCapability } from 'zupdev';

// Create a raw provider
const provider = createLLMProvider({
  provider: 'anthropic',
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: 'claude-sonnet-4-6',
});

const result = await provider.generateText('Hello, world!');

// Or create a full LLMCapability (same object that appears on ctx.llm)
const llm = createLLMCapability({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'gpt-4o',
});

const structured = await llm.generateStructured('...', myZodSchema);
```
## Token usage tracking

All methods that return `TextResult` or `ChatResult` include optional usage information:
```ts
type TokenUsage = {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
};
```

Token counts are provided by the upstream API and may not be available for all providers (especially some OpenAI-compatible ones).
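Since `usage` is optional, aggregating consumption across calls needs to tolerate missing values. A small accumulator sketch (a hypothetical helper, not part of zupdev; it restates `TokenUsage` so the snippet is self-contained):

```typescript
// Hypothetical accumulator -- not part of the zupdev API. Sums usage
// across calls, treating absent usage (e.g. some OpenAI-compatible
// backends) as zero so running totals stay well-defined.
type TokenUsage = { promptTokens: number; completionTokens: number; totalTokens: number };

const emptyUsage: TokenUsage = { promptTokens: 0, completionTokens: 0, totalTokens: 0 };

function addUsage(total: TokenUsage, next?: TokenUsage): TokenUsage {
  if (!next) return total;
  return {
    promptTokens: total.promptTokens + next.promptTokens,
    completionTokens: total.completionTokens + next.completionTokens,
    totalTokens: total.totalTokens + next.totalTokens,
  };
}

// Usage sketch: total = addUsage(total, result.usage);
```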