Fly Machines
The fly-machines plugin monitors Fly.io machines across multiple apps. It tracks machine state and health checks, and it detects deployments by watching for instance ID and image digest changes. Unlike platforms with a dedicated deployments API, Fly.io deployments are inferred from machine-level changes: when a machine’s `instance_id` changes, the plugin treats it as a new deployment.
Installation
```ts
import { createAgent } from 'zupdev';
import { flyMachines } from 'zupdev/plugins/fly-machines';

const agent = await createAgent({
  name: 'fly-agent',
  plugins: [
    flyMachines({
      auth: { token: process.env.FLY_API_TOKEN! },
      apps: [
        {
          name: 'my-app',
          serviceName: 'My App',
        },
      ],
    }),
  ],
});
```
Requirements
A Fly.io API token is required. Generate one with `fly tokens create` from the Fly CLI.
Plugin options
| Field | Type | Default | Description |
|---|---|---|---|
| `auth` | `FlyAuthConfig` | (none) | Required. Authentication configuration. |
| `auth.token` | `string` | (none) | Required. Fly.io API token. |
| `apps` | `FlyAppConfig[]` | (none) | Required. Apps to monitor. At least one app must be configured. |
| `pollIntervalMs` | `number` | `60000` | Polling interval in milliseconds. |
| `maxMachinesPerApp` | `number` | `50` | Maximum machines to track per app. |
| `apiBaseUrl` | `string` | `'https://api.machines.dev'` | Fly Machines API base URL. |
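The table can be read as a TypeScript shape with defaults applied on top. The sketch below is an illustration only: the interfaces are inferred from the tables on this page, and `FlyMachinesOptions` and `resolveOptions` are hypothetical names, not exports of the plugin.

```typescript
// Sketch only: shapes inferred from the options tables, not copied from the plugin source.
interface FlyAuthConfig {
  token: string;
}

interface FlyAppConfig {
  name: string;
  serviceName: string;
  regions?: string[];
  metadata?: Record<string, string>;
}

// Hypothetical name for the plugin's options object.
interface FlyMachinesOptions {
  auth: FlyAuthConfig;
  apps: FlyAppConfig[];
  pollIntervalMs?: number;
  maxMachinesPerApp?: number;
  apiBaseUrl?: string;
}

// Hypothetical helper: fills in the documented defaults and enforces the
// two documented requirements (a token and at least one app).
function resolveOptions(options: FlyMachinesOptions): Required<FlyMachinesOptions> {
  if (!options.auth.token) {
    throw new Error('auth.token is required');
  }
  if (options.apps.length === 0) {
    throw new Error('at least one app must be configured');
  }
  return {
    auth: options.auth,
    apps: options.apps,
    pollIntervalMs: options.pollIntervalMs ?? 60000,
    maxMachinesPerApp: options.maxMachinesPerApp ?? 50,
    apiBaseUrl: options.apiBaseUrl ?? 'https://api.machines.dev',
  };
}

const resolvedOptions = resolveOptions({
  auth: { token: 'example-token' },
  apps: [{ name: 'my-app', serviceName: 'My App' }],
});
```

Omitting the three optional fields, as above, leaves the documented defaults in place.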
App configuration
Each app describes a Fly.io application to monitor:
| Field | Type | Required | Description |
|---|---|---|---|
| `name` | `string` | Yes | Fly.io app name (as shown in `fly apps list`). |
| `serviceName` | `string` | Yes | Human-readable service name for SRE context (used in observations and findings). |
| `regions` | `string[]` | No | Filter machines by specific regions (e.g., `['ord', 'cdg']`). If not set, all regions are included. |
| `metadata` | `Record<string, string>` | No | Filter machines by metadata key-value pairs. |
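To make the two filters concrete, here is a minimal sketch of how `regions` and `metadata` could be applied to a machine list. It is not the plugin's code: `MachineSummary`, `AppFilter`, and `matchesFilter` are hypothetical names, and the rule that a machine must match every configured metadata pair is an assumption.

```typescript
// Minimal machine shape for this sketch; the real API returns many more fields.
interface MachineSummary {
  id: string;
  region: string;
  metadata: Record<string, string>;
}

interface AppFilter {
  regions?: string[];
  metadata?: Record<string, string>;
}

// Assumption: an unset or empty `regions` list means "all regions", and a
// machine must match every configured metadata key-value pair.
function matchesFilter(machine: MachineSummary, filter: AppFilter): boolean {
  const regions = filter.regions ?? [];
  const regionOk = regions.length === 0 || regions.indexOf(machine.region) !== -1;
  const wanted = filter.metadata ?? {};
  const metadataOk = Object.keys(wanted).every(
    (key) => machine.metadata[key] === wanted[key],
  );
  return regionOk && metadataOk;
}

const sampleMachines: MachineSummary[] = [
  { id: 'm1', region: 'ord', metadata: { role: 'worker' } },
  { id: 'm2', region: 'cdg', metadata: { role: 'web' } },
  { id: 'm3', region: 'nrt', metadata: { role: 'worker' } },
];

// Keeps only m1: m2 has the wrong role and m3 is outside the listed regions.
const workersInOrdOrCdg = sampleMachines.filter((machine) =>
  matchesFilter(machine, { regions: ['ord', 'cdg'], metadata: { role: 'worker' } }),
);
```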
OODA phase contributions
Observe: fly-machine-status
The observer polls the Fly Machines API for each configured app and produces three types of observations:
Deployment events (`fly-machines/deployment`): Emitted when machines are updated (detected via `instance_id` changes). Each deployment event includes:
- App and service names
- New image digest, repository, and tag
- Machines and regions affected
- Deployment status: `completed` (all machines started), `in_progress`, `partial` (mixed success/failure), or `failed`
- Severity: `info` for completed, `critical` for failed, `error` for partial
Machine state (`fly-machines/machine`): One observation per machine with:
- Machine ID, name, state, region, and instance ID
- Image reference (digest, repository, tag)
- Guest configuration (CPU, memory)
- Health check results
- Most recent machine event
- Severity: `warning` if the machine is stopped, destroyed, or has failing health checks; `info` otherwise
API errors (`fly-machines/error`): Emitted when the Fly API call fails for an app, with `warning` severity.
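The severity rules above can be sketched as two small functions. This is an illustration under stated assumptions rather than the plugin's implementation: the function and type names are hypothetical, the severity for `in_progress` is not documented here (the sketch assumes `info`), and any check whose status is not `passing` is assumed to count as failing.

```typescript
type Severity = 'info' | 'warning' | 'error' | 'critical';
type DeploymentStatus = 'completed' | 'in_progress' | 'partial' | 'failed';

function deploymentSeverity(deploymentStatus: DeploymentStatus): Severity {
  switch (deploymentStatus) {
    case 'completed':
      return 'info';
    case 'in_progress':
      // Assumption: the page does not state a severity for in_progress.
      return 'info';
    case 'partial':
      return 'error';
    case 'failed':
      return 'critical';
  }
}

// Only the fields the severity rule needs.
interface MachineHealth {
  state: string;
  checks: Record<string, { status: string }>;
}

function machineSeverity(machine: MachineHealth): Severity {
  const inactive = machine.state === 'stopped' || machine.state === 'destroyed';
  // Assumption: any status other than 'passing' is treated as a failing check.
  const failingCheck = Object.keys(machine.checks).some(
    (checkName) => machine.checks[checkName].status !== 'passing',
  );
  return inactive || failingCheck ? 'warning' : 'info';
}
```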
Orient: analyze-fly-machines
Analyzes Fly.io observations to provide deployment and health context:
- Reports deployment status (completed, in progress, partial, failed) with machine counts and affected regions
- Groups machines by service and reports running/total counts per region
- Identifies stopped or suspended machines
- Reports machines with failing health checks
- Sets `contributingFactor` when deployment failures or health check failures are detected
- Confidence: `0.85`
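The per-region grouping in the list above might look like the following sketch. The names `ObservedMachine`, `RegionCount`, and `countByServiceAndRegion` are hypothetical, and treating the `started` state as "running" is an assumption.

```typescript
interface ObservedMachine {
  serviceName: string;
  region: string;
  state: string;
}

interface RegionCount {
  running: number;
  total: number;
}

// Result shape: service name -> region -> running/total counts.
function countByServiceAndRegion(
  machines: ObservedMachine[],
): Record<string, Record<string, RegionCount>> {
  const result: Record<string, Record<string, RegionCount>> = {};
  for (const machine of machines) {
    const byRegion = result[machine.serviceName] ?? (result[machine.serviceName] = {});
    const count = byRegion[machine.region] ?? (byRegion[machine.region] = { running: 0, total: 0 });
    count.total += 1;
    // Assumption: a machine counts as running when its state is 'started'.
    if (machine.state === 'started') {
      count.running += 1;
    }
  }
  return result;
}

const regionCounts = countByServiceAndRegion([
  { serviceName: 'API', region: 'ord', state: 'started' },
  { serviceName: 'API', region: 'ord', state: 'stopped' },
  { serviceName: 'API', region: 'cdg', state: 'started' },
  { serviceName: 'Worker', region: 'ord', state: 'started' },
]);
```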
REST API endpoints
All endpoints require authentication by default.
GET /fly/apps
Lists all configured Fly.io apps with machine status summary.
Response:
```json
{
  "apps": [
    {
      "name": "my-app",
      "serviceName": "My App",
      "configuredRegions": ["ord", "cdg"],
      "activeRegions": ["ord", "cdg"],
      "lastFetchTime": "2025-06-15T10:30:00.000Z",
      "machineCount": 4,
      "runningCount": 4,
      "isConsistent": true,
      "currentImageDigest": "sha256:abc123...",
      "imageDigestCount": 1
    }
  ]
}
```
The `isConsistent` field indicates whether all machines are running the same image digest. When it is `false`, a deployment may be in progress or partially failed.
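One way to derive `isConsistent` and `imageDigestCount` from a list of machine digests is sketched below. `summarizeDigests` is a hypothetical helper, not the plugin's code, and treating an app with no machines as consistent is an assumption.

```typescript
interface DigestSummary {
  imageDigestCount: number;
  isConsistent: boolean;
}

function summarizeDigests(digests: string[]): DigestSummary {
  // Keep the first occurrence of each digest to count distinct values.
  const distinct = digests.filter((digest, index) => digests.indexOf(digest) === index);
  return {
    imageDigestCount: distinct.length,
    // Assumption: zero machines is reported as consistent.
    isConsistent: distinct.length <= 1,
  };
}

const steadyState = summarizeDigests(['sha256:aaa', 'sha256:aaa', 'sha256:aaa']);
const midRollout = summarizeDigests(['sha256:aaa', 'sha256:bbb', 'sha256:aaa']);
```

Here `steadyState` is consistent with one digest, while `midRollout` reports two digests and is not consistent.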
GET /fly/apps/:appName/machines
Returns detailed machine information for a specific app.
Response:
```json
{
  "app": {
    "name": "my-app",
    "serviceName": "My App"
  },
  "machines": [
    {
      "id": "e784079b449483",
      "name": "my-app-machine-1",
      "state": "started",
      "region": "ord",
      "instanceId": "01HXYZ...",
      "imageRef": {
        "repository": "registry.fly.io/my-app",
        "tag": "deployment-01HXYZ",
        "digest": "sha256:abc123..."
      },
      "guest": {
        "cpu_kind": "shared",
        "cpus": 1,
        "memory_mb": 256
      },
      "checks": {
        "http": {
          "name": "http",
          "status": "passing",
          "output": "HTTP 200",
          "updated_at": "2025-06-15T10:29:00.000Z"
        }
      },
      "recentEvents": [
        {
          "type": "start",
          "status": "started",
          "timestamp": "2025-06-15T10:00:00.000Z"
        }
      ],
      "createdAt": "2025-06-01T00:00:00.000Z",
      "updatedAt": "2025-06-15T10:00:00.000Z"
    }
  ],
  "lastFetchTime": "2025-06-15T10:30:00.000Z"
}
```
Deployment detection
The plugin detects deployments by comparing each machine’s `instance_id` against its previously known value. When an instance ID changes, the machine has been updated, typically as part of a deployment.
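The comparison can be sketched as a diff between the last known instance IDs and the current poll result. This is a minimal illustration, not the plugin's code: `MachineSnapshot` and `detectUpdatedMachines` are hypothetical names, and skipping machines seen for the first time is an assumption.

```typescript
interface MachineSnapshot {
  id: string;
  instanceId: string;
}

// `previous` maps machine ID -> instance ID seen on the last poll.
function detectUpdatedMachines(
  previous: Record<string, string | undefined>,
  current: MachineSnapshot[],
): MachineSnapshot[] {
  return current.filter((machine) => {
    const known = previous[machine.id];
    // Assumption: a machine with no previous record is not reported as updated.
    return known !== undefined && known !== machine.instanceId;
  });
}

const lastSeen: Record<string, string | undefined> = { a: 'inst-1', b: 'inst-2' };
const updatedMachines = detectUpdatedMachines(lastSeen, [
  { id: 'a', instanceId: 'inst-1' }, // unchanged
  { id: 'b', instanceId: 'inst-3' }, // instance ID changed
  { id: 'c', instanceId: 'inst-9' }, // first time seen
]);
```

Only machine `b` ends up in `updatedMachines`.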
Machines with the same new image digest are grouped into a single deployment event. The deployment status is determined by the states of the affected machines:
| Status | Condition |
|---|---|
| `completed` | All affected machines are in the `started` state |
| `in_progress` | Machines are still transitioning (not all started, none failed) |
| `partial` | Some machines started, some failed (`destroyed` or `replacing`) |
| `failed` | All affected machines failed |
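The grouping and the status table can be sketched together as follows. The names are hypothetical and the logic is only one reading of the table: the case where some machines failed but none have started is not covered there, and this sketch assumes it is reported as `partial`.

```typescript
type DeploymentStatus = 'completed' | 'in_progress' | 'partial' | 'failed';

interface UpdatedMachine {
  id: string;
  state: string;
  imageDigest: string;
}

// States the table lists as failed.
const FAILED_STATES = ['destroyed', 'replacing'];

// Group updated machines so each image digest yields one deployment event.
function groupByDigest(machines: UpdatedMachine[]): Record<string, UpdatedMachine[]> {
  const groups: Record<string, UpdatedMachine[]> = {};
  for (const machine of machines) {
    const group = groups[machine.imageDigest] ?? (groups[machine.imageDigest] = []);
    group.push(machine);
  }
  return groups;
}

// Expects a non-empty group, which groupByDigest always produces.
function classifyDeployment(group: UpdatedMachine[]): DeploymentStatus {
  const started = group.filter((m) => m.state === 'started').length;
  const failed = group.filter((m) => FAILED_STATES.indexOf(m.state) !== -1).length;
  if (started === group.length) return 'completed';
  if (failed === group.length) return 'failed';
  // Assumption: any failure short of total failure is reported as partial.
  if (failed > 0) return 'partial';
  return 'in_progress';
}

const digestGroups = groupByDigest([
  { id: 'a', state: 'started', imageDigest: 'sha256:new' },
  { id: 'b', state: 'destroyed', imageDigest: 'sha256:new' },
  { id: 'c', state: 'started', imageDigest: 'sha256:other' },
]);
const newDigestStatus = classifyDeployment(digestGroups['sha256:new']);
const otherDigestStatus = classifyDeployment(digestGroups['sha256:other']);
```

With this input, the `sha256:new` group is `partial` and the `sha256:other` group is `completed`.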
Full example
```ts
import { createAgent } from 'zupdev';
import { flyMachines } from 'zupdev/plugins/fly-machines';

const agent = await createAgent({
  name: 'fly-monitor',
  mode: 'continuous',
  loopInterval: 30000,
  api: {
    port: 3000,
    auth: {
      apiKeys: [{ key: process.env.API_KEY!, name: 'admin' }],
    },
  },
  plugins: [
    flyMachines({
      auth: { token: process.env.FLY_API_TOKEN! },
      pollIntervalMs: 60000,
      maxMachinesPerApp: 50,
      apps: [
        {
          name: 'api-prod',
          serviceName: 'API (Production)',
          regions: ['ord', 'cdg', 'nrt'],
        },
        {
          name: 'worker-prod',
          serviceName: 'Background Worker',
          metadata: { role: 'worker' },
        },
        {
          name: 'api-staging',
          serviceName: 'API (Staging)',
          regions: ['ord'],
        },
      ],
    }),
  ],
});

const server = agent.startApi({ port: 3000 });
await agent.start();
```
Three Fly.io apps are monitored across multiple regions. The API production app is filtered to three specific regions; the worker app filters by metadata. Deployments are detected automatically when machines update their instance IDs. Machine health checks are tracked and surfaced as observations. The REST API on port 3000 provides on-demand machine and deployment status.