zup

Zup is an open source reliability agent for your production systems.

It watches for problems, figures out what's going on using current state and past incidents, and fixes things, automatically or with your approval.

built around the OODA (observe, orient, decide, act) loop:

┌─────────────┐          ┌─────────────┐
│   OBSERVE   │ ───────▶ │   ORIENT    │
└─────────────┘          └─────────────┘
       ▲                        │
       │                        │
       │                        ▼
┌─────────────┐          ┌─────────────┐
│     ACT     │ ◀─────── │   DECIDE    │
└─────────────┘          └─────────────┘
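
To make the loop concrete, here is a minimal sketch of one pass through the four stages in Python. This is an illustration only, not Zup's actual code or API: every name in it (`Observation`, `Decision`, `observe`, `orient`, `decide`, `act`, `run_once`) is hypothetical, and the "past incidents" store is just a dictionary mapping a service name to a fix that worked before.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Observation:
    """Hypothetical health reading for one service."""
    service: str
    healthy: bool


@dataclass
class Decision:
    """Hypothetical remediation plus whether a human must approve it."""
    action: str
    needs_approval: bool


def observe(status: dict[str, bool]) -> list[Observation]:
    # OBSERVE: turn raw health data into observations.
    return [Observation(name, ok) for name, ok in status.items()]


def orient(
    observations: list[Observation], past_incidents: dict[str, str]
) -> dict[str, Optional[str]]:
    # ORIENT: pair each unhealthy service with a known fix, if one exists.
    return {
        o.service: past_incidents.get(o.service)
        for o in observations
        if not o.healthy
    }


def decide(context: dict[str, Optional[str]]) -> list[Decision]:
    # DECIDE: known fixes run automatically; unknown problems need approval.
    decisions = []
    for service, known_fix in context.items():
        if known_fix is not None:
            decisions.append(Decision(f"{known_fix} {service}", needs_approval=False))
        else:
            decisions.append(Decision(f"investigate {service}", needs_approval=True))
    return decisions


def act(decisions: list[Decision], approve: Callable[[Decision], bool]) -> list[str]:
    # ACT: return the actions that are automatic or were approved.
    return [d.action for d in decisions if not d.needs_approval or approve(d)]


def run_once(
    status: dict[str, bool],
    past_incidents: dict[str, str],
    approve: Callable[[Decision], bool],
) -> list[str]:
    """One pass through observe -> orient -> decide -> act."""
    return act(decide(orient(observe(status), past_incidents)), approve)


if __name__ == "__main__":
    status = {"api": True, "db": False, "cache": False}
    past = {"db": "restart"}
    # Nobody approves anything here, so only the known fix is applied.
    print(run_once(status, past, approve=lambda d: False))  # ['restart db']
```

In this sketch the approval callback is what separates "automatically" from "with your approval": a fix seen in a past incident runs on its own, while anything new waits for a yes.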

key features:

· runs continuously: watches system health and fixes problems (if you want it to)

· plugin architecture: use the built-in plugins or write your own

· alerting integration: send alerts to Zup and check on its work via the API

· fully open source: read every line, fork it, run it however you want
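
As a rough idea of what "write your own" plugin could look like, the sketch below shows a generic plugin registry in Python. It is an assumption about the general pattern, not Zup's real plugin interface: `Plugin`, `Registry`, `DiskSpacePlugin`, and the `check` method are all hypothetical names chosen for this example.

```python
from typing import Protocol


class Plugin(Protocol):
    """Hypothetical plugin contract: a name plus a health check."""
    name: str

    def check(self) -> bool: ...


class Registry:
    """Hypothetical registry that holds plugins and runs their checks."""

    def __init__(self) -> None:
        self._plugins: dict[str, Plugin] = {}

    def register(self, plugin: Plugin) -> None:
        # Reject duplicates so one plugin cannot silently replace another.
        if plugin.name in self._plugins:
            raise ValueError(f"plugin already registered: {plugin.name}")
        self._plugins[plugin.name] = plugin

    def run_checks(self) -> dict[str, bool]:
        # Map each plugin name to the result of its health check.
        return {name: plugin.check() for name, plugin in self._plugins.items()}


class DiskSpacePlugin:
    """Example custom plugin: healthy while enough disk space is free."""
    name = "disk-space"

    def __init__(self, free_fraction: float, minimum: float = 0.1) -> None:
        self.free_fraction = free_fraction
        self.minimum = minimum

    def check(self) -> bool:
        return self.free_fraction >= self.minimum


if __name__ == "__main__":
    registry = Registry()
    registry.register(DiskSpacePlugin(free_fraction=0.05))
    print(registry.run_checks())  # {'disk-space': False}
```

The point of the pattern is that the core loop only knows about the small contract (a name and a check), so built-in and custom plugins are registered and run the same way.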