AGENT/WATCH // release stability dashboard

UNSTABLE 0–3.4 · RISKY 3.5–5.1 · MIXED 5.2–6.7 · MOSTLY STABLE 6.8–8.1 · STABLE 8.2–10 (0 is worst, 10 is best)
LIVE · cron every 20 min

Method
  • 8.2 – 10 STABLE — low observed release risk; 10 is the most stable end
  • 6.8 – 8.1 MOSTLY STABLE — real issues exist, but evidence points away from broad breakage
  • 5.2 – 6.7 MIXED — enough risk signal to check affected workflows before upgrading
  • 3.5 – 5.1 RISKY — core or broad failures are present in the issue stream
  • 0 – 3.4 UNSTABLE — 0 is the least stable end; severe/broad/core signals dominate
  • 5.0 COLLECTING SIGNAL — brand-new release, held neutral for the first 3 hours
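
The bands above amount to a threshold lookup. The following is a minimal sketch, not the dashboard's actual code: the names `BANDS` and `band_for` are hypothetical, and it assumes scores are rounded to one decimal place so that the gaps between bands (for example between 8.1 and 8.2) never occur.

```python
from typing import Optional

# Lower bound of each band, highest first (taken from the legend above).
BANDS = [
    (8.2, "STABLE"),
    (6.8, "MOSTLY STABLE"),
    (5.2, "MIXED"),
    (3.5, "RISKY"),
    (0.0, "UNSTABLE"),
]

HOLD_HOURS = 3.0  # brand-new releases are held neutral for the first 3 hours


def band_for(score: float, hours_since_release: Optional[float] = None) -> str:
    """Return the band label for a 0-10 score (hypothetical helper)."""
    if hours_since_release is not None and hours_since_release < HOLD_HOURS:
        return "COLLECTING SIGNAL"
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0 and 10")
    rounded = round(score, 1)  # assumption: one-decimal scores
    for lower, label in BANDS:
        if rounded >= lower:
            return label
    return "UNSTABLE"
```

Under these assumptions, a release that scores 5.0 after the hold period reads as RISKY, while the same 5.0 during the first 3 hours reads as COLLECTING SIGNAL.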

Agent Watch grades OpenClaw and Hermes Agent releases on a 0–10 real-world stability scale: 10 means most stable, 0 means least stable.

Inputs. Each GitHub issue is classified for sentiment, severity, impact scope, affected functionality, estimated affected user share, duplicate clustering, and workaround status. Core CLI/gateway/agent failures weigh more than niche adapter or provider-specific issues.
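
One way to picture a classified issue is as a small record with one field per input listed above. This is only an illustrative sketch: the type names, the field encodings, the scope weights, and the `impact_weight` formula are all assumptions, chosen to show core failures outweighing niche adapter or provider issues.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    CORE = "core"          # CLI / gateway / agent
    ADAPTER = "adapter"    # niche adapter
    PROVIDER = "provider"  # provider-specific


# Assumed weights: the source says only that core failures weigh more.
SCOPE_WEIGHT = {Scope.CORE: 1.0, Scope.ADAPTER: 0.4, Scope.PROVIDER: 0.3}


@dataclass(frozen=True)
class ClassifiedIssue:
    number: int                        # GitHub issue number
    sentiment: float                   # assumed range: -1.0 to 1.0
    severity: int                      # assumed range: 1 (minor) to 4 (severe)
    scope: Scope                       # impact scope
    functionality: str                 # affected functionality
    affected_user_share: float         # estimated share of users, 0.0 to 1.0
    duplicate_cluster: Optional[str]   # cluster id, or None if unclustered
    has_workaround: bool               # confirmed workaround exists


def impact_weight(issue: ClassifiedIssue) -> float:
    """Hypothetical impact weight: scope x severity x affected share."""
    return SCOPE_WEIGHT[issue.scope] * (issue.severity / 4) * issue.affected_user_share
```

With these assumed weights, a severe core failure affecting half of users gets a weight of 0.5, while the same issue confined to one provider gets 0.15.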

Formula. The grade starts from impact-weighted issue risk, discounts confirmed workarounds and duplicate clusters, then blends in ratings from signed-in community members as real-environment evidence. Brand-new releases are held at a neutral 5.0 for the first 3 hours while signal accumulates. Once the next 3 versions have shipped, older scores that are no longer changing are frozen as of their latest input change.
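
The shape of this formula can be sketched in a few lines. The page does not publish the real constants or the exact blending rule, so every number below (the discounts, the risk scale, the community weight) and every name (`IssueRisk`, `grade`) is an assumption made for illustration; only the 3-hour neutral hold and the 0–10 range come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

NEUTRAL = 5.0               # from the text: neutral score for new releases
HOLD_HOURS = 3.0            # from the text: hold period for new releases
WORKAROUND_DISCOUNT = 0.5   # assumed: a confirmed workaround halves the risk
DUPLICATE_DISCOUNT = 0.25   # assumed: repeat issues in a cluster count a quarter
RISK_SCALE = 2.0            # assumed: grade points lost per unit of risk
COMMUNITY_WEIGHT = 0.3      # assumed: share of the grade from community ratings


@dataclass
class IssueRisk:
    impact: float                     # impact-weighted risk, assumed 0.0 to 1.0
    has_workaround: bool
    cluster_id: Optional[str] = None  # duplicate cluster, if any


def grade(issues: Sequence[IssueRisk],
          community_ratings: Sequence[float],
          hours_since_release: float) -> float:
    """Hypothetical 0-10 grade following the steps described above."""
    if hours_since_release < HOLD_HOURS:
        return NEUTRAL

    seen_clusters = set()
    total_risk = 0.0
    for issue in issues:
        risk = issue.impact
        if issue.has_workaround:
            risk *= WORKAROUND_DISCOUNT
        if issue.cluster_id is not None:
            if issue.cluster_id in seen_clusters:
                risk *= DUPLICATE_DISCOUNT  # later duplicates add less risk
            seen_clusters.add(issue.cluster_id)
        total_risk += risk

    score = max(0.0, 10.0 - RISK_SCALE * total_risk)
    if community_ratings:
        community = sum(community_ratings) / len(community_ratings)
        score = (1 - COMMUNITY_WEIGHT) * score + COMMUNITY_WEIGHT * community
    return round(min(10.0, max(0.0, score)), 1)
```

In this sketch a release with no issues and no ratings grades 10.0 once the hold period ends, and each discount only ever reduces an issue's contribution to the risk total.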

Updated every 20 minutes via cron, using public GitHub data and LLM classification.
