Rank #1227 · on radar since 2026-07-03

Workspace-Bench

A benchmark for self-evolving agents on realistic, large-scale file workspaces

Tags: large-language-models, dataset, llm, autonomous-agents, +4 more

Momentum

42.2

24h: – · 7d: –

Why it's ranked

Every score decomposes into published factors, with the same math applied to every tool, paid or not.

Velocity (weighted, cohort-normalized)	0.438
Signal decay	0.995
Corroboration	1.000
Quality gate	1.000
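
As a rough illustration of how a score might decompose into the factors listed above, the sketch below multiplies them and scales the result by 100. The multiplicative rule, the scale of 100, and the function name `combine` are assumptions made for illustration, not the site's published methodology. Under this rule the factors give about 43.6 rather than the displayed 42.2, so the real formula evidently differs, perhaps in weighting or normalization.

```python
from math import prod

# Factor values copied from the table above.
factors = {
    "velocity": 0.438,
    "signal_decay": 0.995,
    "corroboration": 1.000,
    "quality_gate": 1.000,
}


def combine(factors, scale=100.0):
    """Hypothetical combination rule: scaled product of all factors.

    This is an assumed formula for illustration only; it is not the
    ranking site's actual method.
    """
    return scale * prod(factors.values())


score = combine(factors)
print(round(score, 1))
```

A product is used here because gate-style factors equal to 1.000 then leave the score unchanged, while a gate of 0 would zero it out; an additive rule would not behave that way.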

github · forks: 3 (latest) · 2 snapshots

github · stars: 39 (latest) · 2 snapshots