ClawWork is an economic accountability framework for AI agents developed by the Data Intelligence Lab at the University of Hong Kong (HKUDS). The core concept: each agent starts with a budget, earns income by completing professional tasks, and pays for its own token usage, so it must remain economically solvent across a benchmark of 220 tasks spanning 44 professional sectors.
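To make the budget-and-solvency loop concrete, here is a minimal Python sketch of how such a ledger could work. The names (`LedgerEntry`, `attempt_task`, the starting budget) are illustrative assumptions, not the actual ClawWork API.

```python
from dataclasses import dataclass

@dataclass
class LedgerEntry:
    task_id: str
    income: float       # payment earned for the completed task (USD)
    token_cost: float   # inference cost charged for the attempt (USD)

def run_benchmark(tasks, attempt_task, starting_budget=100.0):
    """Run tasks until they are exhausted or the agent goes insolvent.

    `attempt_task` is a hypothetical callable returning (income, token_cost)
    for one task; the real ClawWork interfaces may differ.
    """
    balance = starting_budget
    ledger = []
    for task in tasks:
        income, token_cost = attempt_task(task)
        balance += income - token_cost
        ledger.append(LedgerEntry(task["id"], income, token_cost))
        if balance <= 0:  # insolvent: the agent can no longer pay for inference
            break
    return balance, ledger
```

The key design point is that every task attempt has a cost, so an agent that burns tokens on low-value work can fail the benchmark even if its individual task outputs are correct.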
The benchmark measures not just task completion rates but economic efficiency: how much value an agent generates relative to its inference costs. Top-performing agents reach an hourly-equivalent earning rate above $1,500, a figure that quantifies the ROI of using AI agents for professional work. The framework supports Claude Sonnet 4.6, Gemini 3.1 Pro, and Qwen-3.5-Plus, enabling direct cross-model economic comparisons.
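A plausible way to compute these two metrics is sketched below; the exact formulas ClawWork uses are not specified here, so treat the function names and definitions as assumptions for illustration.

```python
def economic_efficiency(total_income: float, total_token_cost: float) -> float:
    """Net value generated per dollar of inference spend (assumed definition)."""
    return (total_income - total_token_cost) / total_token_cost

def hourly_equivalent(total_income: float, total_token_cost: float,
                      wall_clock_hours: float) -> float:
    """Net earnings normalized to one hour of agent run time (assumed definition)."""
    return (total_income - total_token_cost) / wall_clock_hours

# Illustrative numbers only: $420 earned, $20 of token spend, 0.25 hours of run time
# gives 20x efficiency and a $1,600/hour equivalent, in the range of the
# $1,500+/hour figure quoted for top agents.
print(economic_efficiency(420.0, 20.0))      # 20.0
print(hourly_equivalent(420.0, 20.0, 0.25))  # 1600.0
```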
ClawWork represents a significant evolution in agent evaluation methodology. Rather than measuring success as binary task completion, it introduces economic pressure that rewards agents for being efficient, prioritizing high-value tasks, and avoiding unnecessary computation. For organizations evaluating OpenClaw or other agent frameworks for real business deployment, ClawWork provides a rigorous, economically grounded benchmark beyond standard accuracy metrics.