arXiv | April 15, 2026 | Worth watching

KWBench: Measuring Unprompted Problem Recognition in Knowledge Work

The best models solve only 28% of professional problems when they are not told the task type, revealing unprompted diagnosis as the missing capability for knowledge-work automation.

Applied AI | AI Agents
