Keftek

KWBench: Measuring Unprompted Problem Recognition in Knowledge Work

The best models solve only 28% of professional problems when they are not told the task type, which points to unprompted diagnosis as the missing capability for automating knowledge work.

Applied AI, AI Agents
Read original on arXiv