Blog

The Future of Facts

T.R. Davidson, A. Surina, C. Gulcehre

Language models are quickly becoming the default interface to factual knowledge—but they don't treat factual queries uniformly. We trace the life cycle of synthetic facts through training to find where such "factual generation-verification gaps" emerge, then confirm the same patterns in natural experiments on frontier models.

TL;DR: Training mechanisms behind factual generation-verification gaps
MAY 2026 | 10 MIN READ

Think Together

Many critical problems involve distributed information that must be synthesized under severe time constraints. This necessitates a shift away from massive models toward small, specialized agents capable of dynamic communication. We label these scenarios "BotMapReduce" problems and explore how high-speed, multi-model coordination could shape the future of AI development.

TL;DR: Many important problems require agentic cooperation
MAR 2026 | 10 MIN READ

The Collaboration Gap

T.R. Davidson, A. Fourney, S. Amershi, R. West, E. Horvitz, E. Kamar

🎓 arXiv paper

Imagine hanging up a frame in your living room. You and a friend naturally divide the work, clarify confusing instructions, and adapt if a screw is missing. We expect this type of fluid, on-the-fly teamwork from humans, but can today’s AI agents do the same? Well, not quite.

TL;DR: Exploring the challenges of dynamic agentic collaboration
NOV 2025 | 10 MIN READ

Self-Recognition in Language Models

T.R. Davidson, V. Surkov, V. Veselovsky, G. Russo, R. West, C. Gulcehre

🎓 arXiv paper 🤖 GitHub repository 📰 IEEE Spectrum article

A rapidly growing number of applications are being built on just a few frontier LMs. This dependency might introduce novel security risks if LMs develop self-recognition capabilities. Inspired by human verification methods, we assess self-recognition in LMs using model-generated "security questions".

TL;DR: Novel insights on self-recognition and position bias in LMs
SEP 2024 | 10 MIN READ
