Amnesiac generalist behemoths are not the future of language models
You don’t need a generic genius. You need a specialist learner.
Jul 28, 2025
We're not here to ride the AI hype bubble. We believe robust, fundamental technical breakthroughs applied to real-world systems are what really move the needle.
The Bitter Lesson of LLM Evals
Turning expert judgment into a compounding moat. Because in LLM evals, scaling care beats scaling compute.
Jul 13, 2025
Do transformers notice their own mistakes? Finding a linear hallucination detector inside LLMs
A linear signal in LLMs reveals hallucinations, can be detected by a frozen observer, and can be steered with a single vector.
May 8, 2025
Resurrecting the salmon: seeing clearer inside LLMs with domain-specific SAEs
A powerful, efficient, and domain-robust strategy for safeguarding medical-text generation
Feb 15, 2025
Why mechanistic interpretability needs a paradigm inversion
The conventional scaling paradigm for language models themselves may be fundamentally misaligned with interp.
Jan 13, 2025
From training to deployment, we help you launch a specialist LLM that outperforms generic models, adapts automatically, and runs reliably at scale.