AI in Practice

Improving Agent Systems & AI Reasoning

DeepSeek-R1, OpenAI o1 and o3, test-time compute scaling, model post-training — and what the shift toward Reasoning Language Models actually means for the people building agent systems on top of them.

March 2025

Research

Introducing Layer Enhanced Classification (LEC)

A novel approach to lightweight safety classification that outperforms GPT-4o on content safety and prompt injection detection, using fewer than 100 training examples and a 0.5B-parameter model. Here's how it works and why it matters.

December 2024

AI in Practice

Understanding Techniques for Solving GenAI Challenges

Pre-training, fine-tuning, RAG, prompt engineering — these aren't just buzzwords. This piece breaks down the actual mechanics of each approach and helps you choose the right technique for your specific problem.

May 2024

AI in Practice

Are Language Models Benchmark Savants or Real-World Problem Solvers?

Leaderboard scores tell you how a model performs on a benchmark. They tell you much less about how it performs on your problem. This piece examines the gap between evaluation and deployment — and what it means for how we assess AI progress.

February 2024