-
2026-03-12
WTF is the Recovering Difference Softmax Algorithm?
Duolingo's notification algorithm isn't just A/B testing with extra steps. It's a bandit that knows when to shut up — and that's the hard part.
-
2026-03-12
Signal Fusion: How Semantic, Relational, and Direct Signals Combine to Make Recommendations That Don't Suck
Every recommendation system that works well is fusing multiple signal types. The ones that don't understand this ship vibes-based retrieval and wonder why users leave. A taxonomy of signals, how they combine, and what the SOTA ecosystem gets right and wrong.
-
2026-03-12
Optimising Recall and Precision in LangSmith Experiments
Your retrieval pipeline returns results. But does it return the right results? How New Computer used LangSmith's experiment framework to achieve 50% higher recall and 40% higher precision in agentic memory retrieval — and what you can steal from their approach.
-
2026-03-11
WTF is LightGBM?
Gradient-boosted decision trees for people who've never trained one — how they work, when they win, when they don't, and why tabular foundation models are about to make this conversation more complicated.
-
2026-03-11
WTF is Two-Tower Recommendation?
The architecture behind every recommendation system that actually works at scale — why splitting the model in half is the key to serving billions of candidates in milliseconds.
-
2026-03-10
Through-Line Detection: What LLMs See That Rule Systems Can't
Cross-modal pattern detection across behavioral data, free-text, and metadata — the capability that actually justifies using an LLM in a recommendation system.
-
2026-03-10
Multi-Window Temporal Aggregation for Behavioral Trajectory
The same metric across 7, 30, and 90 days tells you where someone is heading, not just where they are. Here's why that distinction is the whole game.
-
2026-03-10
From Single LLM Call to Deep Agent: An Honest Migration Path
Start with one function call. Add skills when the prompt gets too long. A no-framework guide to building agents that actually ship.
-
2026-03-10
Signal Stability Classification: Inference Cost-Benefit in Hybrid Recommendation Systems
Not all behavioral signals deserve the same compute budget. Genre affinity changes over weeks; session mood changes in seconds. Classify by stability, infer by tier, and stop pretending daily batch is the answer to everything.
-
2026-03-10
Query-Theme-Keyed Search Expansion
Two users search 'sleep' and get different results — with no LLM at query time. How pre-computed, theme-keyed expansion terms turn a flat search into something that actually knows you.
-
2026-03-10
Pre-computed Personalization: The Offline Agent Pattern
Why your personalization agent should never run at request time. The LLM does its heavy lifting on a schedule; your product serves the artifacts. Zero latency, infinite scale.
-
2026-03-10
Negative Signals as First-Class Citizens in Recommendation
What users don't do matters more than what they do. Most recommendation systems are built entirely on applause. Here's why the silence is louder.
-
2026-03-10
The Multi-Artifact Output Pattern
One LLM call, multiple output shapes for multiple consumers. Design your schema like a protocol, not an afterthought.
-
2026-03-10
Intention-Action Gaps as Behavioral Signals
What you say you'll do vs what you actually do — the gap is the insight.
-
2026-03-10
Filter Bubble Mitigation in Personalized Systems
How to avoid turning personalization into an algorithmic echo chamber — and why the fix isn't as simple as randomly throwing garbage at your users.
-
2026-03-10
Empathy Architecture: Designing LLM Outputs That Don't Feel Like Surveillance
"After you close the laptop" vs "Your Evening Wind-Down" — why the difference matters, and how to build systems that infer emotion without weaponising it.
-
2026-03-09
KARL: Knowledge Agents via Reinforcement Learning
Databricks trained an RL-based search agent on GLM 4.5 Air that beats Claude 4.6 and GPT 5.2 on enterprise knowledge retrieval — at a fraction of the cost.