Blog

Thoughts on serverless GenAI, framework design, and building production AI systems.

From Use Case to Evaluation Pipeline in 10 Minutes

Every LLM deployment needs evaluation. Here's how to auto-generate complete evaluation infrastructure from a simple use case description.

evaluation mlops quality

February 14, 2025

Cost-Aware GenAI: Model Routing for Serverless

How to reduce GenAI costs by 60-80% using intelligent model routing that matches request complexity to the cheapest capable model.

cost-optimization serverless genai

January 19, 2025

Why LangChain Fails in Lambda (And What Does)

Existing LLM frameworks assume long-running servers with persistent memory. Here's why that breaks in AWS Lambda and how to solve it.

serverless aws-lambda llm