Edge of Context

Practical AI engineering by Slava Dubrov. I write about the parts of AI systems that have to survive production: agent runtimes, memory, security, retrieval, evaluation, LLM infrastructure, and the developer tooling around them.

Start here

Agent architecture: reasoning loops, memory, tool use, security, long-running runtimes, and harness engineering.
Retrieval and evaluation: RAG evaluation, agent evals from traces, search ranking, schema-guided agent memory, and context engineering.
LLM infrastructure: fine-tuning, quantization, vLLM structured outputs, LoRAX serving, and LLM engineering concepts.
Developer tooling: uv on macOS, pyproject.toml, and local LLMs on macOS.

Browse the blog

The blog index lists every post and can be filtered by topic or series. Engineering the Agentic Stack walks through the agent architecture part by part. For author background, conference talks, and topic coverage, see About Me.