About me
I'm Slava Dubrov, also known as Viacheslav Dubrov. I build production ML and AI systems, and I lead teams that do the same.
These days I work on the Agent Execution team at HubSpot: LLM deployment, fine-tuning, evaluation, and the runtime pieces that make agents behave in production. Before that, I worked on HubSpot's retrieval, grounding, and memory infrastructure, so I have opinions about what happens after the demo works.
Why read this blog
I write the notes I wish I had when I was debugging production AI systems. A lot of AI work looks clean in a notebook and gets messy when real users, latency, permissions, data drift, and cost enter the picture. This blog focuses on that version.
Relevant background:
- HubSpot Agent Execution: LLM fine-tuning, inference optimization, agent evaluation, and safety guardrails in production.
- HubSpot Embedding Hub and Context Layer: retrieval, grounding, and memory infrastructure for AI agents.
- Wayfair: fraud and scam detection systems, plus embedding systems I built and led, with about $4M in annual savings.
- Speaker at World Agentic AI Summit Berlin 2026: "Engineering the Agentic Stack".
- PhD in AI diagnostics, peer-reviewed papers, and patents.
- Work across data pipelines, training, evaluation, deployment, and the operational details between them.
- Production ML on AWS and GCP across batch, streaming, and real-time systems.
- Open source code, tutorials, and write-ups for people building AI systems that have to run.
Speaking
- "Engineering the Agentic Stack" - World Agentic AI Summit, Berlin (2026). Production architecture for agentic AI systems: Cognitive Engine, Cortex (memory architecture), and Schema-Guided Reasoning.
What I write about
Mostly production failures and what I did about them.
- Agent architecture: AI Agent Reasoning Loops, AI Agent Memory Architecture, AI Agent Tool Use, AI Agent Security, Long-Running AI Agent Runtime
- Context and retrieval: Context Engineering for AI Agents, RAG patterns
- LLM development: LLM Fine-Tuning Guide, Schema-Guided Reasoning on vLLM, LoRAX Serving Guide
- Developer tooling: Python setup, uv on macOS, MCP Server Tutorial with uv and FastMCP
Tech radar
LLM serving and fine-tuning: vLLM, LoRAX, LoRA/QLoRA, VLMs, SGR/SO
Agents: LangGraph, Claude, Google ADK, CrewAI, LlamaIndex, SmolAgents
Safety and evaluation: guardrails, automated evals, LLM-as-a-judge, observability
Vector and retrieval: Qdrant, Faiss, semantic search, hybrid retrieval, reranking, context compression
Tools and workflows: MCP (Model Context Protocol), A2A, FastMCP, n8n
MLOps: AWS (two certs), GCP/Vertex AI, Kubernetes, Kubeflow, Airflow, Ray, MLflow
Core: Python, SQL, Scala, Java, Rust, PyTorch, FastAPI, Spark, Polars
Let's connect
I am usually interested in production ML, agent systems, retrieval, evaluation, and cleanup work on pipelines that became too complicated.