About me

I'm Slava Dubrov, also known as Viacheslav Dubrov. I build production ML and AI systems, and I lead teams that do the same.

These days I work on the Agent Execution team at HubSpot: LLM deployment, fine-tuning, evaluation, and the runtime pieces that make agents behave in production. Before that, I worked on HubSpot's retrieval, grounding, and memory infrastructure, so I have opinions about what happens after the demo works.

Why read this blog

I write the notes I wish I had when I was debugging production AI systems. A lot of AI work looks clean in a notebook and gets messy when real users, latency, permissions, data drift, and cost enter the picture. This blog focuses on that version.

Relevant background:

HubSpot Agent Execution: LLM fine-tuning, inference optimization, agent evaluation, and safety guardrails in production.
HubSpot Embedding Hub and Context Layer: retrieval, grounding, and memory infrastructure for AI agents.
Wayfair: fraud and scam detection systems, plus embedding systems I built and led, with about $4M in annual savings.
Speaker at World Agentic AI Summit Berlin 2026: "Engineering the Agentic Stack".
PhD in AI diagnostics, peer-reviewed papers, and patents.
Work across data pipelines, training, evaluation, deployment, and the operational details between them.
Production ML on AWS and GCP across batch, streaming, and real-time systems.
Open source code, tutorials, and write-ups for people building AI systems that have to run.

Speaking

"Engineering the Agentic Stack" - World Agentic AI Summit, Berlin (2026). Production architecture for agentic AI systems: Cognitive Engine, Cortex (memory architecture), and Schema-Guided Reasoning.

What I write about

Mostly production failures and what I did about them.

Agent architecture: AI Agent Reasoning Loops, AI Agent Memory Architecture, AI Agent Tool Use, AI Agent Security, Long-Running AI Agent Runtime
Context and retrieval: Context Engineering for AI Agents, RAG patterns
LLM development: LLM Fine-Tuning Guide, Schema-Guided Reasoning on vLLM, LoRAX Serving Guide
Developer tooling: Python setup, uv on macOS, MCP Server Tutorial with uv and FastMCP

Tech radar

LLM serving and fine-tuning: vLLM, LoRAX, LoRA/QLoRA, VLMs, SGR/SO

Agents: LangGraph, Claude, Google ADK, CrewAI, LlamaIndex, SmolAgents

Safety and evaluation: guardrails, automated evals, LLM-as-a-judge, observability

Vector and retrieval: Qdrant, Faiss, semantic search, hybrid retrieval, reranking, context compression

Tools and workflows: MCP (Model Context Protocol), A2A, FastMCP, n8n

MLOps: AWS (two certs), GCP/Vertex AI, Kubernetes, Kubeflow, Airflow, Ray, MLflow

Core: Python, SQL, Scala, Java, Rust, PyTorch, FastAPI, Spark, Polars

Let's connect

I am usually interested in production ML, agent systems, retrieval, evaluation, and cleanup work on pipelines that have become too complicated.