Why I Measure My RAG (And You Should Too)
RAGAS evaluation, retrieval metrics, and why "it works on my machine" isn't enough for production LLM systems. How I built an evaluation pipeline for my CV chatbot and what the metrics taught me.
Deep dives on ML engineering, MLOps, computer vision, and the craft of building production AI systems.
Why splitting object detection and classification into separate stages gives you better accuracy, easier maintenance, and deployment flexibility. Lessons from a multi-stage CV pipeline in production.
Running MLflow, Grafana, and Docker on modest hardware. Pragmatic MLOps for solo engineers and small teams in low-resource environments — no Kubernetes required.