Building AI Agents You Can Trust & Evaluate.
Agentic RAG + Evaluation-Driven Development with Toni Ramchandani
Hosted by
Toni Ramchandani is an AI engineering practitioner focused on building practical, reliable AI systems. The focus is not hype. The focus is clean architecture, working code, measurable behavior, and systems that can improve over time.
Free event · Registration required
Event Description & Syllabus
Most AI workshops end when the demo works.
This series starts there.
Join Toni Ramchandani for a hands-on live engineering series where we build an Agentic RAG support assistant from scratch, then evolve it into a system we can observe, test, evaluate, and improve.
In Session 1, we build the baseline agent: knowledge-base RAG, deterministic tool calling, order lookup, ticket creation, OpenAI/Ollama support, ChromaDB local vector search, and multi-turn conversation flow. That is the full Day 1 build promise.
The goal is simple:
Build the agent first. Then prove it works.
Live Stream Details
Platform: Live on YouTube
Format: Interactive code-along
Date: Saturday, June 13, 2026
Time: 10:00 AM IST onwards
Stack: OpenAI, Ollama, ChromaDB, Python, Agentic RAG patterns
What We Are Building in Session 1
By the end of Day 1, attendees will understand and build:
- A working Agentic RAG support assistant
- Knowledge-base retrieval over support documents
- Deterministic tool calling for order lookup
- Support ticket creation workflow
- OpenAI and Ollama provider support
- Local vector search with ChromaDB
- Multi-turn conversation handling
- A clean baseline for future tracing and evaluations
This is not a chatbot demo. This is the first step in building a reliable AI engineering workflow.
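To make "deterministic tool calling" concrete ahead of the session, here is a minimal sketch in plain Python. It is not the workshop's code: the `ORDERS` data, the `TicketStore` class, the `route` function, and the `ORD-`/`TCK-` ID formats are all hypothetical names chosen for illustration. The idea it shows is that tool selection follows fixed rules, so the same message always takes the same path.

```python
import re
from dataclasses import dataclass, field

# Hypothetical in-memory order data; the real order store is an assumption.
ORDERS = {"ORD-1001": "shipped", "ORD-1002": "processing"}

# Hypothetical order ID format, used only for this sketch.
ORDER_ID = re.compile(r"\bORD-\d{4}\b")


@dataclass
class TicketStore:
    """Hypothetical ticket store that hands out sequential ticket IDs."""
    tickets: list = field(default_factory=list)

    def create(self, summary: str) -> str:
        ticket_id = f"TCK-{len(self.tickets) + 1:04d}"
        self.tickets.append({"id": ticket_id, "summary": summary})
        return ticket_id


def route(message: str, store: TicketStore) -> dict:
    """Choose a tool with plain rules instead of letting a model decide."""
    match = ORDER_ID.search(message.upper())
    if match:
        order_id = match.group(0)
        status = ORDERS.get(order_id)
        if status is None:
            return {"tool": "order_lookup", "result": f"No order found for {order_id}"}
        return {"tool": "order_lookup", "result": f"{order_id} is {status}"}
    if "ticket" in message.lower():
        return {"tool": "create_ticket", "result": store.create(message)}
    # Everything else would fall through to knowledge-base retrieval (not shown).
    return {"tool": "knowledge_base", "result": None}
```

Calling `route("Where is ord-1001?", TicketStore())` selects `order_lookup`, while a general policy question falls through to `knowledge_base`. Because the routing is rule-based, it can be tested with ordinary assertions before any model is involved.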
Why This Series Is Different
Most AI tutorials show a working answer.
This series focuses on the full engineering lifecycle:
Can we build it?
Can we observe it?
Can we evaluate it?
Can we prevent regressions?
Can we trust it in production?
Across the series, we will move from a baseline Agentic RAG assistant to traces, failure analysis, evaluation datasets, deterministic checks, LLM-as-a-judge scoring, regression suites, CI gates, monitoring, and feedback loops.
The Series Roadmap
1. Build
Create the baseline Agentic RAG support agent.
2. Observe
Add traces to inspect retrieval, tool calls, prompts, responses, and decision paths.
3. Analyze
Identify failure modes: wrong retrieval, wrong tool, missing clarification, hallucinated action, poor synthesis.
4. Measure
Create datasets and evaluators to test agent behavior repeatedly.
5. Judge
Use LLM-as-a-judge for answer quality, groundedness, completeness, and policy alignment.
6. Gate
Add regression tests and CI release gates so bad prompts or model changes do not silently ship.
7. Monitor
Track real production behavior and turn failures into future test cases.
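The Measure and Gate steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the series' actual evaluation harness: `DATASET`, `stub_agent`, `tool_accuracy`, `release_gate`, and the 0.9 threshold are all hypothetical. It shows a small dataset of expected tool choices, a deterministic check that scores an agent against it, and a pass/fail gate a CI job could act on.

```python
from typing import Callable

# Hypothetical evaluation cases: each pairs an input with the tool we expect.
DATASET = [
    {"input": "Where is ORD-1001?", "expected_tool": "order_lookup"},
    {"input": "Open a ticket, my charger is broken", "expected_tool": "create_ticket"},
    {"input": "What is the refund policy?", "expected_tool": "knowledge_base"},
]


def stub_agent(message: str) -> str:
    """Stand-in for a real agent; returns only the name of the chosen tool."""
    text = message.lower()
    if "ord-" in text:
        return "order_lookup"
    if "ticket" in text:
        return "create_ticket"
    return "knowledge_base"


def tool_accuracy(agent: Callable[[str], str], dataset: list[dict]) -> float:
    """Deterministic check: fraction of cases where the agent picked the expected tool."""
    hits = sum(agent(case["input"]) == case["expected_tool"] for case in dataset)
    return hits / len(dataset)


def release_gate(score: float, threshold: float = 0.9) -> bool:
    """Return True when the score meets the (assumed) release threshold."""
    return score >= threshold


if __name__ == "__main__":
    score = tool_accuracy(stub_agent, DATASET)
    print(f"tool accuracy: {score:.2f}, gate passed: {release_gate(score)}")
```

In a CI setup, a failing gate would typically exit non-zero so that a prompt or model change that lowers the score cannot ship silently. Failures found in production can be appended to the dataset as new cases, which is the feedback loop described in step 7.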
Who Is This Series For?
- AI engineers
- Software engineers
- Solution architects
- Platform engineers
- SDETs and QA engineers
- Engineering managers
- AI enthusiasts moving beyond prompt engineering
Basic Python familiarity is helpful.
No prior Agentic RAG experience is required.
No shortcuts. No hand-waving. Just clean code and real architecture. Join us live, ask your questions, and let's code together!
What to expect
- Live Q&A with the instructor
- Practical examples from real-world QA workflows
- Recording available to registered attendees
- Certificate of participation
Register for this event
Free · Online registration
qabash.com/events/building-ai-agents-you-can-trust-evaluate