I design and ship production-grade AI-native infrastructure — autonomous Kubernetes platforms, agentic RAG pipelines, and multi-model orchestration systems that act with confidence and fail gracefully.
I'm a senior AI/Cloud engineer specialising in agentic systems, cloud-native infrastructure, and multi-agent orchestration. My focus is engineering software that can reason, decide, and act with minimal human intervention.
Every platform I build is infrastructure-as-code first: Terraform-managed, instrumented for observability, and documented from day one. These are deployable, maintainable systems, not proofs of concept.
My work sits at the intersection of AWS cloud architecture and modern AI frameworks: LangGraph, confidence-gated decision engines, and retrieval-augmented generation at production scale.
"Every remediation is a Git commit. The cluster never changes outside a reviewed pull request or a high-confidence auto-apply."
Autonomous Kubernetes remediation platform: intercepts anomalies, reasons through root cause via a multi-agent pipeline, and resolves incidents without human involvement unless confidence is too low to act unreviewed.
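A minimal sketch of what a confidence gate of this kind might look like. The names (`Remediation`, `gate`, `Action`) and the 0.9 threshold are hypothetical, chosen only for illustration; the platform's actual scoring and thresholds are not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_APPLY = "auto_apply"        # commit the fix and apply it directly
    OPEN_PULL_REQUEST = "open_pr"    # commit the fix and wait for human review


@dataclass(frozen=True)
class Remediation:
    incident_id: str
    patch: str          # e.g. a manifest diff proposed by the agent pipeline
    confidence: float   # 0.0 to 1.0, assumed to come from the reasoning step


# Hypothetical threshold; the real value is an assumption of this sketch.
AUTO_APPLY_THRESHOLD = 0.9


def gate(remediation: Remediation, threshold: float = AUTO_APPLY_THRESHOLD) -> Action:
    """Route a proposed fix: auto-apply when confident, otherwise ask for review."""
    if not 0.0 <= remediation.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if remediation.confidence >= threshold:
        return Action.AUTO_APPLY
    return Action.OPEN_PULL_REQUEST
```

Either branch ends in a Git commit, so the cluster state stays traceable; the gate only decides whether a human reviews the change before it lands.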
Medical Q&A system that routes queries across a PubMed/FDA corpus and live web search, with a safety guardrail and iterative relevance checking before streaming an answer token-by-token.
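The flow described above could be sketched roughly as follows. This is an illustration under assumptions, not the system's actual code: it tries sources in order (corpus first, then web search) rather than using a learned router, and every callable passed in (`is_safe`, `is_relevant`, `generate`, the retrievers) is a hypothetical stand-in for a real component.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

# A retriever takes a query and returns candidate passages.
Retriever = Callable[[str], List[str]]


def answer_stream(
    query: str,
    is_safe: Callable[[str], bool],
    retrievers: Sequence[Retriever],
    is_relevant: Callable[[str, str], bool],
    generate: Callable[[str, List[str]], Iterable[str]],
) -> Iterator[str]:
    """Yield answer tokens, or a single fallback message."""
    # Safety guardrail runs before any retrieval or generation.
    if not is_safe(query):
        yield "I can't help with that request."
        return
    # Try each source in order; keep only passages that pass the relevance check.
    for retrieve in retrievers:
        passages = [p for p in retrieve(query) if is_relevant(query, p)]
        if passages:
            # Stream tokens as the generator produces them.
            yield from generate(query, passages)
            return
    yield "No sufficiently relevant sources were found."
```

Writing the function as a generator keeps the guardrail, retrieval, and streaming steps in one place while still letting the caller forward tokens as they arrive.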
Inference gateway that abstracts OpenAI, Anthropic, and AWS Bedrock behind a single API — with intelligent routing, cost tracking, and automatic failover.
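A rough sketch of ordered failover with per-provider cost tracking, assuming each provider is wrapped in a plain callable. `Provider`, `Gateway`, the prices, and the whitespace-based token estimate are all hypothetical; real SDK clients and tokenizers would replace them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Provider:
    name: str
    cost_per_1k_tokens: float      # illustrative price, not a real rate
    call: Callable[[str], str]     # stand-in for a real SDK client call


@dataclass
class Gateway:
    providers: List[Provider]      # ordered by routing preference
    spend: Dict[str, float] = field(default_factory=dict)

    def complete(self, prompt: str) -> str:
        """Return the first successful completion, failing over in order."""
        errors = []
        for provider in self.providers:
            try:
                text = provider.call(prompt)
            except Exception as exc:   # any provider failure triggers failover
                errors.append(f"{provider.name}: {exc}")
                continue
            # Crude token estimate (whitespace split), for cost tracking only.
            tokens = len(prompt.split()) + len(text.split())
            cost = tokens / 1000 * provider.cost_per_1k_tokens
            self.spend[provider.name] = self.spend.get(provider.name, 0.0) + cost
            return text
        raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Callers see a single `complete` method regardless of which backend served the request, and `spend` accumulates estimated cost only for providers that actually returned a result.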
[email protected] · isokan.dev