I build systems that don't break at 3am. Specializing in reliability engineering, intelligent automation, and scalable infrastructure at Colt Technology Services.
I'm a Site Reliability Engineer with 3+ years of experience operating and improving production systems at scale. My journey started with curiosity about how complex systems stay alive — and evolved into a career built around making sure they do.
At Colt Technology Services, I've owned end-to-end service lifecycles, led infrastructure migrations during acquisitions with zero downtime, and built monitoring strategies that caught incidents before users noticed them.
Recently I've been exploring the intersection of AI and SRE workflows — building tools that use LLMs to accelerate root cause analysis and reduce the cognitive load on engineers during production incidents.
Real tools solving real problems — from AI-powered incident response to intelligent job outreach automation. Each project reflects an SRE mindset: reliability, observability, and automation at the core.
Production incidents cost time. Every minute an engineer spends manually grepping through logs is a minute users are impacted. This tool feeds raw logs into a free LLM and automatically outputs a structured Root Cause Analysis report, turning what used to take 30+ minutes of manual investigation into near-instant insight.
Most CI/CD pipelines treat reliability as an afterthought. This one doesn't. It is built on SRE principles from day one: every stage has quality gates, security scans, and automated validation before a single byte reaches production. It simulates a real enterprise workflow end to end.
Built out of necessity: most cold outreach tools send emails blindly and flood your inbox with bounce notices. This tool applies SRE thinking to job outreach: verify before you act, detect failures in real time, log everything. It comes with a live web dashboard, SMTP verification, IMAP bounce detection, and persistent reporting.
Whether it's a challenging SRE role, an interesting infrastructure problem, or just a conversation about reliability engineering — I'd love to connect.
Based in Gurugram, India. Open to remote, hybrid, or relocation for the right role. Currently focused on SRE, DevOps, and Platform Engineering positions where reliability and automation truly matter.