Monitoring & Observability Stack
Architecting for Insight and Reliability
I specialize in building and operating the observability platforms that let engineering teams understand system behavior and keep services reliable. My approach focuses on scalable, cost-effective solutions built on proven open-source tools.
At Carousell Group, I led the migration from a costly SaaS monitoring solution to a more powerful and flexible self-hosted stack. I architected and deployed a new observability platform built on Prometheus for metrics collection, VictoriaMetrics for time-series storage, and Grafana for visualization. For logging, I implemented a centralized ELK Stack (Elasticsearch, Logstash, and Kibana).
To keep the monitoring platform itself reliable, I then moved the entire stack onto a Kubernetes cluster. This provided high availability and simpler management while optimizing costs.
Key Competencies
- Observability Architecture: Designing and building scalable monitoring stacks from the ground up.
- Tooling Expertise: Deep, hands-on knowledge of Prometheus, VictoriaMetrics, Grafana, and the ELK Stack.
- High Availability: Deploying and managing observability platforms on Kubernetes for resilience.
- Metrics & Alerting: Defining key service-level indicators (SLIs) and configuring actionable alerting.
- Cost Optimization: Strategically reducing observability costs while improving capabilities.
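
To illustrate the Metrics & Alerting item above, here is a minimal Python sketch of an availability SLI and a multi-window burn-rate check. The function names, the event counts, the 99.9% target, and the 14.4 paging threshold are illustrative assumptions, not values taken from the Carousell deployment.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that were good; treated as 1.0 when there was no traffic."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def burn_rate(sli: float, slo_target: float) -> float:
    """Speed at which the error budget is consumed (1.0 means exactly on budget)."""
    error_budget = 1.0 - slo_target
    if error_budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return (1.0 - sli) / error_budget


def should_page(short_window_burn: float, long_window_burn: float,
                threshold: float = 14.4) -> bool:
    """Page only when both windows burn fast, which filters out brief spikes."""
    return short_window_burn >= threshold and long_window_burn >= threshold


if __name__ == "__main__":
    # Hypothetical numbers: 9,800 good requests out of 10,000 against a 99.9% SLO.
    sli = availability_sli(9_800, 10_000)
    rate = burn_rate(sli, slo_target=0.999)
    print(f"SLI={sli:.3f} burn_rate={rate:.1f} page={should_page(rate, rate)}")
```

Requiring both a short and a long window to exceed the threshold is one common way to keep alerts actionable: the long window confirms the problem is sustained, and the short window lets the alert clear quickly once the service recovers.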