Monitoring & Observability Stack

Architecting for Insight and Reliability

I specialize in building and operating the observability platforms that engineering teams rely on to understand system behavior and keep services reliable. My approach focuses on scalable, cost-effective solutions built on proven open-source technology.

At Carousell Group, I led the migration from a costly SaaS monitoring solution to a more powerful and flexible self-hosted stack. I architected and deployed a new observability platform built on Prometheus for metrics collection, VictoriaMetrics for time-series storage, and Grafana for visualization. For logging, I implemented a centralized ELK Stack (Elasticsearch, Logstash, Kibana).
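One reason these components fit together well is that VictoriaMetrics exposes a Prometheus-compatible query API, so dashboards and scripts can query it with PromQL. The snippet below is a minimal sketch of building such a query URL. It is illustrative only and not taken from the Carousell setup: the host name and the `instant_query_url` helper are hypothetical, and port 8428 assumes a single-node VictoriaMetrics with its default settings.

```python
from urllib.parse import urlencode

# Hypothetical address; 8428 is the default HTTP port of single-node VictoriaMetrics.
VM_BASE_URL = "http://victoriametrics.example.internal:8428"


def instant_query_url(promql: str, base_url: str = VM_BASE_URL) -> str:
    """Build a URL for the Prometheus-compatible instant query endpoint."""
    # urlencode percent-escapes the PromQL expression so it is safe in a URL.
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


if __name__ == "__main__":
    # Example PromQL: per-second request rate over the last five minutes.
    print(instant_query_url("sum(rate(http_requests_total[5m]))"))
```

The metric name `http_requests_total` is a placeholder; any PromQL expression can be passed in the same way.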

To ensure the reliability of the monitoring platform itself, I then migrated the entire stack onto a Kubernetes cluster. This provided high availability and simplified management while reducing costs.

Key Competencies

  • Observability Architecture: Designing and building scalable monitoring stacks from the ground up.
  • Tooling Expertise: Deep, hands-on knowledge of Prometheus, VictoriaMetrics, Grafana, and the ELK Stack.
  • High Availability: Deploying and managing observability platforms on Kubernetes for resilience.
  • Metrics & Alerting: Defining key service-level indicators (SLIs) and configuring actionable alerting.
  • Cost Optimization: Strategically reducing observability costs while improving capabilities.
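
To make the Metrics & Alerting item concrete, here is a minimal sketch of how an availability SLI and an error-budget burn rate can drive an actionable alert. It is a generic illustration under stated assumptions, not the rules I ran in production: the `RequestWindow` type, the function names, and the sample numbers are hypothetical, and the 14.4 threshold is the commonly cited fast-burn value from the Google SRE Workbook for a 30-day SLO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestWindow:
    """Request counts observed over one time window (hypothetical type)."""
    total: int
    failed: int


def availability_sli(window: RequestWindow) -> float:
    """Fraction of successful requests; an empty window counts as healthy."""
    if window.total == 0:
        return 1.0
    return (window.total - window.failed) / window.total


def burn_rate(window: RequestWindow, slo_target: float) -> float:
    """How fast the error budget is being spent (1.0 means exactly on budget)."""
    error_budget = 1.0 - slo_target
    if error_budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return (1.0 - availability_sli(window)) / error_budget


def should_page(short: RequestWindow, long: RequestWindow,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Page only when both the short and the long window burn too fast.

    Requiring both windows filters out brief spikes (long window stays low)
    and stops paging soon after recovery (short window drops).
    """
    return (burn_rate(short, slo_target) >= threshold
            and burn_rate(long, slo_target) >= threshold)


if __name__ == "__main__":
    five_min = RequestWindow(total=1_000, failed=200)    # sample data
    one_hour = RequestWindow(total=10_000, failed=1_500)  # sample data
    print(should_page(five_min, one_hour, slo_target=0.99))
```

In a Prometheus-based stack the same logic would normally live in alerting rules rather than application code; the Python form is used here only to show the arithmetic.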