
Lead DevOps Engineer (Bangkok-based, relocation provided)
Agoda
about 1 month ago
Bangkok, Thailand
Mid Level / Senior / Staff+
Responsibilities
- Lead the technical vision and execution of new SRE platforms.
- Define and promote SRE best practices across Agoda’s services.
- Design, build, and operate reliability platforms.
- Own safe deployment strategies integrated with monitoring.
- Identify and mitigate reliability and scaling risks.
- Improve system resilience by partnering with platform and operations teams.
- Lead major incident response and operational excellence.
- Maintain and evolve incident and observability tooling.
- Advance platform observability and reliability signals.
- Define reliability roadmaps and OKRs.
Requirements
- Demonstrated ownership of the architecture and operation of mission-critical production systems.
- Proven ability to lead complex cross-team initiatives.
- Expertise in programming languages such as Go, Python, Rust, or Java.
- Deep hands-on experience with the Kubernetes ecosystem.
- Observability and monitoring expertise using Prometheus and Grafana.
- Strong incident management lifecycle experience.
- Experience with reliability engineering patterns.
- Solid data analysis skills, including SQL.
- Data-driven mindset for analyzing complex problems.
- Excellent communication and collaboration skills.
Tech Stack
Argo CD, Go, Grafana, Istio, Java, Kubernetes, Microsoft SQL Server, PostgreSQL, Prometheus, Python, Rust, SQL
Categories
DevOps