Agoda

Lead Software Engineer, DevOps Platform (Bangkok-based, relocation provided)

about 3 hours ago
Bangkok, Thailand
Senior / Staff+

Responsibilities

  • Lead the technical vision and architecture of new SRE platforms.
  • Define and promote SRE best practices across Agoda’s services.
  • Design, build, and operate reliability platforms to enhance system resilience.
  • Own safe deployment strategies integrated with monitoring.
  • Identify and mitigate reliability and scaling risks proactively.
  • Lead major incident response and operational excellence initiatives.
  • Maintain and evolve incident and observability tooling.
  • Advance platform observability using Prometheus and Grafana.
  • Define reliability roadmaps and translate business goals into technical requirements.

Requirements

  • 8+ years of relevant experience in software engineering.
  • Demonstrated ownership of the architecture and operation of mission-critical systems.
  • Proven ability to lead complex cross-team initiatives.
  • Expertise in programming languages such as Go, Python, Rust, or Java.
  • Deep hands-on experience with the Kubernetes ecosystem.
  • Observability and monitoring expertise using Prometheus and Grafana.
  • Strong experience across the full incident management lifecycle.
  • Experience with reliability engineering patterns such as canary deployments.
  • Solid data analysis skills, including SQL and data pipelines.
  • Excellent communication and collaboration skills.

Tech Stack

Argo CD, Go, Grafana, Istio, Java, Kubernetes, Microsoft SQL Server, PostgreSQL, Prometheus, Python, Rust, SQL

Categories

DevOps