6 days ago
Beijing, China
Mid Level / Senior
Responsibilities
- Own and improve platform reliability across production systems and environments.
- Manage cloud infrastructure, deployment pipelines, and runtime environments.
- Design and improve CI/CD workflows for safe and repeatable releases.
- Build and enhance monitoring, alerting, logging, and system observability.
- Lead incident response efforts and perform structured root cause analysis.
- Improve system resilience through redundancy, failover, and recovery mechanisms.
- Work with engineering teams to reduce production risk through better practices.
- Strengthen infrastructure security, access control, and secrets management.
- Support reliability for business-critical workflows across multiple countries.
- Continuously improve operational discipline, uptime, and system stability.
Requirements
- Experience in DevOps, SRE, platform engineering, or infrastructure-focused roles.
- Strong understanding of cloud infrastructure, CI/CD pipelines, and deployment systems.
- Experience with production monitoring, alerting, and incident management practices.
- Ability to troubleshoot infrastructure and production issues calmly and methodically.
- Strong understanding of reliability engineering principles such as availability and fault tolerance.
- Experience supporting business-critical or high-availability systems.
- Strong ownership mindset during incidents and operational failures.
- Practical judgment on reliability, performance, security, and cost trade-offs.
- Comfortable working closely with engineering teams in fast-paced environments.
- Low ego, disciplined, and focused on long-term system stability.
Benefits
- Build reliable AI platform infrastructure supporting end-to-end insurance automation.
- Solve real-world reliability and scaling challenges in a high-impact engineering role.
- Work with experienced engineers across multiple countries in a global team.
- Enjoy a fully remote work environment while collaborating with Malaysia-based teams.
- Gain international exposure by building systems used across Southeast Asian markets.
- Access a learning and development budget for continuous technical growth and certifications.
- Experience a high-ownership environment with strong autonomy over infrastructure strategy.
- Be part of a modern engineering culture focused on stability, observability, and excellence.
- Receive competitive compensation based on experience and impact.
