Bjak

DevOps Engineer - Platform Reliability (Remote, China)

6 days ago
Beijing, China
Mid Level / Senior

Responsibilities

  • Own and improve platform reliability across production systems and environments.
  • Manage cloud infrastructure, deployment pipelines, and runtime environments.
  • Design and improve CI/CD workflows for safe and repeatable releases.
  • Build and enhance monitoring, alerting, logging, and system observability.
  • Lead incident response efforts and perform structured root cause analysis.
  • Improve system resilience through redundancy, failover, and recovery mechanisms.
  • Work with engineering teams to reduce production risk through better practices.
  • Strengthen infrastructure security, access control, and secrets management.
  • Support reliability for business-critical workflows across multiple countries.
  • Continuously improve operational discipline, uptime, and system stability.

Requirements

  • Experience in DevOps, SRE, platform engineering, or infrastructure-focused roles.
  • Strong understanding of cloud infrastructure, CI/CD pipelines, and deployment systems.
  • Experience with production monitoring, alerting, and incident management practices.
  • Ability to troubleshoot infrastructure and production issues calmly and methodically.
  • Strong understanding of reliability engineering principles such as availability and fault tolerance.
  • Experience supporting business-critical or high-availability systems.
  • Strong ownership mindset during incidents and operational failures.
  • Practical judgment on reliability, performance, security, and cost trade-offs.
  • Comfortable working closely with engineering teams in fast-paced environments.
  • Low ego, disciplined, and focused on long-term system stability.

Benefits

  • Build reliable AI platform infrastructure supporting end-to-end insurance automation.
  • Solve real-world reliability and scaling challenges in a high-impact engineering role.
  • Work with experienced engineers across multiple countries in a global team.
  • Enjoy a fully remote work environment while collaborating with Malaysia-based teams.
  • Gain international exposure by building systems used across Southeast Asian markets.
  • Access a learning and development budget for continuous technical growth and certifications.
  • Experience a high-ownership environment with strong autonomy over infrastructure strategy.
  • Be part of a modern engineering culture focused on stability, observability, and excellence.
  • Receive competitive compensation based on experience and impact.
