Bjak

Site Reliability Engineer - Insurance Platform (Remote, China)

6 days ago
Beijing, China
Mid Level / Senior

Responsibilities

  • Own reliability and operational stability of BJAK’s production systems.
  • Design and improve monitoring, alerting, logging and observability across services.
  • Lead incident response, troubleshooting and structured root cause analysis.
  • Improve system resilience through redundancy, failover and recovery strategies.
  • Work with engineers to design systems that are reliable, scalable and operable in production.
  • Improve deployment safety through CI/CD pipelines, release strategies and automation.
  • Reduce recurring incidents by identifying root causes and driving long-term fixes.
  • Manage and optimize cloud infrastructure supporting business-critical workflows.
  • Strengthen operational practices including on-call processes, incident playbooks and SLAs.
  • Continuously improve system uptime, performance and operational maturity.

Requirements

  • Experience in Site Reliability Engineering, DevOps, platform engineering or infrastructure roles.
  • Strong understanding of distributed systems, cloud infrastructure and production operations.
  • Experience with monitoring, alerting and observability tools.
  • Strong troubleshooting skills for production incidents and system failures.
  • Ability to design for reliability, scalability and fault tolerance.
  • Experience working with CI/CD pipelines and deployment automation.
  • Strong understanding of system performance, capacity planning and risk management.
  • Hands-on ownership mindset during incidents and operational issues.
  • Calm, structured and disciplined approach to production environments.
  • Strong collaboration with engineering teams in fast-paced environments.

Benefits

  • Support mission-critical automation at scale.
  • Solve real-world reliability and distributed systems challenges.
  • Work with experienced engineers across multiple countries.
  • Fully remote position, working in collaboration with Malaysia-based teams.
  • Build systems used across Southeast Asian markets.
  • Learning budget to support continuous technical growth and certifications.
  • Strong autonomy over reliability and operational design.
  • Focus on stability, observability and engineering excellence.
  • Attractive salary package based on experience and impact.
