Senior Engineer, Inference Control Plane

DigitalOcean
1 day ago

Base Salary

$139k - $174k/yr

Responsibilities

  • Design and build scalable, multi-tenant services for AI inference.
  • Develop and operate high-scale distributed systems with reliability and performance goals.
  • Enhance platform resiliency through observability and operational tooling.
  • Collaborate with engineering teams to deliver production-grade systems and APIs.
  • Elevate engineering standards through strong software design and incident management.
  • Contribute to architecture decisions regarding service orchestration and reliability.
  • Participate in on-call rotations to improve service health and reduce incidents.

Requirements

  • 5+ years of experience in building and operating multi-tenant platforms or distributed backend systems.
  • Strong experience with high-scale distributed services in production environments.
  • Deep understanding of SRE principles including observability and incident management.
  • 1+ years of hands-on experience with Go/Golang in production systems.
  • 1+ years of experience with Kubernetes.
  • Strong understanding of cloud-native architectures and microservices.
  • Experience debugging performance and reliability issues in production systems.
  • Proficiency in tracking infrastructure and inference metrics.

Benefits

  • Competitive benefits including Employee Assistance Program and flexible time off.
  • Career development resources including reimbursement for conferences and access to LinkedIn Learning.
  • Equity compensation options and potential bonuses based on performance.