about 3 hours ago
Base Salary
$147k - $202k/yr
Responsibilities
- Champion observability best practices and educate engineering teams.
- Run services in production environments effectively.
- Design services for high growth and availability.
- Provision and monitor cloud-native infrastructure.
- Build and maintain scalable observability infrastructure using Terraform.
- Troubleshoot performance and operational issues.
- Automate operational tasks and improve scripts.
- Participate in major incident response and identify telemetry gaps.
- Drive cross-team initiatives to enhance instrumentation standards.
Requirements
- 5+ years of platform engineering, SRE, or DevOps experience.
- Experience with cloud infrastructure like AWS, Google Cloud, or Azure.
- Expertise in the Datadog ecosystem, including alerting standards and Terraform management.
- Strong coding skills in Node.js or Golang.
- Experience with containerization and orchestration tools like Docker and Kubernetes.
- Data-driven approach to debugging performance bottlenecks.
- Deep understanding of microservice architecture and best practices.
- Experience in coaching and mentoring junior engineers.
- Proven ability to lead cross-functional technical initiatives.
Benefits
- Health, dental, and vision insurance.
- 401(k) plan.
- Flexible spending account.
- Paid leave including PTO and parental leave.