Base Salary
$149k - $186k/yr
Responsibilities
- Contribute to the observability strategy and roadmap.
- Design and enhance scalable observability solutions.
- Establish best practices for monitoring and incident management.
- Support operational excellence by improving incident response processes.
- Collaborate on cross-team initiatives to improve system reliability.
- Apply automation and AI-assisted workflows to improve root cause analysis.
- Work with stakeholders to surface observability insights.
- Analyze system and user signals to detect reliability issues.
- Optimize observability platforms for performance and cost-efficiency.
- Mentor peers and raise observability standards.
Requirements
- Solid hands-on experience in observability engineering or related roles.
- Strong expertise in monitoring and observability practices, especially with Datadog.
- Experience contributing to observability initiatives across teams.
- Proficiency with Kubernetes and AWS.
- Ability to influence technical decisions and collaborate with stakeholders.
- Good understanding of distributed systems principles.
- Experience defining and implementing SLOs and alerting strategies.
- Strong software engineering fundamentals in a modern programming language.
- Experience improving systems through automation.
- Strong analytical and problem-solving skills.
- Good communication and collaboration skills.
- A sense of ownership and accountability.
Benefits
- A range of health plans, including mental health support and fitness benefits.
- Generous paid time off and sick leave.
- 401k with up to a 5% match.
- Commuter benefits and pet insurance.
- Annual bonus and long-term incentive opportunities.
