Senior Site Reliability Engineer
Chainalysis
Responsibilities
- Own reliability as a product capability by defining SLOs and incident response practices.
- Build and improve the platform foundations that keep engineering workflows safe and fast.
- Design and operate resilient systems for real-time ingestion and detection.
- Lead improvements in scalability, performance, and operational maturity.
- Drive modernization of the infrastructure stack, including Kubernetes and infrastructure as code.
- Enhance developer experience through internal tooling and operational standards.
- Collaborate with backend and platform engineers to reduce toil and improve reliability.
- Participate in incident management and root cause analysis.
Requirements
- 5+ years of experience in SRE, infrastructure engineering, or platform engineering.
- Strong experience with cloud-native systems in production, especially in demanding environments.
- Deep practical experience with Kubernetes, AWS, and infrastructure-as-code tools.
- Strong understanding of observability, monitoring, and performance tuning for distributed systems.
- Experience with CI/CD systems and operational automation.
- Solid coding ability in Python, Go, or Rust.
- Sound judgment around reliability trade-offs and safe delivery practices.
- Collaborative mindset with a focus on enabling product teams and mentoring engineers.
Benefits
- Opportunity to work on real problems with significant impact.
- Small team environment with high ownership and close customer interaction.
- Startup pace with the backing of a leading blockchain intelligence company.
- Commitment to diversity and inclusion in the workplace.