Responsibilities
- Own the full lifecycle of a platform service from design to production deployment.
- Build reliable backend services in Go with a focus on correctness.
- Design and implement integrations with SaaS APIs, managing rate limits and failure recovery.
- Collaborate with engineers on cross-service interfaces and data models.
- Write high-quality code that sets standards for the engineering team.
- Participate in design reviews and contribute to architectural decisions.
Requirements
- 8–12 years of DevOps/SRE experience, with at least 3 years managing large-scale infrastructure.
- Experience with Kubernetes operations and resource management.
- Proficiency in Infrastructure as Code using Terraform or Pulumi.
- Experience owning CI/CD pipelines built with tools such as GitHub Actions or Jenkins.
- Experience with observability stacks and defining SLOs/SLAs.
- Strong incident management skills in a high-alert environment.
- Knowledge of networking fundamentals, including DNS and load balancing.
- Security and compliance mindset, especially in data handling.
- Proficiency in Go or Python, plus shell scripting.
- Deep understanding of Linux systems and performance tuning.
Benefits
- High ownership over a real production service from day one.
- Opportunity to work on a greenfield platform with strong technical leadership.
- Small team environment where contributions are visible and impactful.
- Competitive salary and benefits.
Tech Stack
AWS, Azure, Datadog, GitHub Actions, Go, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Prometheus, Python, Terraform, Vault