about 2 months ago
Base Salary
$150k - $300k/yr
Responsibilities
- Design, build, and maintain highly available infrastructure for AI/ML workloads.
- Implement monitoring, alerting, and observability to track system health.
- Debug, optimize, and automate infrastructure for rapid deployment cycles.
- Proactively identify and resolve incidents to minimize downtime.
- Collaborate with engineers and founders to shape product and infrastructure strategies.
Requirements
- 5+ years of hands-on experience in production-grade infrastructure.
- Proficiency in Python or a similar language, and experience with cloud platforms.
- Experience with container orchestration, networking, and storage technologies.
- Ability to build tools for diagnosing and addressing reliability issues.
- A quantitative, hands-on approach to system operations and automation.
Benefits
- Unlimited PTO for recharging.
- Free daily lunch with teammates.
- Reimbursed transportation costs.
- Generous health insurance covering medical, dental, and vision.
- Health and wellness budget of up to $150/month.
- Flexible parental leave schedule.
