Principal Software Engineer, Site Reliability
Upstart
14 days ago
Remote, Worldwide
Staff+
H-1B Sponsor
Base Salary
$195k - $270k/yr
Responsibilities
- Lead the definition and adoption of SRE principles across engineering teams.
- Partner with leadership to shape long-term reliability and observability strategies.
- Champion distributed tracing and key performance metrics to improve system visibility.
- Build and scale self-healing systems to minimize manual intervention.
- Drive improvements to incident response processes, including those for machine learning systems.
- Collaborate with Development Productivity and Quality teams to enhance engineering velocity.
- Influence technical roadmaps through data-driven insights and contributions.
- Own and deliver cross-functional initiatives from concept through execution.
Requirements
- 10+ years of experience in Software Engineering and Site Reliability Engineering.
- Proven track record as an SRE thought leader and evangelist.
- Strong communication and mentoring skills.
- Proficiency in Python, Go, and JavaScript/TypeScript.
- Experience with Infrastructure as Code tools like Terraform and CloudFormation.
- Expertise in observability and performance monitoring tools.
- Experience with on-call and incident management.
- Strong background in automation and building self-healing systems.
- Hands-on experience applying LLM/GenAI tools to improve SRE processes.
- Program management skills to drive cross-functional projects.
Benefits
- Competitive compensation with base pay, bonuses, and equity grants.
- Generous 401(k) plan with matching contributions.
- Employee Stock Purchase Plan with discounted stock options.
- Affordable medical, dental, and vision coverage with high employer contribution.
- Paid time off, sick leave, and company holidays.
- Paid family and parental leave.
- Employee Assistance Program offering mental health support.
- Annual wellness and productivity allowances.
- Connection through team events and community initiatives.
Tech Stack
Datadog, Go, JavaScript, Prometheus, Python, Terraform, TypeScript
Categories
AI & ML, Data Engineering, DevOps, Full Stack