Site Reliability Engineer — HPC & Automation (Silicon Engineering)

about 4 hours ago

Redmond, WA, USA

Mid Level / Senior

H1B Sponsor

Base Salary

$125k - $150k/yr

Responsibilities

Deploy, upgrade, operate, maintain, and scale clusters and services.
Collaborate with engineers to develop automated solutions for silicon simulation workflows.
Manage infrastructure as code and use observability tools to monitor cluster health.
Operate continuous integration pipelines and version control systems.
Identify and eliminate performance bottlenecks through measurement and engineering.

Requirements

Bachelor’s degree in computer science, information systems, or an engineering discipline, or 2+ years of relevant experience.
1+ years of development experience with Bash, Python, or other programming languages.
1+ years of experience with Linux operating systems.
Familiarity with container technologies such as Docker and Kubernetes is preferred.
Knowledge of computer system concepts and experience with databases are a plus.

Benefits

Comprehensive medical, vision, and dental coverage.
401(k) retirement plan with company matching.
Paid parental leave and short/long-term disability insurance.
Three weeks of paid vacation and ten or more paid holidays per year.
Company shuttles for commuting to the SpaceX Redmond office.

Tech Stack

Ansible, Bamboo, Bash, Docker, Grafana, Jenkins, Kubernetes, Linux, MySQL, PostgreSQL, Prometheus, Puppet, Python, SQLite, Terraform

Categories

AI & ML, Data Engineering, DevOps