SpaceX

Site Reliability Engineer — HPC & Automation (Silicon Engineering)

about 4 hours ago
Redmond, WA, USA
Mid Level / Senior
H1B Sponsor

Base Salary

$125k - $150k/yr

Responsibilities

  • Deploy, upgrade, operate, maintain, and scale clusters and services.
  • Collaborate with engineers to develop automated solutions for silicon simulation workflows.
  • Manage infrastructure as code and utilize observability tools for cluster health.
  • Operate continuous integration pipelines and version control systems.
  • Identify and eliminate performance bottlenecks through measurement and engineering.

Requirements

  • Bachelor’s degree in computer science, information systems, or an engineering discipline, or 2+ years of relevant experience.
  • 1+ years of development experience with Bash, Python, or other programming languages.
  • 1+ years of experience with Linux operating systems.
  • Familiarity with containerization and orchestration technologies such as Docker and Kubernetes is preferred.
  • Knowledge of computer system concepts and experience with databases is a plus.

Benefits

  • Comprehensive medical, vision, and dental coverage.
  • 401(k) retirement plan with company matching.
  • Paid parental leave and short/long-term disability insurance.
  • Three weeks of paid vacation and ten or more paid holidays per year.
  • Company shuttles for commuting to the SpaceX Redmond office.

Tech Stack

Ansible, Bamboo, Bash, Docker, Grafana, Jenkins, Kubernetes, Linux, MySQL, PostgreSQL, Prometheus, Puppet, Python, SQLite, Terraform

Categories

AI & ML, Data Engineering, DevOps