SpaceX

Site Reliability Engineer, GNC

about 4 hours ago
Hawthorne, CA, USA
Mid Level / Senior

Base Salary

$125k - $175k/yr

Responsibilities

  • Deploy, upgrade, operate, and scale mission-critical GNC products and services.
  • Provision and maintain virtual and physical servers.
  • Monitor and maintain an HPC cluster with tens of thousands of CPUs.
  • Collaborate with GNC software engineers to create maintainable products.
  • Handle monitoring and incident response for web applications and services.
  • Manage GNC computational infrastructure with IT stakeholders.
  • Engage in the full lifecycle of services, from development to operations.
  • Make data-driven recommendations for future hardware purchases.
  • Practice sustainable incident response and postmortems.
  • Provide end-user support for GNC engineering products.
  • Configure automated deployment pipelines for web applications.
  • Develop or improve GNC web apps for usability and robustness.
  • Document new software changes and improvements.
  • Focus on performance bottlenecks and improvement techniques.

Requirements

  • Bachelor’s degree in computer science, IT, engineering, math, or a scientific discipline and 2+ years of software development experience, or 4+ years of relevant professional experience.
  • Experience with Linux operating systems.
  • Experience with Python and Python-based development frameworks.
  • 2+ years of systems administration, site reliability engineering, or DevOps experience is preferred.
  • Expertise with Docker, Vagrant, and Kubernetes or similar technologies.
  • Strong understanding of virtualization and hypervisor technologies.
  • Experience managing on-premises infrastructure and GPU fleets.
  • Ability to obtain a Top Secret clearance.

Benefits

  • Comprehensive medical, vision, and dental coverage.
  • 401(k) retirement plan with company matching.
  • Short- and long-term disability insurance and life insurance.
  • Paid parental leave and 3 weeks of paid vacation.
  • 10 or more paid holidays per year.
  • Access to an employee stock purchase plan and potential bonuses.

Tech Stack

Ansible, Bazel, Docker, Gradle, Kubernetes, Linux, Make, npm, Puppet, Python, Terraform, Vagrant

Categories

Backend, DevOps