Tenstorrent

Staff Engineer, HPC Systems Software

3 months ago
Austin, TX, USA or Santa Clara, CA, USA
Staff+
H1B Sponsor

Base Salary

$100k - $500k/yr

Responsibilities

  • Design and maintain automated OS deployment pipelines for bare-metal HPC clusters globally.
  • Manage large-scale configuration management using Ansible.
  • Deploy RHEL and Ubuntu systems and manage their lifecycle across diverse hardware platforms.
  • Implement infrastructure-as-code for repeatable, version-controlled system configurations.
  • Troubleshoot OS-level issues and optimize system performance.
  • Collaborate with hardware design teams to standardize system configurations.
  • Build automation and tooling to streamline provisioning and system updates.

Requirements

  • Experienced in RHEL and Ubuntu administration in HPC or large-scale compute environments.
  • Highly skilled in Ansible for automation and configuration management.
  • Proficient with bare-metal provisioning systems such as MAAS or Foreman.
  • Deep understanding of Linux system internals and performance troubleshooting.
  • Familiar with HPC cluster architecture and infrastructure-as-code practices.
  • Capable of diagnosing complex infrastructure issues independently.

Tech Stack

Ansible, Bash, Docker, Grafana, Linux, Prometheus, Python

Categories

AI & ML, Data Engineering, DevOps, Security