OpenAI

Software Engineer - Power Management, Hardware Health

OpenAI

Apply
over 1 year ago
San Francisco, CA, USA
Senior / Staff+

Base Salary

$295k - $445k/yr

Responsibilities

  • Develop and implement system-level and software-level solutions to optimize power usage in large-scale supercomputers.
  • Build automation to monitor power consumption patterns during training workloads.
  • Design algorithms to stabilize power fluctuations and prevent grid reliability issues.
  • Collaborate with researchers and engineers to create tools for real-time monitoring and fault detection.
  • Drive the development of power throttling mechanisms based on workload demands.
  • Work with hardware design teams to integrate power control requirements into IT hardware design.

Requirements

  • 7+ years of software engineering experience focused on large-scale, system-level challenges.
  • Strong proficiency in Python and familiarity with automation and scripting tools.
  • Experience with distributed systems for aggregating and analyzing streaming data.
  • Knowledge of electrical engineering concepts including digital signal processing and power systems.
  • Experience in system-level investigations and automated solutions for power management.
  • Strong analytical skills with experience in data analysis tools like SQL and Pandas.

Tech Stack

PandasPythonSQL

Categories

AI & MLBackendData Engineering