
Member of Technical Staff - Compute Platform
Reflection
4 months ago
London, United Kingdom +2 more
Mid Level / Senior
H1B Sponsor
Responsibilities
- Build and maintain tools for automatic remediation and capacity planning.
- Design and iterate on cluster management stacks for multi-GPU workloads.
- Implement comprehensive monitoring for cluster durability and performance.
- Prepare infrastructure for next-generation GPU deployments and larger clusters.
- Own multi-cloud storage and petabyte-scale data replication.
Requirements
- Systems-level engineering experience focused on cluster maintenance.
- Strong coding ability with a focus on systems or GPU infrastructure.
- Deep knowledge of GPU hardware and familiarity with NCCL.
- Experience with K8s-first architecture.
- Expertise in managing high-performance cloud storage across multiple data centers.
Benefits
- Top-tier compensation including salary and equity.
- Comprehensive medical, dental, vision, life, and disability insurance.
- Fully paid parental leave for all new parents.
- Paid time off and relocation support.
- Meals provided daily and opportunities for team connection.