Responsibilities
- Define the end-to-end architecture for large-scale, clustered AI environments.
- Influence the strategy for AI workload scheduling and orchestration.
- Profile and eliminate system-level bottlenecks across the AI pipeline.
- Work closely with software, firmware, and OS engineering to influence platform design.
- Drive the 3-to-5-year technical vision for the AI platform.
Requirements
- Demonstrated ability in systems engineering, cloud architecture, or HPC, with at least 4 years as a Lead or Principal Architect.
- Deep practical knowledge of large model training and deployment.
- Authoritative understanding of system-level bottlenecks and data pathways.
- Experience with container orchestration platforms and infrastructure-as-code.
- Exceptional ability to bridge the gap between AI researchers and hardware engineers.