about 1 month ago
Sunnyvale, CA, USA
Mid Level / Senior
H-1B Sponsor
Base Salary
$160k - $240k/yr
Responsibilities
- Design and implement data pipelines for multimodal data ingestion and storage.
- Develop internal tooling for dataset exploration, curation, and quality monitoring.
- Build and maintain distributed training infrastructure for large-scale model training.
- Implement job orchestration workflows for model runs.
- Identify and remediate performance bottlenecks in compute and storage.
- Collaborate with teams to ensure infrastructure supports mission-critical use cases.
- Maintain observability and reliability tooling for training and inference.
Requirements
- 3+ years of experience in ML infrastructure, MLOps, or large-scale data systems.
- Proven experience with distributed training and workflow orchestration.
- Strong proficiency in Python and cloud-native infrastructure.
- Deep understanding of data engineering and ETL pipelines.
- Familiarity with containerization and monitoring systems.
- Experience optimizing GPU cluster utilization and profiling model performance.
- Bachelor’s degree or higher in a related technical field.
- Must be a U.S. Person due to access to export-controlled information.
Benefits
- Competitive compensation package including base salary and bonus.
- Meaningful equity.
- Premium medical, dental, and vision plans with $0 paycheck contribution.
- Competitive PTO and company holiday calendar.
- Unlimited AI tokens.
- Catered lunch daily and fully stocked kitchen.
- EV charging.
- Relocation assistance depending on role eligibility.
