about 4 hours ago
San Francisco, CA, USA
Senior
Base Salary
$295k - $445k/yr
Responsibilities
- Partner with customers to define model behaviors and diagnose issues.
- Build and scale production ML systems for customization and fine-tuning workflows.
- Investigate training workflows to ensure desired outcomes and improve performance.
- Collaborate with backend engineers to integrate ML capabilities into AWS environments.
- Propose and implement improvements to post-training systems and developer workflows.
- Work with Research and Applied teams to enhance model training and evaluation practices.
- Design systems for safe model customization for enterprise customers.
- Debug and improve complex systems involving model behavior and infrastructure.
Requirements
- Master’s or PhD in Computer Science, Machine Learning, or a related field, or equivalent experience.
- 7+ years of professional engineering experience in ML or product-driven roles.
- Strong experience in building, training, and deploying production AI systems.
- Familiarity with training large language models and post-training techniques.
- Solid software engineering fundamentals in Python, Rust, or similar languages.
- Experience with model customization, evaluation systems, and cloud infrastructure.
- Ability to collaborate across teams and operate in ambiguous environments.
- Bonus: experience with AWS, Kubernetes, and AI developer platforms.
Tech Stack
AWS, Kubernetes, Python, PyTorch, Rust, TensorFlow
Categories
AI & ML, Data Engineering