5 days ago
Palo Alto, CA, USA
Senior
Base Salary
$116k - $258k/yr
Responsibilities
- Build and own the runtime environment for 100+ specialized AI services.
- Design and implement SageMaker Multi-Model Endpoints and Inference Components.
- Build deterministic shells around probabilistic LM outputs for reliability.
- Implement automated benchmarking to detect semantic drift and hallucinations.
- Create reusable patterns and Terraform-based infrastructure for deployment.
- Collaborate with AI Researchers to optimize agentic autonomy.
Requirements
- 5+ years in SRE, Platform Engineering, or MLOps, including 2 years running LLMs/SLMs in production.
- Deep expertise with AWS SageMaker, especially Multi-Model Endpoints.
- Experience with Small Language Models and parameter-efficient fine-tuning strategies.
- Strong proficiency in Python and Terraform.
- Experience with Docker, Kubernetes, or AWS ECS/Fargate.
- Familiarity with Snowflake and Vector Databases.
- Understanding of AI at scale as a statistical challenge.
- Experience building CI/CD pipelines for non-deterministic software.
- BS or MS in Computer Science, Engineering, Mathematics, or related field.