6 months ago
Singapore, Singapore
Senior
Responsibilities
- Design, build, and maintain the end-to-end MLOps platform using Kubernetes and cloud services.
- Use Terraform or similar tools to manage and scale ML-related infrastructure securely.
- Implement and optimize CI/CD pipelines for model training, testing, packaging, and deployment.
- Build highly available, low-latency model serving infrastructure.
- Implement monitoring, alerting, and logging solutions for infrastructure health and model performance.
- Evaluate and support ML tools such as feature stores and distributed model training pipelines.
- Ensure platform security and manage secrets for sensitive data.
Requirements
- 5+ years in backend software development, including 2+ years focused on AI/ML platform or MLOps infrastructure.
- Deep expertise in MLOps practices and automated deployment pipelines.
- Proven experience in designing low-latency model serving solutions.
- Proficiency in Python and in writing maintainable code.
- Experience in developing large-scale distributed systems with high concurrency.
- Excellent communication and mentoring abilities.
- A relevant degree in Computer Science, Mathematics, or related fields.
