Bjak

Principal Machine Learning Engineer

28 days ago
Singapore, Singapore
Staff+

Responsibilities

  • Build and own end-to-end ML pipelines spanning data, training, evaluation, inference, and deployment.
  • Fine-tune and adapt models using state-of-the-art methods such as LoRA and QLoRA.
  • Architect and operate scalable inference systems, balancing latency, cost, and reliability.
  • Design and maintain data systems for high-quality training data.
  • Implement evaluation pipelines covering performance, robustness, safety, and bias.
  • Own production deployment, including GPU optimization and scaling policies.
  • Collaborate with application engineering to integrate ML systems into products.
  • Make pragmatic trade-offs and ship improvements quickly.

Requirements

  • Strong background in deep learning and transformer-based architectures.
  • Hands-on experience training, fine-tuning, or deploying large-scale ML models in production.
  • Proficiency with at least one modern ML framework (e.g. PyTorch, JAX).
  • Experience with distributed training and inference frameworks (e.g. DeepSpeed, Ray).
  • Strong software engineering fundamentals for robust, maintainable systems.
  • Experience with GPU optimization, including memory efficiency and quantization.
  • Comfort owning ambiguous, zero-to-one ML systems end-to-end.
  • A bias toward shipping, learning fast, and improving systems through iteration.

Tech Stack

Apache Spark, PyTorch

Categories

AI & ML, Data Engineering