about 1 month ago
Zürich, Switzerland
Staff+ / Senior
Responsibilities
- Develop and optimize real-time multimodal models and serving frameworks.
- Implement inference optimization techniques and model acceleration strategies.
- Build high-performance systems in languages such as C++, CUDA, or Rust.
- Manage distributed systems and scale them to handle concurrent connections.
- Take ownership of models from research to production, ensuring reliability.
Requirements
- Deep understanding of inference optimization and modern serving frameworks.
- Hands-on experience with model acceleration techniques like quantization and caching.
- Proficiency in high-performance programming languages and in profiling code.
- Experience with distributed systems, Kubernetes, and multi-GPU inference.
- A PhD in CS, Physics, or Math, or equivalent practical experience, is required.
- Professional fluency in English is necessary for collaboration.
Benefits
- Remote work opportunity within Switzerland.
- Full-time, permanent employment.
- Potential for U.S. visa and relocation support in the future.
