7 days ago
Santa Clara, CA, USA
Intern / Entry Level
Responsibilities
- Support model quantization and deployment efforts for large-scale multimodal models.
- Assist with applying model optimization techniques under guidance from senior engineers.
- Work with research and platform teams to improve model deployability.
- Contribute to deployment tools, test pipelines, and runtime modules in C++ and Python.
- Help analyze model performance, memory usage, latency, and numerical accuracy.
- Participate in debugging and performance tuning across the model and system stack.
- Support validation and testing workflows for stable deployment.
Requirements
- BS, MS, or PhD in Computer Science, Electrical Engineering, Robotics, or a related field.
- Strong programming skills in C++ and/or Python.
- Familiarity with deep learning frameworks such as PyTorch.
- Basic understanding of model inference and optimization workflows.
- Exposure to model compression or quantization concepts.
- Interest in computer architecture and performance optimization.
- Strong problem-solving skills and ability to learn quickly.
- Good communication skills and ability to collaborate with teams.
Benefits
- A fun, supportive and engaging environment.
- Infrastructure and computational resources to support your work.
- Opportunity to work on cutting-edge technologies with top talent.
- Opportunity to make a significant impact on the transportation revolution.
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.