
CV/ML Platform Engineer
Allen Control Systems
about 2 months ago
Austin, TX, USA
Mid Level / Senior
Responsibilities
- Deploy and operate Kubernetes clusters on bare-metal infrastructure with NVIDIA GPUs.
- Manage NVIDIA GPU clusters for ML training.
- Own the ACS CV/ML CI/CD pipeline.
- Improve and maintain core ML infrastructure, including model registration and versioning.
- Enhance ML model testing, performance analysis, and reporting tools.
- Automate repetitive model training and testing tasks.
- Coordinate with Software Team Platform Engineers to minimize duplication in infrastructure.
- Collaborate with the Software Team to optimize models for deployment on edge hardware.
Requirements
- 2+ years of experience in Platform Engineering or DevOps/MLOps.
- Strong programming skills for automating ML lifecycles and building CLI tools.
- Hands-on experience with NVIDIA GPU infrastructure and CUDA libraries.
- Experience implementing and maintaining MLOps platforms like Kubeflow or MLflow.
- Familiarity with high-performance storage solutions and data orchestration tools.
- Proven track record in building CI/CD pipelines for model validation and performance benchmarking.
- Experience with model optimization toolchains for ARM targets like NVIDIA Jetson.
- Proficiency with observability stacks adapted for ML.
- Strong Linux systems knowledge, including networking and security hardening.
Benefits
- Competitive salary.
- Health, dental, and vision insurance.
- Paid time off.