
Platform Engineer
Allen Control Systems
2 months ago
Austin, TX, USA
Mid Level / Senior
Responsibilities
- Deploy and operate Kubernetes clusters on bare-metal infrastructure with NVIDIA GPUs.
- Manage NVIDIA GPU clusters for machine learning training.
- Own the full CI/CD pipeline from source to deployment.
- Build and maintain the observability stack for real-time system performance monitoring.
- Define and enforce infrastructure-as-code practices using tools such as Terraform and Ansible.
- Manage network configuration, storage provisioning, and security hardening.
Requirements
- Proficiency in Python programming and Bash scripting.
- 2+ years of experience in platform engineering or DevOps with Kubernetes.
- Deep expertise in bare-metal Kubernetes administration.
- Hands-on experience with NVIDIA GPU infrastructure and ML orchestration tools.
- Strong CI/CD experience with build automation and pipeline tooling.
- Proficiency with observability tooling for log aggregation and metrics.
- Experience building C++ and Python toolchains on Linux.
Benefits
- Competitive salary.
- Health, dental, and vision insurance.
- Paid time off.