MLOps Architect - Gen AI

about 2 months ago

Arlington, VA, USA

Senior / Staff+

Base Salary

$118k - $189k/yr

Responsibilities

Design and implement scalable ML and LLM infrastructure on AWS.
Architect end-to-end ML and Generative AI lifecycle workflows.
Integrate LLM pipelines into the enterprise MLOps stack.
Define standards for CI/CD/CT pipelines across ML and GenAI workloads.
Architect Retrieval-Augmented Generation (RAG) pipelines.
Design and deploy LLM-based services using managed services and containerized custom inference services.
Establish prompt versioning and evaluation frameworks for LLM systems.
Implement guardrails for hallucination control and safety monitoring.
Define architecture for LLM fine-tuning workflows.
Implement scalable orchestration of LLM pipelines.
Architect scalable inference patterns for traditional ML models and LLM APIs.
Implement model monitoring frameworks for performance and quality.
Define SLAs/SLOs for ML and GenAI systems.
Design safe deployment strategies.
Implement cost tracking for training workloads and inference endpoints.
Optimize LLM workloads for cost-performance tradeoffs.
Partner with finance and engineering teams to forecast ML/GenAI infrastructure spend.
Define enterprise standards for experiment tracking and model registry.
Provide architectural guidance to data science and engineering teams.
Evaluate and recommend tooling across the ML/GenAI stack.

Requirements

6+ years of experience in ML engineering, data engineering, or MLOps roles.
Proven experience architecting ML platforms in AWS.
Strong hands-on experience with SageMaker.
Experience operationalizing LLM or Generative AI systems in production.
Experience building RAG pipelines and integrating vector databases.
Experience working with Databricks in production.
Experience implementing data governance and catalog systems.
Strong understanding of CI/CD principles for ML and GenAI.
Experience with containerization and orchestration.
Deep knowledge of infrastructure-as-code.
Strong understanding of observability and monitoring for ML systems.
Experience implementing cloud cost optimization strategies.
Strong Python proficiency.
Experience with foundation model fine-tuning.
Experience implementing model registries and experiment tracking tools.
Experience designing feature stores and embedding stores.
Familiarity with AI risk management and bias mitigation.
Experience supporting regulated or data-sensitive environments.
Platform-level architectural thinking.
Deep understanding of integrating GenAI into enterprise ML ecosystems.
Ability to balance scalability, governance, security, performance, and cost.
Strong technical leadership and cross-functional collaboration skills.
Hands-on ability to move from architecture design to implementation.

Benefits

Competitive base salary range of $117,800 – $189,000.
Annual incentive compensation eligibility up to 10%.
Comprehensive medical, dental, and employer-paid vision plans.
Flexible Spending Account for qualified out-of-pocket expenses.
Lifestyle Spending Account for physical, mental, and financial well-being.
100% company-paid short-term and long-term disability insurance.
Paid maternity and parental leave.
Commuter benefits for travel expenses.
LifeBalance program offering discounts on activities and services.
Tuition reimbursement of up to $5,000 annually.
Travel reimbursement for work-related travel.
Paid time off and sick time.
401(k) plan with a 25% match on contributions.

Tech Stack

AWS
Databricks
Docker
Kubernetes
MLflow
Python
Terraform

Categories

AI & ML
Data Science
DevOps