
Senior AI Engineer, Agentic Evaluation & V&V
Slingshot Aerospace
11 days ago
Remote, Worldwide
Senior
Base Salary
$150k - $250k/yr
Responsibilities
- Extend and maintain Slingshot’s V&V SDK and evaluation framework for agentic AI systems.
- Design and implement agent-level and end-to-end evaluations, including benchmark scenarios and scoring logic.
- Build benchmark scenarios and tooling to measure planning, reasoning, and operational performance.
- Translate astrodynamics and mission-domain concepts into executable evaluation scenarios.
- Develop reusable SDK interfaces and evaluation utilities that connect V&V systems and agent workflows.
- Define and apply metrics for capability evaluation and failure analysis.
- Partner with cross-functional teams to identify evaluation needs.
- Contribute to best practices for evaluating complex, autonomous AI systems.
- Uphold strong engineering standards through testing and documentation.
Requirements
- 6+ years of experience in software engineering, machine learning engineering, or applied AI.
- Strong Python engineering skills with experience building SDKs or evaluation tooling.
- Experience designing evaluation frameworks, benchmarks, or test harnesses for AI/ML systems.
- Ability to analyze system behavior and evaluate performance in complex systems.
- Familiarity with modern agent frameworks and orchestration patterns.
- Experience working in cross-functional, multidisciplinary teams.
- Strong written and verbal communication skills.
- Bachelor’s degree in a relevant science or engineering field.
- Must be a U.S. citizen and eligible for a government security clearance.