Slingshot Aerospace

Senior AI Engineer, Agentic Evaluation & V&V

11 days ago
Remote, Worldwide
Senior

Base Salary

$150k - $250k/yr

Responsibilities

  • Extend and maintain Slingshot’s V&V SDK and evaluation framework for agentic AI systems.
  • Design and implement agent-level and end-to-end evaluations, including benchmark scenarios and scoring logic.
  • Build benchmark scenarios and tooling to measure planning, reasoning, and operational performance.
  • Translate astrodynamics and mission-domain concepts into executable evaluation scenarios.
  • Develop reusable SDK interfaces and evaluation utilities that connect V&V systems and agent workflows.
  • Define and apply metrics for capability evaluation and failure analysis.
  • Partner with cross-functional teams to identify evaluation needs.
  • Contribute to best practices for evaluating complex, autonomous AI systems.
  • Uphold strong engineering standards through testing and documentation.

Requirements

  • 6+ years of experience in software engineering, machine learning engineering, or applied AI.
  • Strong Python engineering skills with experience building SDKs or evaluation tooling.
  • Experience designing evaluation frameworks, benchmarks, or test harnesses for AI/ML systems.
  • Ability to analyze system behavior and evaluate performance in complex systems.
  • Familiarity with modern agent frameworks and orchestration patterns.
  • Experience working in cross-functional, multidisciplinary teams.
  • Strong written and verbal communication skills.
  • Bachelor’s degree in a relevant science or engineering field.
  • Must be a U.S. citizen and eligible for a government security clearance.

Tech Stack

MLflow, Python

Categories

AI & ML, Data Science