Preference Model

Senior Software Engineer, RL Environments

15 days ago
San Francisco, CA, USA
Senior / Staff+

Responsibilities

  • Design, build, and refine RL tasks from ideation to iteration.
  • Own complex environments involving multi-step workflows and stakeholder interactions.
  • Critically evaluate coding agents' outputs and direct their work effectively.
  • Identify model capability gaps and redesign tasks to target subtle failure modes.
  • Contribute to the shared infrastructure and tooling for the environments team.
  • Mentor newer engineers as the team grows.

Requirements

  • Deep software engineering experience across multiple domains.
  • Genuine expertise in at least one specialty such as infrastructure or distributed systems.
  • Proficiency in Python.
  • Extensive hands-on experience with coding agents like Claude Code or Codex.
  • Strong intuition for model behavior and anticipating shortcuts.
  • Ability to work independently on complex, ambiguous problems.

Benefits

  • Competitive cash and equity compensation above the 90th percentile.
  • Ownership and autonomy in a fast-moving startup environment.
  • Opportunity to collaborate with top machine learning engineers.
  • Comprehensive health, vision, and dental benefits.
  • 401(k) match available.
  • Visa sponsorship and relocation support offered.
