Preference Model

Reinforcement Learning Environments Engineer - Cybersecurity

Posted about 5 hours ago
San Francisco, CA, USA
Mid Level / Senior

Responsibilities

  • Design and build RL environments and reward functions for security tasks.
  • Create environments covering the full vulnerability lifecycle: discovery, exploitation, and patching.
  • Develop environments for reverse engineering tasks across various code types.
  • Construct verifiable reward signals using various security tools.
  • Collaborate with team members to brainstorm and improve how environments are built.

Requirements

  • Strong security fundamentals, with an interest in both offensive and defensive work.
  • Hands-on experience in finding, exploiting, or patching vulnerabilities.
  • Proficiency in Python and systems programming, with comfort in low-level languages.
  • Familiarity with security tooling such as fuzzers and debuggers.
  • Problem-solving skills with a drive for end-to-end solutions.

Benefits

  • Competitive cash and equity compensation.
  • Ownership and autonomy in a fast-moving startup environment.
  • Opportunity to work with top machine learning engineers.
  • Health, vision, and dental benefits.
  • 401(k) match and visa sponsorship available.
