Researcher, Loss of Control

about 2 months ago

San Francisco, CA, USA

Mid Level / Senior

Base Salary

$295k - $445k/yr

Responsibilities

Design and implement mitigation components for loss of control risk.
Integrate safeguards across product and research surfaces in collaboration with the relevant teams.
Evaluate technical trade-offs and propose pragmatic solutions.
Collaborate with risk modeling and policy partners to align mitigation design.
Execute rigorous testing and red-teaming workflows to stress-test mitigations.

Requirements

Passion for AI safety and motivation to enhance AI models for real-world use.
Demonstrated experience in deep learning and transformer models.
Proficiency with frameworks such as PyTorch or TensorFlow.
Strong foundation in data structures, algorithms, and software engineering principles.
Familiarity with training and fine-tuning methods for large language models.
Experience designing and evaluating technical safeguards for advanced AI behavior.
Background knowledge in alignment, control, interpretability, or adversarial ML is a plus.

Tech Stack

PyTorch, TensorFlow

Categories

AI & ML, Data Science, Testing