Together AI

Forward Deployed Engineer (Inference & Post-Training)

4 days ago
San Francisco, CA, USA
Senior / Staff+
H1B Sponsor

Base Salary

$270k - $300k/yr

Responsibilities

  • Select, configure, and optimize inference engines based on hardware and workload profiles.
  • Develop configuration updates to improve proofs of concept (POCs) and optimize customer deployments.
  • Drive hands-on reinforcement learning (RL) training runs and guide customers through post-training pipelines.
  • Act as the primary technical point of contact for strategic accounts.
  • Establish alignment with customers during onboarding to improve time-to-value.
  • Influence the software and model roadmap by providing field insights.

Requirements

  • 5+ years in a technical role focused on inference systems or post-training workflows.
  • Expert-level experience with inference engines like vLLM and TensorRT-LLM.
  • Deep knowledge of KV cache tuning, speculative decoding, and quantization techniques.
  • Hands-on experience with fine-tuning and post-training pipelines.
  • Broad knowledge of state-of-the-art open-source models and their applications.
  • Strong Python skills and comfort in production environments.

Benefits

  • Competitive compensation and startup equity.
  • Health insurance and other benefits.
  • Flexibility in remote work arrangements.

Categories

AI & ML, Data Science