about 3 hours ago
Remote, Worldwide
Senior / Mid Level
Base Salary
$100k - $200k/yr
Responsibilities
- Improve LLM accuracy through enhanced prompts, model selection, and evaluation coverage.
- Reduce latency, token usage, and costs while maintaining decision quality.
- Design validation, retries, and fallback mechanisms for ambiguous inputs.
- Build datasets, regression tests, and dashboards for model quality monitoring.
- Enhance agent orchestration and tool usage across various services.
- Debug live issues and improve operational runbooks.
Requirements
- 3+ years of professional software engineering experience in Python, TypeScript, or similar languages.
- Hands-on experience with production systems using LLMs and model-powered workflows.
- Experience designing evaluations and quality metrics for AI systems.
- Strong debugging skills across APIs, databases, and model outputs.
- Practical understanding of prompt engineering and common LLM failure modes.
- Ability to reason about correctness in uncertain environments.
Benefits
- Competitive salaries and meaningful long-term equity participation.
- Flexible vacation policy and family care support.
- 100% remote work environment.
- At least two team-wide offsites per year.
