Posted about 1 month ago
Lima, Peru
Entry Level / Mid Level
Responsibilities
- Create question templates and Python scripts to test AI Agents and RAG instances.
- Evaluate Agent performance using metrics like Success Rate and Tool Use Accuracy.
- Collaborate with AI Engineers to build 'Ground Truth' and Agentic Task Datasets.
- Write unit and integration tests for Python-based RESTful APIs.
- Run load and bulk testing to assess AI performance under high-volume requests.
- Perform rigorous testing of agent payloads to identify prompt injection risks.
Requirements
- Basic understanding of Agentic workflows and LLM orchestration.
- Familiarity with AI observability tools like LangSmith or OpenTelemetry.
- Proficiency in Python 3.10+ for automation and data manipulation.
- Strong experience testing and validating RESTful APIs and JSON structures.
- Ability to find edge cases where an AI might fail to follow instructions.
- Upper-intermediate to advanced English (B2/C1) for global team collaboration.
Benefits
- 100% Remote work environment.
- Holidays off and Paid Time Off.
- Health insurance assistance program.
- Competitive USD compensation.
- Strong team culture and collaborative environment.
- Ongoing training and growth opportunities.
