Location
Divinópolis
Posted
May 30, 2026
Job Description
AI Benchmark Engineer
Turing is one of the world’s fastest-growing AI companies, accelerating the advancement and deployment of powerful AI systems. As an AI Benchmark Engineer, you will design and build high‑quality multi‑agent benchmark tasks based on real‑world software engineering workflows. These tasks are derived from real open‑source code changes, such as bug fixes, migrations, and refactors, and are used to evaluate how effectively AI agents can understand large codebases, apply precise modifications, and produce correct, testable outputs.
Responsibilities
- Build multi‑agent benchmark tasks based on real‑world open‑source code changes.
- Use the Harbor evaluation framework to run and validate tasks within Docker environments.
- Write clear, precise task instructions specifying file paths, function signatures, expected behavior, and constraints.
- Design and implement Python‑based verification scripts to validate cor...