Location
Toronto
Posted
June 01, 2026
Job Description
Requirements
- You enjoy rapidly building prototypes that demonstrate the boundaries of what LLMs are capable of, and you have developed resources to measure those capabilities.
- You have spent dozens of hours reviewing complex data and LLM outputs to ensure high data quality.
- You are obsessive about rigorously measuring AI capabilities, and also about making sure your measurements actually align with the capabilities you care about.
- You have strong software engineering skills.
- If some of the above doesn't line up perfectly with your experience, we still encourage you to apply!
What the job involves
- Evaluation is critical to making progress in scaling intelligence.
- As models continue to become superhuman in many real-world use cases, we must continue to develop new evaluation techniques that accurately reflect what models are already capable of, as well as set the agenda for what future models sho...