Crossing Hurdles
Technical Expert | Remote
Crossing HurdlesCanada7 days ago
ContractRemote FriendlyInformation Technology, Research

Position: PhD Rater

Type: Part-Time

Compensation: $50–$100/hour

Location: Remote

Commitment: 30+ hours/week (primarily weekdays)


Role Responsibilities

  • Design challenging, real-world STEM benchmark problems in domains such as data science, machine learning, finance, and software engineering.
  • Implement tasks within an agentic development environment using Python.
  • Create reproducible problem setups with clear specifications and executable tests.
  • Evaluate and analyze AI model behavior, including reasoning traces and agent workflows.
  • Diagnose reasoning failures, logic gaps, and problem-solving limitations in AI systems.
  • Contribute to improving benchmark quality and evaluation frameworks for frontier AI models.


Requirements

  • Active or recently graduated PhD.
  • Deep expertise in data science, machine learning, finance, and/or Python-based software development.
  • Strong research background in advanced STEM topics.
  • Ability to commit reliably for 30+ hours per week.
  • Demonstrated technical output such as high-quality open-source contributions or research work.
  • Ability to analyze agent behavior traces and diagnose failures beyond surface-level errors.


Application Process

  • Upload resume
  • Interview
  • Submit form

Key Skills

Ranked by relevance