Seeking experienced researchers and technical experts to support a frontier-model evaluation project focused on agentic workflows. You will design and validate challenging benchmark tasks in data science, machine learning, finance, and coding to help identify reasoning and problem-solving gaps in advanced STEM models. The role involves building real-world tasks with executable tests and analyzing model or agent behavior.
Key Responsibilities
- Design challenging, real-world STEM problems
- Implement each task within an agentic development environment using Python
- You will be engaged as an independent contractor.
- This is a fully remote role that can be completed on your own schedule.
- Projects can be extended, shortened, or concluded early depending on needs and performance.
- Payments are weekly on Stripe or Wise based on services rendered.
Key Skills
Ranked by relevance
machine learning
Related Jobs
3 roles aligned with this opportunity
Data Analytics & Reporting
2026-04-11
Full-time
Not Applicable
Italy
Banking
Research
Senior Backend Engineer .NET & Azure Cloud
2026-04-11
Full-time
Mid-Senior
Netherlands
Technology
Engineering
Product Designer UI/UX
2026-04-10
Full-time
Not Applicable
France
Software Development
Design
- Posted: Mar 31, 2026
- Type: Full-time
- Level: Not Applicable
- Location: Luxembourg
- Company: YO IT Consulting
Industries
Software Development
Categories
Research
Analyst
Information Technology