This is a unique opportunity to apply your software engineering expertise toward shaping the next generation of intelligent systems.
About The Project
You'll annotate frontier-model trajectories on SWE-bench–style tasks derived from real open-source repositories. Currently, closed-source models do not expose their internal reasoning traces, making it difficult to understand how LLMs approach problem-solving.
To address this gap, you'll reconstruct and annotate the reasoning portions of model trajectories, using your own problem-solving process and the full task context to infer and fill in the underlying thought process at each step.
Key Responsibilities
- Design benchmark tasks by ideating a vulnerability class (type/subtype + difficulty) and validating the intended exploit behavior
- Create or validate small runnable codebases (“environment/” repos) that include ingestion plus prompt/tool usage where the trust boundary is violated
- Validate the attack via an exploit script and document the unsafe behavior clearly
- Validate implementation of a patch that prevents the exploit and verify the fix is effective
- Produce task metadata (e.g., severity mapping, exact file/line locations, impact analysis, remediation summary, references)
- Conduct review and QC to ensure that paths resolve, line ranges are correct, labels aren’t leaked, and the fix blocks the exploit
Qualifications
- 2+ years of experience in software engineering, with a focus on application security, vulnerability research, or secure software engineering
- Degree in Software Engineering, Computer Science, or a related field (Bachelor's minimum; advanced degree preferred)
- Strong proficiency in Python, JavaScript, TypeScript, or other common languages found in open-source projects
- Familiarity with version control workflows (Git, PRs, issue tracking)
- Comfortable articulating technical reasoning in clear, structured writing
Engagement Details
- Start Date: Immediate
- Duration: 1–2 months
- Commitment: Part-time (15–25 hours/week, with flexibility up to 40 hours/week)
Application Process
- Upload your resume
- AI interview: A short, 15-minute conversational session to understand your background, experience, and interest in the role
- Follow-up communication within a few days with next steps and onboarding details
Ready to apply?
Join Mercor and take your career to the next level!
Application takes less than 5 minutes

