Position Overview:
iMerit seeks detail-oriented and analytically minded Multimodal GenAI Evaluation Analysts to
perform highly nuanced evaluations of AI system outputs across different modalities: text,
image, video, and multimodal interactions. Analysts will assess the accuracy, appropriateness,
quality, clarity, and cultural alignment of model outputs against complex guidelines, ensuring that
results align with project standards and real-world use cases. These evaluations will directly
inform the development and fine-tuning of advanced large language models (LLMs), large vision
models (LVMs), and multimodal AI systems.
Role Responsibilities:
- Evaluate outputs generated by LLMs across multiple modalities (text, image captions,
- Assess quality against project-specific criteria such as correctness, coherence,
- Identify subtle errors, hallucinations, or biases in AI responses.
- Apply domain expertise and logical reasoning to resolve ambiguous or unclear outputs.
- Provide detailed written feedback, tagging, and scoring of outputs to ensure consistency
- Escalate unclear cases and contribute to refining evaluation guidelines.
- Collaborate with Project Managers and Quality Leads to meet accuracy, reliability, and
Skills & Competencies:
- Strong critical reading, observational, and evaluative skills across different modalities.
- Ability to articulate nuanced judgments with precision and clarity.
- Excellent English comprehension (CEFR B2 or above); additional languages a plus.
- Familiarity with LLMs, generative AI, and multimodal systems.
- Strong attention to detail and ability to apply guidelines consistently.
- Awareness of cultural and linguistic nuances, including potential bias and harm in AI
- Comfort with evolving workflows, rapid feedback cycles, and complex quality
Requirements:
- Bachelor's degree, diploma, or equivalent educational qualification.
- 1+ years of experience in data annotation, LLM evaluation, content moderation, or
- Demonstrated experience working with data annotation tools and software platforms.
- Strong understanding of language and multimodal communication (instruction following
- Ability to adapt quickly to changing project directions and fast-paced work environments.
- Previous experience creating or annotating complex data specifically for Large
- Prior exposure to generative AI, prompt engineering, or LLM fine-tuning workflows is a
While moderation of high-harm/high-risk material is not part of this role, candidates should be
aware that occasional exposure to NSFW or otherwise sensitive content may occur due to
imperfections in client-provided datasets. Applicants should indicate that they are comfortable
working in environments where such incidental exposure is a possibility.
What We Offer:
- Opportunities to shape the evaluation standards for next-generation multimodal AI
- Innovative and supportive global working environment.
- Competitive compensation and flexible remote working arrangements.
- Continuous learning and growth in applied AI evaluation.
Application Process:
- You will receive an iMerit platform assessment (15–30 minutes). If you complete it successfully, you’ll be invited to join the first project.
- After onboarding, once you’ve completed 10 hours of work, a quality test will be conducted.
- If you pass the quality test, you’ll continue on a 3-month project and will be invited to participate in upcoming projects.
- You will complete a quick 15–30 minute assessment. This requires downloading a browser extension, which can be removed once the assessment is completed.
- ID verification and background check are required.
- Onboarding will be completed through iMerit’s platform.
Commitment:
- Minimum 20 hours per week (flexible schedule).
- You may work more hours if desired.
Hourly Rates by Location:
- Malaysia – $5/hr
- Mexico, Colombia, Brazil, Costa Rica – $8.50/hr
- Argentina, Poland, Bulgaria, Romania, Malta, Latvia, Lithuania, UAE – $13/hr
- Portugal, Italy, Greece, Spain – $15.50/hr
- Canada, Australia, New Zealand, United Kingdom, Ireland, US, Finland, France, Sweden, Belgium, Austria, Denmark, Germany, Luxembourg, Estonia – $22/hr

