Why this role
Exceptional models are built on strong data foundations. In this position, you will design and develop an automated labeling system that transforms large collections of unlabeled video and image data into high-quality training signals without relying on human annotation. Your work will have a direct impact on model accuracy, robustness, and iteration speed.
What you will do
• Build an automated labeling pipeline by creating methods that extract training signals from raw data, using techniques such as self-supervision, weak supervision, pseudo-labeling, teacher-student distillation, and synthetic augmentation.
• Develop reliability and quality assurance techniques without human input, including confidence calibration, uncertainty estimation, ensemble and consensus checks, automatic error detection, and metrics that forecast downstream model improvements.
• Ensure temporal and spatial consistency by enforcing cross-frame alignment, tracking identities and structures over time, and designing automatic repair strategies for drift and occlusions.
• Create active data-selection strategies that rank and curate raw data based on informativeness, novelty, or scarcity, and implement scalable sampling and replay policies.
• Integrate models into the training loop so they improve continuously with updated pseudo-labels, while also automating evaluation gates and rollback procedures.
• Build supporting tools and infrastructure, including reliable ETL processes, versioned datasets, lineage tracking, and lightweight dashboards for monitoring data health and coverage.
• Produce clear documentation covering assumptions, failure modes, decision rules, and research-to-production handoff details.
What you have accomplished
• At least three years of experience in machine learning with a focus on computer vision or representation learning, along with strong skills in Python and at least one of PyTorch, TensorFlow, or JAX.
• Delivered at least one system that uses self-supervised, weakly supervised, or pseudo-labeling methods at scale.
• Practical experience with uncertainty estimation and calibration methods such as Monte Carlo dropout, ensembles, and temperature scaling, as well as automated quality filters.
• Experience designing data-selection or active-learning loops and evaluating their effect on downstream metrics.
• Strong software engineering habits, including reproducible training workflows, version control for code and data, and CI practices for machine learning such as unit and integration tests for data and metrics.
Nice to have
• Experience with video understanding, including temporal models, tracking, and cross-frame consistency.
• Background in synthetic data, student-teacher distillation, or constraint-based labeling.
Join microagi and take your career to the next level!

