Role Overview
We are looking for a highly skilled Full-Stack Engineer with a focus on Python, Node.js, and TypeScript to contribute to our AI training project. This role will be pivotal in building and optimizing a sophisticated data ecosystem designed for AI model training and evaluation. You will engage in a variety of tasks, including designing APIs, developing web applications, and optimizing data processing workflows in a collaborative and innovation-driven environment.
NOTE: In this project you will not build a product. Instead, you will generate data to improve model performance for one of the largest foundation model companies.
Duration: 3 months
Commitment: 40h/week, 4h/day overlap with PST
Model: Contract, time and materials
Location: 100% Remote (LATAM, Europe, MENA)
Key Responsibilities
- Write, review, and debug code across multiple languages
- Design tasks and evaluation scenarios for coding, reasoning, and debugging
- Investigate LLM outputs and identify hallucinations, regressions, and failure modes
- Build reproducible dev environments using Docker + automation tools
- Develop scripts, pipelines, and tools for data generation, scoring, and validation
- Produce structured annotations, judgments, and high-quality datasets
- Run systematic evaluations that help improve model reliability and reasoning
Requirements
- Experience using LLM coding tools (Cursor, Copilot, CodeWhisperer)
- Strong hands-on coding experience (professional or research-based) in one or more of:
  - Python, JavaScript / Node.js, TypeScript
  - (Additional languages like Go, Java, C++, C#, Rust, SQL, R, Dart, etc. are a plus)
- Solid experience with Linux + Bash, scripting, and automation
- Strong with Docker, reproducible environments, and dev containers
- Advanced Git skills (branching, diffs, patches, conflict resolution)
- Solid understanding of testing and QA (unit, integration, negative, edge-case focused)
- Ability to reliably overlap with 8am-12pm PST
Nice to Have
- Experience with dataset creation, annotation, evaluation, or ML pipelines
- Familiarity with benchmarks like SWE Bench or Terminal Bench
- Background in QA automation, DevOps, ML systems, or data engineering
What We Offer
- Work in a fully remote environment
- Opportunity to work on cutting-edge AI projects with leading LLM companies
Join Gramian Consulting and take your career to the next level!