Location: fully remote
Start: ASAP
Contract: long-term
Domain: tourism
Role Overview:
We are looking for a Machine Learning Engineer (Agentic AI) to own the full lifecycle of our proprietary Vision-Language Model (VLM) that powers a web-based agent for online bookings. You will train and deploy the model, build end-to-end data pipelines, and lead the RLHF process to ensure the agent generalizes across multiple booking platforms.
Key Responsibilities:
Train and deploy our proprietary Vision-Language Model (VLM) that powers a web agent for booking workflows.
Design, build, and maintain end-to-end data pipelines: data collection, cleaning, labeling, feature engineering, and model-ready datasets from several booking platforms and user interaction logs.
Own and manage the RLHF process:
- Define feedback schemas and annotation guidelines.
- Work with labelers/annotators to collect preference data and corrections.
- Train reward models and run policy optimization (e.g. PPO/DPO or similar).
Continuously evaluate and improve model generalization across different booking platforms and UI variations (A/B tests, offline metrics, human eval).
Work closely with product and engineering to translate business needs into ML requirements and ship reliable, user-facing features.
Set up and maintain monitoring, experimentation, and observability for models in production (drift, quality, latency, failures).
Skills:
Must-have
- Proven hands-on experience training and deploying deep learning models (LLMs and/or VLMs) in production.
- Strong experience building end-to-end data pipelines in Python (data ingestion, transformation, labeling, storage; familiarity with workflow/orchestration tools is a plus).
- Practical RLHF experience: collecting human feedback, training reward models, and running RL-based fine-tuning (e.g. PPO, DPO, or similar methods).
Technical
- Strong proficiency in Python and the modern ML stack (e.g. PyTorch/TensorFlow, Hugging Face ecosystem or similar).
- Solid understanding of deep learning for language and/or vision-language models: pretraining, fine-tuning, evaluation, and prompt engineering.
- Experience with MLOps practices: experiment tracking, reproducibility, model/version management, CI/CD for ML.
- Comfortable working with cloud environments and containers (Docker; any major cloud provider).
- Good knowledge of data structures, algorithms, and software engineering best practices (testing, code reviews, clean architecture).
Nice-to-have
- Experience with agentic frameworks (e.g. LangChain, LlamaIndex, custom tool-calling agents).
- Background in recommendation systems, ranking, or conversational agents.
- Experience in travel/booking/marketplace domains.
Join BetterQA and take your career to the next level!

