<Expectations for the Role>
- Near real-time conversational agents – design and deploy multi-modal, tool-using agents that classify inquiries, ask clarifying questions, and draft estimates (RAG pipelines, function calling, etc.; a minimal sketch of this flow follows the list).
- Vector search & knowledge graphs – build and tune semantic search over Firestore + Weaviate, exploring graph-based representations where useful.
- Model evaluation – establish repeatable benchmarks, offline/online metrics, and automated regression tests so we know when a new prompt or fine-tune is truly better (see the evaluation sketch after this list).
- Prototyping → Production – craft PoCs in notebooks, then convert the winners to clean, tested services running on Cloud Run (Python FastAPI, occasional Go/Rust helpers).
- Collaboration – pair closely with product and design to ship features end-to-end.
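To make the agent and retrieval bullets above concrete, here is a minimal sketch of the kind of flow involved: semantic retrieval from a vector store followed by a tool-calling LLM step that drafts an estimate. It assumes the OpenAI Python SDK and the Weaviate v4 client; the `InquiryDocs` collection, the `draft_estimate` tool, and the model name are hypothetical placeholders, not the team's actual setup.

```python
# Minimal RAG + function-calling sketch (assumes OpenAI SDK v1 and the Weaviate v4 client).
# Collection and tool names are placeholders, not the production schema.
import json

import weaviate
from openai import OpenAI

oai = OpenAI()  # reads OPENAI_API_KEY from the environment


def retrieve_context(question: str, k: int = 3) -> list[str]:
    """Semantic search over a hypothetical 'InquiryDocs' collection."""
    with weaviate.connect_to_local() as wv:
        docs = wv.collections.get("InquiryDocs")
        result = docs.query.near_text(query=question, limit=k)
        return [str(obj.properties.get("text", "")) for obj in result.objects]


# A single hypothetical tool the model may call to draft an estimate.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "draft_estimate",
        "description": "Draft a cost estimate for a customer inquiry.",
        "parameters": {
            "type": "object",
            "properties": {
                "summary": {"type": "string"},
                "line_items": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["summary", "line_items"],
        },
    },
}]


def handle_inquiry(question: str) -> dict:
    """Classify the inquiry, then either ask a clarifying question or draft an estimate."""
    context = "\n".join(retrieve_context(question))
    response = oai.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system", "content": "Classify the inquiry, ask clarifying "
             "questions if needed, otherwise draft an estimate using the tool."},
            {"role": "user", "content": f"Context:\n{context}\n\nInquiry: {question}"},
        ],
        tools=TOOLS,
    )
    msg = response.choices[0].message
    if msg.tool_calls:  # the model chose to draft an estimate
        return json.loads(msg.tool_calls[0].function.arguments)
    return {"clarifying_question": msg.content}
```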
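For the evaluation bullet, a rough illustration of an offline regression check: score a candidate prompt or model against a small golden set and fail when quality drops below the current baseline. The golden-set format, the exact-match metric, and the baseline threshold are illustrative assumptions only.

```python
# Offline regression sketch for prompt / model changes (illustrative only).
# The golden-set format, metric, and baseline threshold are assumptions.
import json
from pathlib import Path
from typing import Callable


def exact_match(prediction: str, reference: str) -> float:
    """Crude metric: 1.0 when normalized strings match, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def evaluate(answer_fn: Callable[[str], str], golden_path: Path) -> float:
    """Run the candidate system over a JSONL golden set and average the metric."""
    rows = [json.loads(line) for line in golden_path.read_text().splitlines() if line]
    scores = [exact_match(answer_fn(r["question"]), r["expected"]) for r in rows]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    BASELINE = 0.80  # score of the currently deployed prompt (assumed)

    def candidate(question: str) -> str:
        # Placeholder for the new prompt or fine-tune under test.
        return "stub answer"

    score = evaluate(candidate, Path("golden_set.jsonl"))
    print(f"candidate score: {score:.3f}")
    assert score >= BASELINE, "regression: candidate scores below the baseline"
```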
<Must Have Requirements>
- Preferably a graduate of a top Vietnamese university (HCMUT, HCMUS, HUST, UIT, etc.) and/or a top university elsewhere in the world.
- Have 4–5 years of experience building ML or data-intensive systems (industry or advanced graduate work).
- Write clean Python and are fluent in at least one deep-learning framework (PyTorch preferred).
- Understand the maths enough to debug when a model or retrieval step misbehaves.
- Deeply understand the mechanics of RAG, vector databases, and prompt engineering, and can debug when a retrieval step or LLM response is suboptimal.
- Communicate clearly in English (Japanese a plus) and enjoy explaining trade-offs to non-ML teammates.
- Like the idea of being the first dedicated ML hire and setting best practices from scratch.
- Language: Working-level proficiency in English.
<Nice to Have>
- Hands-on experience with Google's AI stack (Vertex AI, Gemini models, Agent Builder, BigQuery)
- Experience developing AI applications using AWS Bedrock
- Experience developing AI applications using Google Cloud AI
- Experience developing with AI-assisted editors or agents such as Claude Code, Cursor, or Windsurf
- Experience building applications that utilize Generative AI or Large Language Models (LLMs)
- Experience in fine-tuning or distillation of LLMs for multilingual tasks (e.g., Japanese and English)
- Experience operating vector databases such as Weaviate and related evaluation tooling
- Proven record of public technical contributions, such as open-source projects or technical blog publications
<Tech Stack>
- LLM / Agents: OpenAI, Ollama (local LLMs), LangChain
- Retrieval: Firestore + vector DB, Cloud Storage
- Serving: Python FastAPI, Cloud Run, Docker/Podman (sketch below)
- CI/CD: GitHub Actions, Terraform (coming)
- Monitoring: Cloud Logging, Prometheus, OpenTelemetry
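As a rough picture of the serving layer named above, here is a minimal FastAPI service of the kind that would run on Cloud Run. The endpoint name, request schema, and stubbed answer function are assumptions for illustration, not the team's actual service.

```python
# Minimal FastAPI service sketch for Cloud Run (endpoint and schema are assumptions).
import os

import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="inquiry-agent")


class Inquiry(BaseModel):
    question: str


class Answer(BaseModel):
    answer: str


def answer_question(question: str) -> str:
    # Placeholder: in practice this would call the retrieval + LLM pipeline.
    return f"(stub) received: {question}"


@app.post("/answer", response_model=Answer)
def answer(inquiry: Inquiry) -> Answer:
    return Answer(answer=answer_question(inquiry.question))


@app.get("/healthz")
def healthz() -> dict:
    return {"status": "ok"}


if __name__ == "__main__":
    # Cloud Run injects the listening port via the PORT environment variable.
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", "8080")))
```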
<Benefits>
- Paid Vacations
- Annual Bonus: 1-month salary
<Note>
- This is a full-time position requiring 40 hours per week, but it will be structured as contractor work.
- Devices: You will be expected to use your own computer to perform the work.
- Sole Employment: No second job is permitted.
_______________________________
If you are interested in this position, please:
- Email [email protected] if you have any questions
- Send your CV through the application form: https://sheets.qlay.ai/dashboard/#/nc/form/5658d2ad-01db-4fe5-bce2-0833cf3de095
Ready to apply?
Join Canaan Advisors and take your career to the next level!