Job Summary
We are seeking a skilled AI Engineer with at least 3 years of hands-on experience in designing, building, and deploying Large Language Model (LLM)-based solutions. The ideal candidate will own the end-to-end lifecycle of AI applications, from high-performance model inference and optimization to the development of advanced Agentic AI workflows using RAG and CAG patterns. This role requires close collaboration with product, data, and engineering teams to translate business requirements into scalable, reliable, and cost-efficient AI systems.
Mandatory Skills & Qualifications
- Bachelor’s degree in Information Technology, Computer Science, Finance, or a related field.
- At least 3 years of experience working with Large Language Models (LLMs) in production environments.
- Hands-on expertise with vLLM and model quantization techniques such as AWQ and GPTQ.
- Strong proficiency in Apache Airflow for scheduling and orchestrating complex data and AI pipelines.
- Experience with RAGFlow or similar deep-document Retrieval-Augmented Generation (RAG) frameworks.
- Practical experience with vector databases (e.g., FAISS, Milvus, Pinecone, Weaviate).
- Proven ability to design and implement multi-agent systems that leverage tools and external APIs to perform multi-step tasks.
- Advanced proficiency in Python, Docker, and Kubernetes.
- Experience using AI observability and monitoring tools to track latency, cost, throughput, and hallucination rates.
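The vector-database requirement above boils down to one core operation: similarity search over an embedding matrix. A minimal sketch in plain Python and NumPy, with toy 3-dimensional vectors standing in for a real encoder's embeddings (a production system would use FAISS, Milvus, Pinecone, or Weaviate for scale, but the retrieval logic is the same):

```python
import numpy as np

def cosine_top_k(query_vec, doc_matrix, k=2):
    """Return indices of the k documents most similar to the query.

    doc_matrix: (n_docs, dim) array of document embeddings.
    query_vec:  (dim,) array holding the query embedding.
    """
    doc_norms = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    q_norm = query_vec / np.linalg.norm(query_vec)
    scores = doc_norms @ q_norm          # cosine similarity per document
    return np.argsort(scores)[::-1][:k]  # indices of the top-k scores

# Toy "embeddings" standing in for a real encoder's output.
docs = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
])
query = np.array([1.0, 0.05, 0.0])
top = cosine_top_k(query, docs, k=2)  # documents 0 and 2 are closest
```

The retrieved documents would then be stuffed into the LLM prompt as context, which is the "augmented generation" half of RAG.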
Key Responsibilities
- Configure, deploy, and optimize vLLM and other inference frameworks to ensure low-latency, high-throughput LLM serving.
- Design and implement RAG pipelines using vector databases and Cache-Augmented Generation (CAG) strategies to reduce redundant computation and improve response quality.
- Deploy and tune vLLM clusters to support scalable, production-grade API endpoints for multiple open-source LLMs.
- Design, implement, and maintain Apache Airflow DAGs and RAGFlow pipelines to automate the AI lifecycle, including data ingestion, indexing, evaluation, and prompt/version management.
- Develop, version-control, and continuously refine system prompts, applying techniques such as Chain-of-Thought (CoT) to improve reasoning accuracy and consistency.
- Implement CAG strategies to optimize KV cache reuse and minimize compute costs for long-context and multi-step AI tasks.
- Build and refine Agentic AI workflows, enabling autonomous task planning, tool usage, and API orchestration across different LLM backends.
- Monitor and analyze AI system performance using observability tools, ensuring reliability, cost efficiency, and controlled hallucination rates.
- Collaborate with cross-functional teams to align AI solutions with business objectives, security standards, and scalability requirements.
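The agentic-workflow responsibilities above reduce to a plan → act → observe loop: a model chooses a tool, the runtime executes it, and the observation is fed back until the task is done. A minimal sketch, with a stubbed planner standing in for a real LLM and hypothetical tool names (nothing here is a specific framework's API):

```python
def search_tool(query: str) -> str:
    # Stand-in for an external API call (e.g. a search service).
    return f"results for '{query}'"

def calc_tool(expression: str) -> str:
    # Stand-in for a sandboxed calculator tool.
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"search": search_tool, "calc": calc_tool}

def stub_planner(goal: str, history: list) -> dict:
    """Pretend LLM: emits one tool call, then finishes.
    A real agent would prompt an LLM to choose the next action."""
    if not history:
        return {"action": "calc", "input": "6 * 7"}
    return {"action": "finish", "input": history[-1]}

def run_agent(goal: str, planner=stub_planner, max_steps=5) -> str:
    history = []
    for _ in range(max_steps):
        step = planner(goal, history)
        if step["action"] == "finish":
            return step["input"]
        observation = TOOLS[step["action"]](step["input"])  # execute the chosen tool
        history.append(observation)
    return history[-1] if history else ""

answer = run_agent("what is 6 * 7?")
```

Swapping `stub_planner` for a function that prompts an LLM and parses its chosen action yields the real version; observability hooks for latency and cost attach naturally around the tool-execution call.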
Experience Level
- 3+ years of relevant experience in AI/ML engineering, with demonstrated production experience in LLM-based systems.
Ready to apply?
Join Elliott Moss Consulting and take your career to the next level!

