We're looking for an AI Architect to join our growing AI delivery team. You'll design and build large language model (LLM) systems that move beyond experimentation and into real-world production, powering search, summarization, knowledge assistants, and automation for enterprise clients.
This is a hands-on, execution-focused role. You'll work closely with product managers, engineers, and AI specialists to ship scalable solutions. You won't be buried in research or building theoretical models—you'll be deploying actual systems that users rely on every day.
Requirements
What You'll Do
- Architect end-to-end GenAI systems, including prompt chaining, memory strategies, token budgeting, and embedding pipelines
- Design and optimize RAG (Retrieval-Augmented Generation) workflows using tools like LangChain, LlamaIndex, and vector databases (FAISS, Pinecone, Qdrant)
- Evaluate tradeoffs between zero-shot prompting, fine-tuning, LoRA/QLoRA, and hybrid approaches, aligning solutions with user goals and constraints
- Integrate LLMs and APIs (OpenAI, Anthropic, Cohere, Hugging Face) into real-time products and services with latency, scalability, and observability in mind
- Collaborate with cross-functional teams—translating complex GenAI architectures into stable, maintainable features that support product delivery
- Write and review technical design documents and remain actively involved in implementation decisions
- Deploy to production with industry best practices around version control, API lifecycle management, and monitoring (e.g., hallucination detection, prompt drift)
What You'll Bring
- Proven experience building and deploying GenAI-powered applications, ideally in enterprise or regulated environments
- Deep understanding of LLMs, vector search, embeddings, and GenAI design patterns (e.g., RAG, prompt injection protection, tool use with agents)
- Proficiency in Python and fluency with frameworks and libraries like LangChain, Hugging Face Transformers, and the OpenAI SDKs
- Experience with vector databases such as FAISS, Qdrant, or Pinecone
- Familiarity with cloud infrastructure (AWS, GCP, or Azure) and core MLOps concepts (CI/CD, monitoring, containerization)
- A delivery mindset—you know how to balance speed, quality, and feasibility in fast-moving projects
Nice to Have
- Experience building multi-tenant GenAI platforms
- Exposure to enterprise-grade AI governance and security standards
- Familiarity with multi-modal architectures (e.g., text + image or audio)
- Knowledge of cost-optimization strategies for LLM inference and token usage
Who This Role Is Not For
- ML researchers focused on academic model development without delivery experience
- Data scientists unfamiliar with vector search, LLM prompt engineering, or system architecture
- Engineers who haven't shipped GenAI products into production environments
Benefits & Growth Opportunities:
- Competitive salary and performance bonuses
- Comprehensive health insurance
- Professional development and certification support
- Opportunity to work on cutting-edge AI projects
- International exposure and travel opportunities
- Flexible working arrangements
- Career advancement opportunities in a rapidly growing AI company
Ready to apply?
Join Deeplight AI and take your career to the next level!