We are seeking a highly skilled and innovative Lead Machine Learning Engineer to design, deploy, and optimize advanced AI systems for real-time, agent-based interactions. In this leadership role, the ideal candidate will combine deep technical expertise in machine learning engineering, advanced model orchestration, and real-time data processing with a strong ability to mentor and collaborate across teams to deliver impactful AI-driven solutions.
Responsibilities
- Architect and optimize streaming agent frameworks (LangChain, LangGraph, or custom) for dialogue, task planning, and tool use
- Implement concurrency & session‑management strategies to support thousands of simultaneous agent interactions
- Design safeguards, fallback chains, and monitoring for reliability and safety
- Update, fine‑tune, and distill large language, vision, or speech models on domain‑specific data (using LoRA, PEFT, QLoRA, etc.)
- Build automated pipelines for continuous evaluation, drift detection, and online re‑training
- Develop adapters or retrieval-augmented generation (RAG) loops to augment models with private knowledge bases
- Ingest high‑volume event streams (Kafka, Pulsar, Pub/Sub, WebRTC) and serve model outputs with sub‑second latency
- Implement vector‑DB search (Weaviate, PGVector, Milvus) and caching layers tuned for real‑time workloads
- Optimize GPU/CPU allocation across streaming endpoints
- Deploy containerized micro‑services (Docker, K8s, ECS) with CI/CD and infrastructure‑as‑code
- Instrument metrics (Prometheus, Grafana) and tracing (OpenTelemetry) for cost/performance insights
- Champion best practices for security, privacy, and policy compliance (NCA, PDPL)
- Mentor junior engineers, review pull requests, and evangelize modern Agentic AI patterns
- Translate business ideas into practical and timely project plans that result in scalable production systems
- Present findings and roadmap updates to executives and cross‑functional stakeholders
Requirements
- Master’s degree in areas related to Machine Learning, Artificial Intelligence, or Data Science
- 5+ years of software or ML engineering experience, including at least 2 years in production LLM or large-scale DL environments
- At least 1 year of relevant leadership experience
- Deep proficiency in Python
- Hands‑on mastery of LangChain/LangGraph (or equivalent agent frameworks) in live services
- Proven record of shipping low‑latency streaming ML systems (WebSockets, gRPC, Kafka, WebRTC)
- Good knowledge of audio/voice or multimodal pipelines, VAD, and real‑time noise suppression
- Expertise in fine‑tuning & optimizing foundation models (transformers, diffusers, SAM)
- Strong background in MLOps (Docker, K8s, MLflow, Weights & Biases, Ray Serve)
- Working experience with vector databases, embeddings, and retrieval-augmented generation
- Solid grounding in distributed systems, concurrency, and cloud GPUs (AWS, GCP, Azure)
- Excellent communication and mentorship skills
- Strong command of written and spoken English (B2+ level)
Nice to have
- Experience with low-code and no-code intelligent automation tools such as n8n, Zapier, and Make
We offer
- CONTINUOUS UPSKILLING, LEARNING & DEVELOPMENT
  - Diversity of tasks and projects
  - Assessment center for objective review of competency level
  - Personal development plan
  - Mentoring programs and leadership development
  - Certification and professional development support
  - Access to learning platforms including more than 2,500 internal courses and the LinkedIn Learning library with 20,000+ courses
  - English courses taught by certified teachers
- CORPORATE BENEFITS
  - Extra leave days
  - Referral bonuses
- COMPENSATION PACKAGE
  - Competitive compensation paid in USD
  - Regular salary and performance reviews
- MEDICAL & HEALTHCARE
  - Private health insurance
  - Well-being events
- WORKING ENVIRONMENT
  - Recreation areas and kitchens
  - Tea, coffee, and snacks
  - Well-being events
  - Sports equipment and game consoles
  - IT Equipment
  - Microsoft's Software Assurance Home Use Program (HUP)
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
- Posted: Aug 26, 2025
- Type: Full-time
- Level: Mid-Senior
- Location: Türkiye
- Company: EPAM Systems