EPAM Systems

Lead Machine Learning Engineer – Real‑Time Agentic AI

Turkey · Full-time · Mid-Senior

We are seeking a highly skilled and innovative Lead Machine Learning Engineer to design, deploy, and optimize advanced AI systems for real-time, agent-based interactions. In this leadership role, the ideal candidate will combine deep technical expertise in machine learning engineering, advanced model orchestration, and real-time data processing with a strong ability to mentor and collaborate across teams to deliver impactful AI-driven solutions.

 

Responsibilities

  • Architect and optimize streaming agent frameworks (LangChain, LangGraph, or custom) for dialogue, task planning, and tool use
  • Implement concurrency & session‑management strategies to support thousands of simultaneous agent interactions
  • Design safeguards, fallback chains, and monitoring for reliability and safety
  • Update, fine‑tune, and distill large language, vision, or speech models on domain‑specific data (using LoRA, PEFT, QLoRA, etc.)
  • Build automated pipelines for continuous evaluation, drift detection, and online re‑training
  • Develop adapters or retrieval-augmented generation (RAG) loops to augment models with private knowledge bases
  • Ingest high‑volume event streams (Kafka, Pulsar, Pub/Sub, WebRTC) and serve model outputs with sub‑second latency
  • Implement vector‑DB search (Weaviate, pgvector, Milvus) and caching layers tuned for real‑time workloads
  • Optimize GPU/CPU allocation across streaming endpoints
  • Deploy containerized micro‑services (Docker, K8s, ECS) with CI/CD and infrastructure‑as‑code
  • Instrument metrics (Prometheus, Grafana) and tracing (OpenTelemetry) for cost/performance insights
  • Champion best practices for security, privacy, and policy compliance (NCA, PDPL)
  • Mentor junior engineers, review pull requests, and evangelize modern Agentic AI patterns
  • Translate business ideas into practical and timely project plans that result in scalable production systems
  • Present findings and roadmap updates to executives and cross‑functional stakeholders

 

Requirements

  • Master’s degree in Machine Learning, Artificial Intelligence, Data Science, or a related field
  • 5+ years of software or ML engineering experience, including at least 2 years in production LLM or large-scale DL environments
  • At least 1 year of relevant leadership experience
  • Deep proficiency in Python
  • Hands‑on mastery of LangChain/LangGraph (or equivalent agent frameworks) in live services
  • Proven record in shipping low‑latency streaming ML systems (websockets, gRPC, Kafka, WebRTC)
  • Good knowledge of audio/voice or multimodal pipelines, VAD, and real‑time noise suppression
  • Expertise in fine‑tuning & optimizing foundation models (transformers, diffusers, SAM)
  • Strong background in MLOps (Docker, K8s, MLflow, Weights & Biases, Ray Serve)
  • Comfortable with vector databases, embeddings, and retrieval-augmented generation
  • Solid grounding in distributed systems, concurrency, and cloud GPUs (AWS, GCP, Azure)
  • Excellent communication and mentorship skills
  • Strong command of written and spoken English (B2+ level)

 

Nice to have

  • Experience with low-code and no-code intelligent automation tools such as n8n, Zapier, and Make

 

We offer

  • CONTINUOUS UPSKILLING, LEARNING & DEVELOPMENT
    • Diversity of tasks and projects
    • Assessment center for objective review of competency level
    • Personal development plan
    • Mentoring programs and leadership development
    • Certification and professional development support
    • Access to learning platforms including more than 2,500 internal courses and the LinkedIn Learning library with 20,000+ courses
    • English courses taught by certified teachers
  • CORPORATE BENEFITS
    • Extra leave days
    • Referral bonuses
  • COMPENSATION PACKAGE
    • Competitive compensation paid in USD
    • Regular salary and performance reviews
  • MEDICAL & HEALTHCARE
    • Private health insurance
    • Well-being events
  • WORKING ENVIRONMENT
    • Recreation areas and kitchens
    • Tea, coffee, and snacks
    • Well-being events
    • Sports equipment and game consoles
    • IT Equipment
    • Microsoft's Software Assurance Home Use Program (HUP)

 

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.

 

Key Skills

Ranked by relevance

machine learning, ai, docker, kafka, artificial intelligence, technical expertise, microservices, prometheus, grafana, mlflow, cloud, mlops, grpc, cicd, aws, gcp, ecs
Posted: Aug 26, 2025
Type: Full-time
Level: Mid-Senior
Location: Türkiye

Industries

Software Development, IT Services, IT Consulting

Categories

Engineering, Information Technology, Research
