EPAM Systems

Senior ML Infrastructure Engineer

Argentina · Full-time · Mid-Senior

We are seeking a Senior ML Infrastructure Engineer to join our MLOps team and oversee the development and maintenance of our enterprise machine learning platform, while driving innovation in scalable ML infrastructure and deployment practices.

Responsibilities


  • Provide expert guidance on ML technologies, tools, and MLOps best practices focused on model observability, tracking, and deployment
  • Build and maintain robust batch processing and ML inference pipelines to enable efficient model execution
  • Automate ML model deployment processes with CI/CD pipelines to streamline production workflows
  • Monitor the health, performance, reliability, and scalability of deployed models and infrastructure
  • Integrate ML inference services seamlessly with other applications or systems
  • Enable scalable, high-performance deployments of ML models that perform well under production load
  • Collaborate directly with client stakeholders and team members to ensure requirements are met and tasks are completed effectively
  • Implement infrastructure solutions that support data processing pipelines and batch inferencing
  • Create comprehensive unit tests for ML deployment, inference, and post-processing methods
  • Maintain clear and proactive communication with team members and stakeholders to ensure alignment


Requirements


  • 3+ years of experience with AWS services and MLOps-related infrastructure, focusing on scalable ML model deployment
  • Expertise in infrastructure-as-code tools, enabling efficient and consistent infrastructure setup
  • Strong background in setting up and monitoring infrastructure for data pipelines and ML inference pipelines
  • Demonstrated ability to take ownership of tasks, with experience working directly with client stakeholders and cross-functional teams
  • Skills in writing unit tests for ML deployment, inference, and related methods to ensure code reliability
  • Clear and effective communication skills with the ability to seek clarifications when needed


Nice to have


  • Experience with Google Cloud Platform (GCP) and its ML-related services
  • Competency in working with Snowflake as a data platform for ML workflows
  • Familiarity with Feature Store platforms to improve feature management
  • Background in using Spark and AWS Elastic MapReduce (EMR) for distributed data processing
  • Understanding of data curation best practices for producing high-quality datasets for ML model training
  • Flexibility to participate in on-call rotations, ensuring system reliability in production environments


We offer


  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn


Key Skills

Ranked by relevance

MLOps, AWS, Google Cloud Platform, machine learning, cloud, Spark, CI/CD, GCP
Posted: Jul 16, 2025
Type: Full-time
Level: Mid-Senior
Location: Argentina

Industries

Software Development, IT Services, IT Consulting, Travel Arrangements

Categories

Engineering, Information Technology, Research
