EPAM Systems

Lead Site Reliability Engineer

Argentina · Full-time · Mid-Senior

We are looking for an experienced Lead Site Reliability Engineer to join our team and drive the development of reliable and scalable infrastructure.

In this role, you will work closely with software and operations teams to ensure seamless integration between infrastructure and applications. You will be instrumental in maintaining high system reliability, optimizing scalability, and driving operational excellence using modern tools and technologies.

 

Responsibilities

  • Partner with software teams to ensure smooth integration of infrastructure and application systems
  • Implement SRE principles and engineering practices to build, monitor, and operate complex infrastructure solutions
  • Utilize automation tools to improve operational workflows and enhance system reliability
  • Architect and maintain scalable web systems and cloud-based platforms
  • Develop efficient and maintainable code using languages like Golang, Python, Ruby, and Scala
  • Diagnose and resolve issues promptly in high-pressure situations
  • Monitor system performance and implement measures to guarantee consistent uptime and reliability

 

Requirements

  • At least 5 years of experience in developing, managing, or supporting large-scale Linux-based web application systems
  • At least 1 year of experience leading and managing development teams
  • Proficiency in UNIX systems administration with expertise in scripting languages such as Python, PHP, or Bash
  • Practical experience running Docker with orchestration tools like Nomad, Kubernetes, or Amazon ECS
  • Familiarity with configuration management tools like Ansible, Chef, or Puppet (Puppet experience preferred)
  • Strong communication skills and ability to work effectively with distributed teams
  • Ability to produce clean, well-documented, and easy-to-understand systems and scripts
  • Eagerness to continuously learn and work with new technologies and programming languages
  • English proficiency at B2+ level or higher, both written and spoken

 

Nice to have

  • Knowledge of observability and performance monitoring tools such as ELK, Prometheus, New Relic, Sentry, or Lightstep
  • Proficiency in Ruby or Scala for development and scripting tasks

 

We offer

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

Key Skills

Ranked by relevance

python, puppet, ruby, configuration management, kubernetes, prometheus, ansible, docker, golang, scala, linux, cloud, unix, php, elk
Posted: Sep 26, 2025
Type: Full-time
Level: Mid-Senior
Location: Argentina

Industries

Software Development IT Services IT Consulting Technology Information Internet

Categories

Engineering Information Technology Business Development
