EPAM Systems

Senior Site Reliability Engineer

Argentina · Full-time · Mid-Senior

We are actively seeking a Senior Site Reliability Engineer to join our team, with a focus on independently managing complex tasks, including infrastructure enhancements and automating development and deployment processes.

The ideal candidate will have several years of experience, demonstrate deep troubleshooting skills, and excel at resolving platform-related issues. In this position, you will participate in sprint planning and story grooming sessions and contribute insights on implementation challenges. Reporting to the Engineering Manager, this role is crucial for maintaining high operational standards and improving our engineering processes.

Responsibilities


  • Independently investigate and resolve platform-related issues
  • Analyze, create, and improve automation processes
  • Craft scripts that automate different tasks
  • Participate actively in sprint meetings and story estimation
  • Monitor production APM tools, such as Datadog
  • Handle the collection and analysis of application logs
  • Oversee application and instance alerts to maintain site reliability
  • Engage in architecture discussions regarding infrastructure
  • Keep applications and libraries used on the platform updated
  • Manage methods and servers for code deployment
  • Provide mentorship and support to fellow engineers
  • Conduct code reviews


Requirements


  • A minimum of 3 years in a Site Reliability Engineer role or similar
  • Proficiency in TypeScript, Node.js/NestJS, React Native
  • Strong background in Python/Django, familiarity with PostgreSQL, Redis
  • Competency in CircleCI, Spinnaker, Expo
  • Background in administering production application workloads on AWS Cloud
  • Understanding of Cloud networks and VPC peering
  • Skills in Cloud computing including EC2, SNS/SQS, RDS
  • Knowledge of containerization and orchestration with Docker, Kubernetes, EKS
  • Expertise in provisioning and configuration tools like Terraform, Ansible
  • Proficiency in Linux or Windows server administration
  • Capability to integrate monitoring, logging, and alerting into systems
  • Excellent collaborative debugging skills for complex issues
  • Experience with HIPAA compliance and similar standards
  • Flexibility to learn quickly and adapt to change


Nice to have


  • Knowledge of monitoring tools similar to Datadog
  • Familiarity with scripting languages such as Python, Groovy, PowerShell, or Ruby
  • Passion for or experience in the health services industry


We offer


  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn


Key Skills

Ranked by relevance

cloud, containerization, windows server, kubernetes, typescript, postgresql, powershell, terraform, circleci, python, docker, groovy, server, react, hipaa, linux, excel, aws
Posted: Dec 17, 2024
Type: Full-time
Level: Mid-Senior
Location: Argentina

Industries

Software Development, IT Services, IT Consulting, Pharmaceutical Manufacturing

Categories

Engineering, Information Technology, Business Development
