We are seeking a highly skilled and motivated Senior Site Reliability Engineer to be a key member of our team, driving operational excellence and improving the reliability, scalability, and performance of our infrastructure and product services.
Responsibilities
- Provide L3 on-call support, ensuring rapid response to incidents
- Define and implement effective SLI/SLO metrics for product monitoring
- Perform detailed root cause analysis to resolve critical issues
- Conduct postmortems and organize drills to improve readiness
- Analyze product performance, scalability, and reliability to optimize service delivery
- Automate operational tasks to reduce manual intervention
- Implement CI/CD pipelines using tools like Jenkins, GitLab CI, or Azure DevOps
- Manage cloud infrastructure and configurations to support Infrastructure-as-Code initiatives
- Utilize configuration management tools such as Ansible to maintain consistency across environments
- Collaborate closely with cross-product teams and business stakeholders to align reliability goals with project objectives
Requirements
- 5+ years of experience working in Site Reliability Engineering or similar roles
- Intermediate knowledge of scripting languages such as Python, Go, Bash, or PowerShell
- Solid knowledge of cloud platforms, including AWS, Azure, or GCP
- Familiarity with observability tools such as Prometheus, Grafana, Datadog, ELK, or Zabbix
- Expertise in cloud infrastructure management tools, including Terraform and one of the cloud CLIs (gcloud, az, aws)
- Proficiency in containerization technologies like Docker and Kubernetes (K8s)
- Capability to define and monitor SLI/SLO metrics for system reliability
- Thorough understanding of postmortem and drill procedures to enhance incident handling processes
- B2-level English proficiency in both speaking and writing
Nice to have
- Demonstrated experience implementing CI/CD pipelines using Groovy SDK or Jenkinsfile
- Background in working with large-scale production systems requiring high availability
- Familiarity with advanced monitoring practices using tools such as Dynatrace
- Skills in scaling Kubernetes clusters and optimizing containerized applications
- Flexibility to use diverse scripting languages to automate complex workflows
We offer
With us you can:
- Work on a flexible schedule remotely or from any of our comfortable offices or coworking spaces in Ukraine
- Receive the necessary equipment to perform your work tasks
- Change projects and technology stacks within EPAM
- Gain experience in various business domains (Insurance, E-commerce, Healthcare, Finance, Travel, Media, Artificial Intelligence, and more)
- Relocate to other EPAM locations, if eligible, depending on the role and available openings
- Participate in volunteer and charity programs and in communities (both technical and interest-based)
We focus on your professional growth:
- Plan your individual career path together with your manager
- Receive regular feedback from colleagues
- Improve your English for free with certified teachers (Speaking Clubs, client interview preparation courses, etc.)
- Get the opportunity to undergo free training and certification in AWS, GCP, or Azure Clouds
- Use the internal E-learn training program (18,200+ specialized training and mentoring programs)
- Access corporate accounts on LinkedIn Learning, getAbstract, and other partner resources
- Study at EPAM Solution Architecture School with the instructors who are practicing architects
- Develop as a leader by joining the Delivery Management, Resource Management, or Leadership Essentials schools, and more
- Participate in internal communities (500+ meetups, technical discussions, brainstorming sessions, online events and conferences annually)
What we offer:
- Vacation and sick leave (including sick leave without a medical certificate)
- A wide range of Voluntary Medical Insurance programs providing both medical treatment and various preventive options (including sports activities)
- Medical insurance for family members at corporate rates
- Company support during significant life events (childbirth or adoption, marriage, etc.)
- Support for psychological comfort: discounts on services from mental health specialists or coaches, thematic training
- E-kids program - a free programming language training program for EPAMers' children
Please note that this role supports remote work only from within Ukraine.
Please note that the set of benefits, including learning, certification, and other opportunities, may vary depending on the role you apply for. Our recruiter can share more details about the specific opportunity during your general interview.
EPAM strives to provide its global team of over 61,700 professionals in more than 55 countries with opportunities for professional growth from day one of collaboration. Our colleagues are the source of EPAM's success, so we value cooperation, strive to always understand our clients' business and aim for the highest quality standards. No matter where you are, you will join a dedicated, diverse community that will help you realize your potential to the fullest.
Posted: Aug 27, 2025
Type: Full-time
Level: Mid-Senior
Location: Ukraine
Company: EPAM Systems