We are seeking a highly skilled Lead Site Reliability Engineer to join our team, focused on enhancing the reliability and scalability of software and systems in Azure cloud environments.
The role centers on supporting and advancing a platform that facilitates comprehensive interactions with IoT devices, from device registration to bi-directional messaging and integration with other cloud services.
Feel free to work remotely from anywhere across Latvia or connect with colleagues at our Riga office.
Responsibilities
- Design, implement, and maintain systems with a focus on high availability and scalability across multi-region Azure cloud architectures
- Establish and regularly test disaster recovery plans
- Enhance monitoring and alerting processes using tools such as Prometheus, Grafana, Alertmanager, and OpsGenie
- Develop dashboards for visualizing system performance and reliability metrics
- Employ Terraform for infrastructure provisioning and management while implementing continuous deployment best practices
- Collaborate with the development team and communicate with the customer’s DevOps team to support development and elaborate on requirements
- Advance release management and CI/CD processes using Jenkins
- Design system support processes and write and test runbooks for operational tasks and incident management
- Manage and optimize Kubernetes, Docker/Linux environments, and Azure Networking
- Handle data management using Cosmos DB and MS SQL Server, and work with messaging systems like RabbitMQ, Kafka, and EventHub
Requirements
- 5+ years of experience as a DevOps or SRE engineer
- Proven experience with multi-region Azure cloud architectures
- Proficiency in Kubernetes, Docker, and Linux environments
- Strong knowledge of Cosmos DB and MS SQL Server
- Familiarity with monitoring tools such as Prometheus, Grafana, Alertmanager, and OpsGenie
- Experience with .NET Core and ASP.NET Core
- Expertise in using Terraform for infrastructure as code
- Experience deploying and managing CI/CD tools
- Solid understanding of Azure Networking
- Strong self-motivation and ability to self-manage tasks
- Fluent English communication skills at a B2+ level
Nice to have
- Experience with Azure IoT Hub and EventHub
We offer
- Engineering Heritage: Best-in-class experts sharing a culture of engineering excellence and tackling complex engineering challenges for over 30 years.
- Advanced Tech Stack: Innovative projects where you can apply or enhance your expertise in Cloud, Data, AI, and other emerging technologies.
- World-Class Clients: Work closely with 295+ of the Forbes Global 2000 on creating disruptive solutions that make a global impact.
- Professional Growth: Exceptional support for career development with comprehensive resources for upskilling or reskilling in pioneering practices.
- GenAI Community: Strong AI competencies with 600+ experts across 55+ locations driving GenAI-enabled transformation journeys.
- Entrepreneurial Culture: If you're passionate and dedicated to improving business transformation, we provide the support you need to bring your ideas to life.
- Hybrid Setup: The flexibility to work from any location in Latvia, whether it's your home or our office in Riga.
- Other Benefits: Additional vacation and trust days, private health insurance, Employee Stock Purchase Plan and more.
EPAM is a leading global provider of digital platform engineering and development services. For over 30 years, our team has helped leading brands navigate the waves of digital transformation, building solutions that help them stay competitive through constant market disruption.
With offices in 55+ countries, EPAM has grown in Latvia to more than 150 talented innovators in 3 years. We foster creativity and unconventional ways of doing things, welcoming like-minded professionals to join us.
Salary range €4K-€6K gross, based on your experience and interview results.
- Posted: Dec 27, 2024
- Type: Full-time
- Level: Mid-Senior
- Location: Latvia
- Company: EPAM Systems
- Industries: Software Development, IT Services, IT Consulting
- Categories: Engineering, Information Technology, Business Development