EPAM Systems

Senior Site Reliability Engineer - DevOps

Portugal · Full-time · Mid-Senior

We are seeking a Senior Site Reliability Engineer to support a global execution platform and deliver high-quality solutions to trading desks and clients.

You will work closely with top specialists, developing your skills in system management, monitoring, and low-latency technology. Apply now to be part of a team driving innovation in financial technology.

Please note that working from the customer's office in Lisbon is required 2-3 days per week.

Responsibilities

  • Develop and implement monitoring, alerting, and incident response strategies
  • Automate routine tasks and processes to improve efficiency
  • Collaborate with software engineering teams to design and deploy reliable, scalable systems
  • Deploy production changes with precision to maintain platform integrity
  • Manage incidents including detailed analysis and reporting to ensure high service levels
  • Participate in on-call rotations to support critical systems and services
  • Communicate effectively with team members to resolve issues promptly
  • Maintain documentation for operational procedures and system configurations
  • Continuously improve system reliability and performance through proactive measures

Requirements

  • Strong knowledge of Unix/Linux systems and networking, with 3+ years of experience
  • Proficiency in Unix/Linux shell scripting and programming languages such as Python, Perl, C, C++, or Java
  • Experience with monitoring and observability tools such as ITRS Geneos, Dynatrace, Prometheus, and Grafana
  • Ability to troubleshoot complex systems and resolve issues efficiently
  • Experience working in high-availability, high-traffic environments
  • Bachelor’s or Master’s degree in IT engineering or a related field
  • Ability to work effectively in a team and adapt to new environments
  • Self-motivated with strong problem-solving and issue follow-up skills
  • Excellent written and verbal communication skills, with English at B2 level or higher

Nice to have

  • Experience with log management tools such as Splunk, ELK, Graylog, or Loki
  • Knowledge of network monitoring tools such as Corvil
  • Familiarity with databases including Oracle, PostgreSQL, MySQL/MariaDB, or KDB/q
  • Experience with messaging systems such as IBM MQ, Tibco, Solace, LBM, or Kafka
  • Familiarity with Infrastructure as Code tools such as Ansible or Terraform

We offer

  • International projects with top brands
  • Work with global teams of highly skilled, diverse peers
  • Healthcare benefits
  • Employee financial programs
  • Paid time off and sick leave
  • Upskilling, reskilling and certification courses
  • Unlimited access to the LinkedIn Learning library and 22,000+ courses
  • Global career opportunities
  • Volunteer and community involvement opportunities
  • EPAM Employee Groups
  • Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

Key Skills

Ranked by relevance

C, Infrastructure as Code, incident response, shell scripting, PostgreSQL, Prometheus, Ansible, Python, Oracle, Splunk, Perl, ELK

Posted: Dec 27, 2025
Type: Full-time
Level: Mid-Senior
Location: Lisbon

Industries

Software Development, IT Services, IT Consulting, Banking

Categories

Engineering, Information Technology, Business Development
