Agile Lab
Site Reliability Engineer II (Remote)
Agile Lab | Italy | 11 hours ago
Full-time | Remote Friendly | Engineering, Information Technology
Agile Lab is a company founded in 2014 with the mission to create value for its customers in data-intensive environments through customisable solutions that establish performance-driven processes, sustainable architectures and automated platforms based on data governance best practices.

Having delivered over 100 successful Elite Data Engineering initiatives, we have used this experience to create Witboost: a modular, technology-agnostic platform that enables modern organisations to discover, value and produce their data in both traditional environments and fully compliant Data Mesh architectures.

With a highly skilled team of over 260 data engineers based in Europe, Agile Lab helps organisations with their data-driven transformation.

Take a look at our handbook to discover our core values and processes.

💼 The opportunity:

We are looking for a Site Reliability Engineer II (SRE II) to join our growing team. You will play a key role in maintaining the reliability, observability, and operational efficiency of enterprise-level distributed systems.

In this role, you’ll coordinate a small technical team (3–4 people) in managing microservices in complex production environments. You will be involved in monitoring, incident management, release coordination, and performance tuning, with a strong focus on OpenShift platforms.

You’ll also work closely with multiple cross-functional teams to ensure high availability and performance of our cloud-native services.

This role includes on-call availability.

💰 RAL (gross annual salary):

38.5K-48.5K

💻 Responsibilities:

  • Ensure high reliability of microservices running in OpenShift environments
  • Lead and coordinate a technical team of 3–4 engineers for operational excellence
  • Manage incident resolution and ticketing workflows via ServiceNow
  • Collaborate with development teams to drive performance optimization and tuning
  • Design, configure and maintain monitoring dashboards (Grafana, Prometheus, etc.)
  • Coordinate with the Service Control Room to maintain effective alerting and response
  • Oversee the release of new features, hotfixes, and updates to production

🛠️ Requirements:

  • Degree in Computer Engineering, Computer Science, or a related field
  • At least 2 years of proven experience in Application Maintenance Services (AMS)
  • In-depth knowledge of OpenShift and microservices in cloud-native environments
  • Ability to technically and operationally lead a team of 3–4 people
  • Experience in release management, monitoring, and incident resolution
  • Excellent communication and cross-functional coordination skills
  • Strong initiative, operational autonomy, and results-oriented mindset
  • Fluency in Italian (mandatory requirement)
  • Monitoring & Observability: Grafana, Prometheus, Kibana, Jaeger, Datadog, OpenTelemetry
  • Cloud/DevOps: OpenShift, GitLab, Jenkins
  • Data & Messaging: Kafka, MongoDB, Ignite
  • Ticketing & ITSM: ServiceNow

🙌🏻 We offer:

  • Fully remote or hybrid work from our offices in Milan, Turin, Padua, Bologna, Catania and Rende;
  • Real work-life balance;
  • Monthly training budget (time and money);
  • Support from a buddy during your first week of work;
  • Benefits and corporate welfare programs: company prizes and welcome pack with all the equipment you need to work;
  • Agile Nomads Experience: the opportunity to work abroad for 2 weeks;
  • Referral bonus, if you bring people as talented as you;
  • The opportunity to attend one conference per year;
  • A company rated 4.8 out of 5 for employee satisfaction on Glassdoor and certified as a Great Place to Work;
  • Inclusive environment where you can be who you really are;
  • Stimulating environment oriented to growth, both professional and personal.

😊 How we work:

  • We don't like hierarchies: we work as a team;
  • We don't like bureaucracy: we prefer a sense of responsibility;
  • We like data, of course, and so anything that is measurable;
  • We want to make a positive change in our industry;
  • Empathy, humility, collaboration, and willingness to challenge ourselves are the basis of our work.

Please note:

Only candidates based in European time zones (CEST or similar) will be considered for this position.
