Atos

Freelance Senior Site Reliability Engineer

Belgium · Contract · Mid-Senior

As part of the redesign of its execution model around the New platform, our client has established a Site Reliability Coordination (SRC) team within LEV1 to enhance reliability, responsiveness, and incident coordination in a multi-provider context (LOT1, LOT2, LOT3).

The Senior SysOps Engineer plays a key role in the technical analysis of incidents, event correlation, and performance optimization using Site Reliability Engineering (SRE) and ITIL practices. They serve as a technical facilitator between support teams (LEV1, LEV2, LEV3), providers, and technical governance.

Main Responsibilities

Supervision and Correlation of Technical Incidents

  • Conduct in-depth analysis of logs, metrics, and alerts from various components (middleware, infrastructure, applications).
  • Ensure proactive monitoring of service performance and availability (centralized monitoring).
  • Facilitate root cause identification by collaborating with LEV2/LEV3 teams from providers (LOT1, LOT2, LOT3).
  • Correlate incidents across different layers of the system (e.g., an application issue affecting infrastructure).
  • Alert and escalate to the appropriate teams when necessary.

Multi-Provider Technical Coordination

  • Participate in investigation meetings with technical experts from providers.
  • Ensure each party adheres to SLAs and contractual commitments.
  • Coordinate technical escalations and ensure clear tracking of actions taken.
  • Centralize and document technical exchanges in a structured way (runbooks, incident reports).

Continuous Improvement and Performance Optimization

  • Contribute to technical postmortems, analyzing causes and suggesting improvements.
  • Recommend monitoring and observability improvements to providers.
  • Track key performance indicators (SLI, SLO, SLA, MTTD, MTTR) and anticipate risks.
  • Maintain a technology watch on SRE/DevOps tools and practices to enhance diagnostic capabilities.

Documentation and Knowledge Sharing

  • Maintain and enrich incident management and escalation runbooks.
  • Write technical guides for LEV1 to improve initial diagnosis.
  • Participate in training sessions to enhance the skills of the LEV1 teams.
  • Help develop the skills of the junior SysOps engineer on the team.

Participation in Committees and ITSM Governance

  • Attend operational follow-up committees (CAB, Incident Review, Performance Review) as a technical expert.
  • Share recommendations on critical incident management and change management.
  • Propose adjustments to ITIL and SRE processes to improve coordination effectiveness.


Technical Skills


  • Systems: Strong knowledge of Linux environments.
  • 5 to 10 years of experience in a similar role (SysOps, SRE, Operations Engineer, Incident Manager, Observability Engineer).
  • Virtualization & Containers: Proven experience with IaaS technologies, Kubernetes, Docker, OpenShift.
  • Middleware & Messaging: Knowledge of solutions such as Kafka, JBoss, Spring Boot, HAProxy, etc.
  • Observability and Monitoring: Proficiency with tools like Prometheus, Grafana, Loki.
  • Databases: Experience with diagnostics on Oracle and PostgreSQL.
  • Automation & Scripting: Strong practice in Bash, Python, Ansible, Terraform to analyze and optimize operations.
  • SRE Methodology: Good understanding of SLI and SLO concepts, postmortems, and advanced monitoring.
  • ITIL v4: Good understanding of Incident, Problem, and Change processes.

Organizational and Interpersonal Skills

  • Analytical and synthesis skills to correlate technical incidents and anticipate risks.
  • Collaborative mindset to facilitate communication between technical teams and providers.
  • Strong written and oral communication skills, especially for documenting and simplifying incidents.
  • Autonomy and proactivity in incident management and continuous improvement.
  • Resilience under stress, with the ability to handle critical incidents and prioritize effectively.

Experience and Education

  • ITIL v3/v4 certification is appreciated.
  • Kubernetes (CKA, CKAD), AWS/GCP/Azure, or Red Hat certification is a plus.
  • Experience in critical environments (high availability, high volume, SLA constraints).
  • Language: The assignment takes place in a bilingual French/Dutch environment. Fluency in one of the local languages is required for this role.

Key Skills

Ranked by relevance

itil, sla, high availability, virtualization, kubernetes, prometheus, terraform, ansible, grafana, python, docker, oracle, kafka, linux, bash
Posted: Mar 05, 2025
Type: Contract
Level: Mid-Senior
Location: Brussels Metropolitan Area
Company: Atos

Industries

IT Services, IT Consulting

Categories

Information Technology
