Confidential
Site Reliability Engineer (SRE) – DevSecOps
Confidential | United Arab Emirates | 23 hours ago
Full-time | Remote Friendly | Engineering

Role Summary:

We are looking for a Site Reliability Engineer (SRE) to maintain the availability, scalability, and performance of critical services deployed across cloud and on-premises environments. This role combines software engineering and systems engineering to automate operations and improve reliability in CI/CD and production environments.


Key Responsibilities:

  • Maintain uptime and performance of applications deployed across hybrid infrastructure
  • Implement observability (logging, metrics, tracing) using Prometheus, Grafana, ELK, and Azure Monitor
  • Troubleshoot production issues, participate in incident response, and perform root cause analysis
  • Automate infrastructure, monitoring, and runbooks using IaC tools and scripting
  • Implement and track SLOs, SLIs, and error budgets
  • Build self-healing systems and resilient deployments
  • Collaborate with developers, security teams, and cloud engineers to enforce reliability practices


Required Skills:

  • Experience with Azure/AWS/GCP monitoring tools and on-prem observability stacks
  • Strong Linux/Unix administration and scripting (Python, Bash) skills
  • Hands-on experience with CI/CD pipelines, Kubernetes, and Helm
  • Good understanding of load balancing, failover, and high-availability (HA) architecture
  • Familiarity with incident management, postmortem writing, and runbook creation


Preferred Qualifications:

  • Experience with Terraform, Ansible, or Pulumi
  • Knowledge of service mesh (Istio, Linkerd) and API gateway configurations
  • Certifications: SRE Foundation, Azure/AWS Cloud Practitioner, or Certified Kubernetes Administrator (CKA)
  • Awareness of compliance standards (CIS, NIST, ISO 27001)
