Ampstek

Site Reliability Engineer

United Arab Emirates · Contract · Mid-Senior

Site Reliability Engineer (SRE)

From designing fault-tolerant architectures to leading incident responses, you’ll have the freedom to shape how we deliver stable, secure, and high-performance banking services.

About the Role

We’re looking for a talented Site Reliability Engineer (SRE) to keep our systems running smoothly, reliably, and at scale. Through smart automation, deep observability, and a calm head in a crisis, you’ll help us balance speed, compliance, and stability, working alongside DevOps, Cloud, Quality Engineering, and Product teams to drive continuous improvements in performance, security, and resilience.

You’ll play a key role in enhancing reliability, accelerating delivery, and ensuring seamless digital experiences for ADCB customers.

This role reports directly to our Lead SRE / Tribe Executive Manager.

What You Will Be Doing


• Define and implement SLIs/SLOs and error budgets for business-critical digital banking services.
• Build actionable observability (metrics, logs, traces, dashboards, and alerts) using Dynatrace, Prometheus, Grafana, and ELK, while reducing alert fatigue.
• Leverage AI-driven insights and anomaly detection (Dynatrace Davis AI or equivalent AIOps platform) to proactively predict and resolve reliability issues before impact.
• Lead incident management, from on-call triage and root-cause analysis to blameless postmortems with actionable follow-ups.
• Improve deployment safety with robust rollout/rollback strategies, canary and blue-green deployments, and production readiness reviews.
• Support and optimize microservices-based architectures, ensuring service reliability, scalability, and inter-service resilience.
• Conduct capacity planning, performance tuning, and resilience testing, optimizing for both reliability and cost efficiency.
• Automate operational toil, from runbooks and remediation scripts to proactive health checks and self-healing workflows.
• Collaborate with DevOps to embed reliability gates and validations into CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI/CD, or Azure DevOps).
• Own and evolve the observability and AIOps stack, driving intelligent automation and predictive alerting capabilities.
• Maintain high-quality documentation, playbooks, and operational standards across environments.
• Ensure operational compliance and security alignment with internal controls and regulatory standards.
• Analyze system performance, availability, and cost data to continually optimize operations.
• Provide reliability support and escalation guidance for critical production systems during major incidents.

Experience and Qualifications


• 5+ years of experience in SRE or DevOps roles, building and managing large-scale, high-availability systems across banking, fintech, e-commerce, or other data-intensive digital ecosystems.
• Bachelor’s degree in Computer Science or equivalent technical experience.
• Strong experience with Linux environments and performance troubleshooting.
• Proven expertise in Terraform and Infrastructure as Code (IaC) methodologies.
• Proficiency with Kubernetes and container orchestration in microservices environments.
• Hands-on experience with AWS (preferred); exposure to Azure or GCP is an advantage.
• Deep knowledge of Dynatrace (AIOps, Davis AI), Prometheus, Grafana, and the ELK stack.
• Experience implementing AI/ML-driven reliability or automation solutions (AIOps, anomaly detection, predictive alerting).
• Practical understanding of CI/CD pipelines (GitHub Actions, Jenkins, GitLab CI/CD, or Azure DevOps).
• Experience with Kafka, RabbitMQ, Redis, Aurora, and RDS databases.
• Strong scripting or programming skills in Python, Bash, or Go.

The Ideal Candidate

• Organized, structured, and meticulous in approach.
• Experienced in cross-functional collaboration and working with distributed teams.
• Strong analytical mindset with excellent troubleshooting skills for complex production systems.
• Calm and composed communicator under pressure, capable of leading during high-impact incidents.
• Proactive problem-solver who anticipates issues and drives preventive improvements.
• Passionate about AI-driven automation, observability, and reliability engineering.
• Continuously learning, keeping up to date with cloud-native, microservices, and SRE best practices.
• A collaborative and adaptable team player who thrives in a fast-paced, regulated environment and is passionate about building reliable, scalable systems that empower digital banking innovation.

Key Skills

Ranked by relevance

AI, DevOps, microservices, Prometheus, GitLab CI, Jenkins, Grafana, GitLab, cloud, infrastructure as code, Kubernetes, Terraform, RabbitMQ, Python, Redis, Kafka, Linux, Bash, AWS, GCP, ELK
Posted: Nov 06, 2025
Type: Contract
Level: Mid-Senior
Location: Abu Dhabi Emirate
Company: Ampstek

Industries: IT Services, IT Consulting

Categories: Information Technology
