noon

DevOps / Site Reliability Engineer

United Arab Emirates · Full-time · Mid-Senior

Job title: DevOps / Site Reliability Engineer

Location: Dubai, UAE

Reporting to: Head of Dev Sec Ops


About noon

We’re building an ecosystem of digital products and services that power everyday life across the Middle East: fast, scalable, and deeply customer-centric. Our mission is to deliver to every door every day. We want to redefine what technology can do in this region, and we’re looking for a DevOps / Site Reliability Engineer who can help us move even faster.


noon’s mission: Every door, every day.


What you'll do:

Team noon has some of the fastest, smartest, and hardest-working people we've encountered. With a young, aggressive, and talented team, we're driving major missions forward. As a DevOps / Site Reliability Engineer at noonpayments, you’ll be the backbone of infrastructure stability and performance. You will drive automation, reliability, and observability across mission-critical services, primarily on Azure Virtual Machine Scale Sets (VMSS) and GCP Managed Instance Groups (MIG), with a strong emphasis on Terraform, Azure DevOps, Shell scripting, and Datadog.


Your toolkit will be code—not manual clicks. Your playground: production. Your mission: eliminate toil and chase the 9s. You will:


Cloud & Linux Infrastructure

  • Administer and tune Linux-based VM workloads (Ubuntu/RHEL) across Azure VMSS and GCP MIG.
  • Harden, scale, and monitor VMs for critical payment flows and backend services.


Infrastructure as Code (IaC)

  • Define and manage infrastructure using Terraform with modular, reusable patterns.
  • Own the infrastructure lifecycle from provisioning to teardown with GitOps principles.


CI/CD Automation

  • Build and manage Azure DevOps Pipelines for automated provisioning, deployment, and config drift checks.
  • Write and maintain Shell scripts for system bootstrapping, diagnostics, log scraping, and ad-hoc ops automation.


Monitoring & Observability

  • Build and maintain Datadog monitors, dashboards, and traces.
  • Define SLOs/SLIs and drive proactive alerting to detect issues before impact.


Middleware & DB Ops

  • Operate and maintain RabbitMQ clusters for high-throughput messaging.
  • Tune and monitor MongoDB instances for latency, failover, and capacity.


Incident Response

  • Participate in the 24/7 on-call rotation, owning reliability, fast mitigation, and root cause analysis (RCA).
  • Run post-mortems, reduce MTTR, and automate fixes.


Performance & Capacity

  • Analyze usage patterns and forecast capacity requirements.
  • Identify and fix system bottlenecks, memory leaks, I/O contention, or misconfigurations.


Cross-functional Collaboration

  • Partner with product, platform, and security teams to roll out resilient architectures.
  • Conduct infrastructure reviews, audits, and chaos testing.


Documentation & Runbooks

  • Maintain detailed runbooks, IaC diagrams, and incident playbooks.


What you'll need:

  • 6+ years of experience in DevOps / SRE roles with production ownership.
  • Advanced Linux administration and troubleshooting skills.
  • Mastery of Terraform, with a deep understanding of state, modules, and secrets management.
  • Proven delivery of CI/CD pipelines using Azure DevOps, with a YAML-first mindset.
  • Shell scripting ninja who can write, debug, and optimize scripts in Bash/Zsh/sh.
  • In-depth monitoring and tracing skills using Datadog, including custom metrics and integrations.
  • Experience running and tuning RabbitMQ and MongoDB at scale.
  • Familiarity with Azure VMSS, GCP MIG, and VM auto-healing strategies.
  • Comfortable with 24/7 on-call, SLOs, SLIs, and an incident-driven culture.
  • Bonus: Experience in payment systems or financial-grade uptime environments.


Who will excel?

  • We’re looking for people with high standards, who understand that hard work matters.
  • You need to be relentlessly resourceful and operate with a deep bias for action.
  • We need people with the courage to be fiercely original.
  • noon is not for everyone; readiness to adapt, pivot, and learn is essential.

Key Skills

Ranked by relevance

devops, terraform, linux, gcp, rabbitmq, datadog, vm, shell scripting, cicd
Posted: May 29, 2025

Type: Full-time

Level: Mid-Senior

Location: Dubai

Company: noon

Industries: Internet Marketplace Platforms

Categories: Engineering Management
