Site Reliability Engineer (SRE)

Dicetek LLC

United Arab Emirates · Contract · Entry

Job Summary

We are looking for a Site Reliability Engineer (SRE) to ensure the reliability, scalability, and performance of our production systems. The SRE will work closely with engineering, DevOps, and product teams to build highly available systems, automate operations, and improve system observability while maintaining service level objectives (SLOs).

Key Responsibilities

Reliability & Operations

Ensure high availability, reliability, and performance of production systems.
Define, monitor, and manage SLIs, SLOs, and SLAs.
Lead incident response, root cause analysis (RCA), and post-incident reviews.
Implement proactive monitoring and alerting to prevent outages.

Automation & Engineering

Automate repetitive operational tasks using scripting and infrastructure-as-code.
Improve system reliability through engineering solutions rather than manual intervention.
Reduce toil by building tools, automation, and self-healing systems.

Cloud & Infrastructure

Design and manage scalable infrastructure on cloud platforms (AWS / Azure / GCP).
Manage containerized workloads using Docker and Kubernetes.
Implement and maintain CI/CD pipelines for safe and frequent deployments.

Monitoring & Observability

Build and maintain observability solutions using tools such as Prometheus and Grafana, ELK / OpenSearch, and Datadog or New Relic.
Track system performance and error budgets, and carry out capacity planning.

Security & Compliance

Ensure reliability practices align with security standards.
Participate in on-call rotations and ensure secure system operations.
Collaborate with security teams to implement secure infrastructure practices.

Required Skills & Qualifications

Bachelor’s degree in Computer Science, Engineering, or a related field.
Strong experience in Linux/Unix system administration.
Proficiency in at least one scripting or programming language (Python, Go, Bash, or Java).
Experience with cloud platforms (AWS / Azure / GCP).
Hands-on experience with Kubernetes and container orchestration.
Knowledge of networking fundamentals (TCP/IP, DNS, load balancing).
Experience with monitoring, alerting, and incident management.

Preferred / Nice-to-Have Skills

Experience implementing SRE best practices based on Google's SRE principles.
Knowledge of Terraform, Ansible, or CloudFormation.
Experience with service mesh (Istio, Linkerd).
Understanding of chaos engineering tools (Gremlin, Chaos Mesh).
Experience in fintech, banking, or high-availability systems.

Key Skills

Ranked by relevance: cloud, AWS, incident response, high availability, Kubernetes, Terraform, Ansible, Docker, DevOps, Istio, Bash, CI/CD, DNS



Posted: Jan 15, 2026
Type: Contract
Level: Entry
Location: Dubai
Company: Dicetek LLC

Industries

IT Services, IT Consulting

