Site Reliability Engineer

Thrive

United Arab Emirates · Full-time · Entry

About This Role

As a Site Reliability Engineer within the SRE team, you’ll focus on monitoring and supporting the AWS environments behind the platforms and tools our customers use.

The SRE team specialises in giving delivery squads visibility into how their services perform in production, and in supporting them as they investigate and contain potential problems.

You’ll have freedom to help research and recommend solutions for hosting applications at scale.

You’ll play a fundamental part in incident response, troubleshooting and containing issues.

You’ll collaborate with a highly experienced technical team to drive forward best practice as we implement and enhance our tools and services using cutting-edge technology.

Key Responsibilities

Configuration and ongoing management of environments and services on AWS.
Enhancing tools and processes for monitoring scalable applications on AWS.
Maintaining high availability through proactive measures.
Troubleshooting and resolving complex technical issues.
Documentation of Standard Operating Procedures.
Automation of SOPs and Run Books.
Raising, investigating and resolving problems and known errors.
Responding to issues outside working hours as per the on-call rota.

Basic Qualifications

Experience implementing environments for web-based microservices.
Experience of supporting MongoDB-based web applications.
Experience of engineering, architecting, or supporting AWS solutions.
Familiarity with cloud virtualisation tools such as ECS and/or Docker containers.
Experience working with automated deployment systems (e.g. CloudFormation, CodeBuild).
Familiarity with at least one monitoring tool, e.g. NewRelic, DataDog, Prometheus or Grafana.
Experience in automation of workloads using a scripting language like Python or JavaScript.
Strong problem-solving skills and the ability to troubleshoot complex issues.
Good understanding of incident response best practices, post-incident reviews, and continuous improvement.
Ability and willingness to proactively improve ways of working and processes.
Desire to continually grow, develop and improve.
Experience debugging NodeJS applications.

Useful Skills

Understanding of REST, GraphQL and asynchronous messaging.
Experience of using Git for version control.
Experience of Continuous Integration and Deployment is advantageous.
Familiarity with core SRE principles such as monitoring, alerting, error budgets and fault analysis.
Excellent written and verbal communication skills.
Familiarity with IT compliance and risk management requirements (e.g. security, privacy, GDPR).

Key Skills

Ranked by relevance

Incident response, AWS, continuous integration, high availability, CloudFormation, Prometheus, GraphQL, Grafana, Datadog, Python, Docker, cloud, GDPR, Git, ECS

Related Jobs

3 roles aligned with this opportunity

CloudOps Engineer

2026-06-17

Full-time

Not Applicable

India

Software Development

Engineering

Desenvolvedor (a) Back end - Python | AWS

2026-06-19

Full-time

Not Applicable

Brazil

Software Development

Information Technology

Python Software Engineer

2026-06-15

Full-time

Not Applicable

United Kingdom

Software Development

Engineering

🇦🇪

Country Guide

United Arab Emirates

Tax-friendly regional tech hub

Posted: Apr 05, 2025
Type: Full-time
Level: Entry
Location: Dubai
Company: Thrive

Industries

Software Development
