Site Reliability Engineer (SRE)
Location: Dubai
Duration: Permanent
We’re currently partnered with a leading technology consultancy that is scaling its tech team. It offers a diverse work environment and provides services in the UAE that impact millions of lives. We’re helping them search for a Site Reliability Engineer to join their ever-growing team.
Responsibilities:
- Architect, implement, and oversee scalable, high-performance AI and data infrastructure across cloud (AWS) and on-prem environments.
- Utilise automation tools (e.g., Terraform, Ansible) for provisioning, monitoring, and infrastructure optimisation.
- Design robust monitoring, alerting, and logging solutions to detect and mitigate potential failures before they impact operations.
- Develop and maintain seamless CI/CD pipelines to accelerate the deployment of AI models and data-driven applications.
- Optimise workflows to enhance efficiency, reduce deployment friction, and maintain system stability.
- Partner with AI researchers, data engineers, and developers to align infrastructure with project needs.
- Act as a bridge between AI, data, and infrastructure teams, ensuring smooth communication and technical alignment.
- Rapidly diagnose and resolve system incidents, conducting thorough root-cause analyses to prevent future issues.
- Establish and refine disaster recovery frameworks to safeguard AI and data assets.
- Implement stringent security protocols to protect AI and data infrastructure, ensuring compliance with industry regulations.
- Perform regular security evaluations, proactively addressing vulnerabilities.
- Identify opportunities to improve system scalability, efficiency, and resilience.
- Stay ahead of emerging trends in AI infrastructure, site reliability engineering, and cloud technologies.
Qualifications & skills:
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field.
- 3-5 years’ experience in a similar role.
- Experience with on-premises and cloud platforms (AWS, GCP, Azure), containerisation (Docker), and container orchestration (Kubernetes).
- Experience with AI and data-specific infrastructure (e.g., GPU clusters, data lakes).
- Understanding of machine learning frameworks and data processing tools (e.g., TensorFlow, PyTorch, Apache Spark).
Posted: Apr 08, 2025
Type: Full-time
Level: Mid-Senior
Location: Dubai
Company: Discovered MENA