MBZUAI (Mohamed bin Zayed University of Artificial Intelligence)
Head of Engineering
United Arab Emirates
Posted 12 hours ago
Full-time, Remote Friendly
Information Technology, Engineering +1

The Head of Engineering at MBZUAI’s Institute of Foundation Models (IFM) leads the design, deployment, and operation of large-scale AI systems that meet world-class standards in scalability, safety, and performance. This role bridges cutting-edge research with enterprise-grade infrastructure, driving excellence across product development, cloud and on-premises operations, and AI model integration.


The ideal candidate combines deep technical expertise with strategic vision and a strong understanding of the AI ecosystem including large language models (LLMs), multimodal systems, infrastructure orchestration, and model safety. Through innovation and leadership, the role reinforces IFM’s position as a global pioneer in sovereign, responsible, and resilient AI infrastructure for the UAE and its international partners.


The Head of Engineering will ensure that MBZUAI’s Institute of Foundation Models operates at world-class standards in scalability, safety, and performance. By uniting infrastructure excellence, AI safety, and strategic alignment, this role secures IFM’s position as a global leader in sovereign, responsible, and robust AI infrastructure for the UAE and beyond.


Key Responsibilities


Product Engineering & System Architecture

  • Lead the end-to-end development and operation of IFM’s AI product platforms.
  • Design and maintain scalable, secure, and high-availability architectures across hybrid cloud and on-premises environments.
  • Implement modern DevOps pipelines (Kubernetes, Docker, CI/CD, GitHub Actions) for reliable model deployment and service uptime.
  • Oversee API design, data flow, and backend system integration to support model training, inference, and analytics.

AI Model Integration & Infrastructure

  • Collaborate with AI research teams to productionize and scale foundation models (LLMs, multimodal, reasoning, and domain-specialized models).
  • Manage large-scale GPU clusters and distributed inference environments using tools such as vLLM, Ray, DeepSpeed, and Triton Inference Server.
  • Oversee model deployment pipelines, ensuring efficient scaling and resource optimization across AWS, on-premises clusters, and co-location data centers.
  • Support ongoing experimentation, model fine-tuning, and benchmark validation in collaboration with global research teams.

Product Testing & Security Assurance

  • Establish and lead model red teaming, adversarial testing, and guardrail evaluation workflows to ensure AI model robustness.
  • Design and oversee safety assurance pipelines, stress-testing models against prompt injection, jailbreaking, data leakage, and bias exploits.
  • Collaborate with AI safety researchers to integrate content filters, alignment layers, and policy enforcement systems into production stacks.
  • Implement automated regression tests, monitoring dashboards, and security incident protocols to maintain responsible AI standards.
  • Ensure compliance with MBZUAI’s AI ethics and governance frameworks, aligning with international safety guidelines.

Cloud and Data Center Management

  • Manage hybrid compute infrastructure combining AWS cloud, on-prem GPU clusters, and co-location data centers.
  • Optimize resource allocation across compute, storage, and network layers.
  • Oversee security, IAM policies, and network zoning, maintaining full data sovereignty and operational resilience.
  • Ensure system observability through centralized logging, metrics, and alerting frameworks (e.g., CloudWatch, Grafana, Prometheus).

Data Engineering & Pipeline Operations

  • Supervise the design and optimization of data ingestion and retrieval pipelines supporting AI training and evaluation.
  • Ensure data versioning, integrity, and accessibility across structured and unstructured data formats.
  • Support compliance with internal and external data governance policies.

Team Leadership & Strategic Reporting

  • Lead, mentor, and scale a multidisciplinary engineering team across product, infrastructure, and operations domains.
  • Translate leadership priorities into actionable technical roadmaps and deliverables.
  • Coordinate execution of strategic projects and institutional initiatives, ensuring alignment with IFM’s mission and MBZUAI’s strategic objectives.
  • Provide regular progress and performance reports to MBZUAI leadership and key stakeholders, highlighting milestones, risks, and innovation outcomes.
  • Foster a culture of collaboration, accountability, and continuous improvement across distributed teams (Abu Dhabi, Paris, US, etc.).


Qualifications & Technical Experience Required

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related discipline (PhD preferred).
  • 8-10 years of experience in software engineering, infrastructure management, or AI systems, with at least 3 years in a leadership capacity.
  • Proven experience with Kubernetes, Docker, CI/CD, and systems observability tools.
  • AWS Cloud Management: EKS, EC2, S3, RDS, IAM, CloudWatch, Route 53.
  • On-prem & Co-location Data Center Operations: GPU cluster orchestration, network configuration, and storage optimization.
  • AI Ecosystem & LLMs: hands-on familiarity with open-source and commercial models (Llama, Qwen, Jais, Mistral, OpenAI, Anthropic, etc.).
  • MLOps & Model Serving: vLLM, Triton, MLflow, Ray, or Hugging Face Transformers.
  • AI Security & Red Teaming: prompt defense, guardrail systems, toxicity and bias detection, model evaluation frameworks.
  • Strong foundation in Python, Bash, and system-level automation.
  • Demonstrated success in managing complex technical programs and delivering strategic outcomes.
  • Excellent communication, stakeholder management, and leadership skills.
