ai71
Senior DevOps Engineer
ai71 | United Arab Emirates | 1 day ago
Full-time | Engineering, Information Technology
About AI71:

AI71 is an industry leader in artificial intelligence, delivering innovative solutions that empower developers, businesses, and governments to solve complex challenges. AI71 builds secure, enterprise-ready applications powered by cutting-edge technology and tailored to knowledge workers and sector-specific needs, bridging the gap between advanced AI and real-world impact. Guided by a strong commitment to research and responsibility, we create transformative solutions that drive progress and empower communities.

The Role:

We are seeking a highly motivated and skilled Senior DevOps Engineer to join our team. You'll play a crucial role in building, deploying, and maintaining scalable and reliable systems and infrastructure, working closely with development teams to ensure operational efficiency and smooth deployment pipelines.

What You'll Do:

  • Design, implement, and maintain CI/CD pipelines to streamline development workflows.
  • Build and manage scalable infrastructure for AI model deployment and lifecycle management.
  • Automate infrastructure provisioning and management using tools like Terraform, Ansible, or CloudFormation.
  • Optimize cloud-based and on-premises resources for scalability and cost efficiency.
  • Manage and fine-tune queuing systems and real-time streaming architectures.
  • Monitor and troubleshoot production systems to ensure uptime and performance.
  • Implement logging, monitoring, and alerting solutions using tools such as Prometheus, Grafana, and the ELK stack.
  • Set up comprehensive monitoring for both system metrics and ML model performance.
  • Conduct root cause analyses and post-mortems to improve system reliability.
  • Collaborate with development and QA teams to deploy new features into production seamlessly.
  • Promote best practices in system architecture, security, and performance.
  • Participate in a rotating on-call schedule for production system support.
  • Ensure infrastructure meets security and compliance standards (e.g., SOC 2, ISO/IEC 27001).
  • Securely manage secrets and credentials using tools like Vault or AWS Secrets Manager.

What You'll Bring:

  • Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proficiency in at least one scripting or programming language, such as Python, Bash, or Go.
  • Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud.
  • Skilled in containerization and orchestration with Docker and Kubernetes.
  • Experience using CI/CD tools such as Azure DevOps, Jenkins, GitLab CI/CD, or CircleCI.
  • Knowledge of monitoring, observability, and alerting tools like Prometheus, Datadog, New Relic, Grafana, or PagerDuty.
  • Understanding of networking fundamentals including DNS, load balancing, and firewalls.
  • Familiarity with real-time streaming architectures for AI and data applications.

Great Pluses / Preferred Experience:

  • Experience with Infrastructure as Code (IaC) tools like Terraform or Pulumi.
  • Understanding of service mesh technologies like Istio or Linkerd.
  • Familiarity with database scaling and administration, including VectorDBs, SQL, and NoSQL systems.
  • Previous experience in a high-traffic production environment.

Why AI71:

  • Mission-Driven Work: Work on cutting-edge AI applications with a talented and passionate team, solving real-world challenges in critical sectors.
  • Unparalleled Opportunity: Innovate with AI at a company with unique access to world-leading models and resources.
  • Career Growth: We offer competitive compensation, benefits, and significant career growth opportunities for foundational members of the team.
  • World-Class Environment: Enjoy a flexible working environment and the latest tools & technologies needed to do your best work.
