The Lead Data Ops Engineer will play a critical role in building, scaling, automating, and maintaining the company's Big Data infrastructure and Machine Learning operations (MLOps). This individual will work in close collaboration with our Data Architect, Data Science, Data Engineering, and IT teams to ensure the development, deployment, and scaling of robust, high-performance data processing systems and ML models.
KEY ACCOUNTABILITIES:
• Big Data Infrastructure: Design, build, and maintain high-performance, cloud-based, fault-tolerant, scalable distributed data infrastructure that supports the company’s data-intensive applications (real-time, batch, and LLM workloads). This includes developing strategies for data storage (terabyte scale), processing, and analysis, and implementing high-performance, scalable data pipelines for ML models and data products, supporting 50-60 use cases a year and thousands of IoT devices.
• Create infrastructure as code, and configure and set up managed data services. Build and deploy a data science playground for research and prototyping, in support of the professional and citizen data science program being rolled out, which includes 15-20 citizen data scientists/ambassadors.
• Machine Learning Operations (MLOps): Develop and manage the ML operational process, working closely with the data science team to deploy ML models into production, including at the edge. This includes streamlining the ML lifecycle, from model development and testing through deployment, and implementing the monitoring and alerting strategy.
• Automation and Scalability: Implement automation tools and frameworks to manage system updates/changes. Ensure that all systems and infrastructures can scale effectively with the increase in IoT sensors and devices.
• Continuous Integration and Deployment (CI/CD): Oversee continuous integration and continuous deployment practices for the data and ML pipeline, ensuring that software can be reliably released at any time.
• System Monitoring and Reliability: Monitor systems to ensure high levels of performance, availability, and security. This includes identifying and fixing potential and existing system issues.
• Collaboration and Communication: Collaborate closely with the Data Architect, data engineers, and data scientists on the implementation and testing of new data services to provide an elastic data infrastructure.
• Security: Oversee and ensure that all Big Data and ML Ops platforms comply with the company's security standards and policies.
• Mentorship and Leadership: Act as a mentor to junior data team members, providing guidance and support in their professional development.
• Innovation and Continuous Improvement: Stay up-to-date with industry trends and new technologies. Continuously explore innovative solutions and enhancements to the existing data architecture to improve its scalability, reliability, and efficiency.
• Problem Solving: Anticipate and resolve technical issues before they become roadblocks, maintaining the continuity of data flow and ensuring the highest levels of data quality and integrity.
AUTHORITY/ DECISION MAKING:
▪ Infrastructure Design: Decide on the most effective design and implementation of the company's Big Data infrastructure.
▪ ML Ops Process: Make key decisions on the ML operational process, ensuring that ML models can be effectively integrated into production.
▪ Automation Tools: Choose the most appropriate automation tools and frameworks for the company's needs.
▪ CI/CD Practices: Determine the best practices for continuous integration and deployment in the context of the company's operations.
▪ System Monitoring: Make decisions on system monitoring strategies, including the selection of tools and responses to system performance metrics.
▪ Security Policies: Have a say in the implementation of security policies as they pertain to the Big Data and ML Ops platforms.
▪ Budget and Costing: Take ownership of managing costs for the data platform and relevant data services.
QUALIFICATIONS & SKILLS:
▪ Bachelor’s degree required; MS or PhD preferred.
▪ Bachelor’s in Data Science, Computer Science, Engineering, or Statistics, and 10+ years of relevant experience.
▪ Experience: A minimum of 5-7 years of experience in a DevOps role, with a focus on managing Big Data infrastructures and MLOps.
▪ Technical Skills:
▪ Strong experience with Big Data technologies such as Hadoop, Spark, Kafka, etc.
▪ Proven expertise in managing and deploying ML models into production.
▪ Proficient in using CI/CD tools like Jenkins, Travis CI, CircleCI, etc.
▪ Proficient in using infrastructure automation tools like Terraform, Chef, Puppet, Ansible, etc.
▪ Strong knowledge of cloud platforms, primarily Azure (AWS or GCP experience also relevant).
▪ Experience with containerization and orchestration technologies such as Docker and Kubernetes.
▪ Familiarity with various database technologies, both SQL and NoSQL.
▪ Proficiency in programming languages such as Python, Java, or Scala.
▪ Experience leveraging the Microsoft/Azure ecosystem to develop and maintain cloud platform operations.
- Posted: Apr 09, 2025
- Type: Full-time
- Level: Associate
- Location: Dubai
- Company: Emirates Global Aluminium (EGA)