Deeplight AI
DataOps Engineer - Lakehouse Operations
Deeplight AI | United Arab Emirates | 4 hours ago
Full-time | Other
DeepLight AI is a specialist AI and data consultancy with extensive experience implementing intelligent enterprise systems across multiple industries, with particular depth in financial services and banking. Our team combines deep expertise in data science, statistical modeling, AI/ML technologies, workflow automation, and systems integration with a practical understanding of complex business operations.

As a DataOps Engineer, you will be the guardian of our modern Lakehouse environment, ensuring that our complex data ecosystem remains reliable, transparent, and high-performing. Positioned at the vital intersection of Data Engineering and Operations, you will lead the charge in proactive monitoring and rapid incident response for mission-critical pipelines, including AWS Glue, dbt, and Kafka.

This is a high-visibility role where you won't just react to issues—you will own the end-to-end incident lifecycle, drive deep root-cause analysis (RCA), and collaborate with development teams to build permanent automated preventions. Whether you are managing real-time operational dashboards or leading daily reviews of platform health, your mission is to guarantee data freshness and platform integrity, turning operational stability into a competitive advantage for our data-driven decision-making.

Your responsibilities as a DataOps Engineer will include, but not be limited to:

  • Monitoring & Alerting
    • Monitor and act on incidents related to:
      • AWS Glue job executions
      • Current DMS and dbt pipelines
      • Kafka lag and streaming health
      • Data freshness SLA breaches
      • Data quality issues
      • Platform health alerts
    • Perform L1 triage or distribute incidents to L2/L3 teams as needed
  • Incident Management
    • Own the incident response process:
      • Initial triage and severity assessment
      • Coordinate with development teams for resolution
      • Create and assign JIRA tickets with full context
      • Track incident resolution and closure
    • Escalate high-priority or long-running incidents to management
  • Root Cause Analysis & Prevention
    • Conduct post-incident root cause analysis
    • Maintain incident logs and post-mortem documentation
    • Implement preventive measures for recurring issues in collaboration with dev teams
  • Operational Reporting
    • Manage the TV dashboard providing real-time status of critical flows (with color-coded indicators)
    • Deliver a daily 10-15 min operational review of previous day's executions, open incidents, and follow-ups
    • Share daily summary emails with the team
  • Continuous Awareness
    • Stay up to date on the status of all critical flows and remediation efforts
    • Ensure proactive communication on risks and delays
As an AI consultancy, our greatest asset is the expertise of our people.

While technical mastery is the foundation of what we do, the ability to bridge the gap between complex data science and actionable business value is what defines your success with Deeplight.

We're looking for individuals who are not only world-class in their fields of specialism, but also compelling communicators and persuasive advocates for their own skills.

You will be the face of our firm, tasked with building trust, articulating the "why" behind your technical decisions, and effectively "selling" your vision to high-level stakeholders.

If you thrive on the challenge of presenting cutting-edge solutions as much as you do on building them, you will fit right in.

Requirements

You will have experience in:

  • DataOps, DevOps, or data engineering roles (minimum of 5 years)
  • AWS Glue, DMS, dbt, and Kafka monitoring

You should also have knowledge of:

  • data freshness SLAs, data quality frameworks, and platform health monitoring
  • incident management tools (e.g., JIRA) and alerting systems
  • identifying ways to automate your work and repetitive tasks
  • troubleshooting and triage processes
  • managing multiple incidents and prioritizing effectively
  • root cause analysis and preventive action planning
  • communication and coordination
  • working under pressure and maintaining operational discipline

Benefits

Benefits & Growth Opportunities:

  • Competitive salary and performance bonuses
  • Comprehensive health insurance
  • Professional development and certification support
  • Opportunity to work on cutting-edge AI projects
  • Flexible working arrangements
  • Career advancement opportunities in a rapidly growing AI company

This role is based 4 days per week on client site in Abu Dhabi, with 1 day (Friday) working from home.

This position offers a unique opportunity to shape the future of AI implementation while working with a talented team of professionals at the forefront of technological innovation. The successful candidate will play a crucial role in driving our company's success in delivering transformative AI solutions to our clients.

At DeepLight AI, we recognise that diversity drives innovation. We are committed to fostering an inclusive environment where individuals with different thinking styles can thrive and contribute their unique strengths to our specialised AI and data solutions.

Our goal is to ensure our application and interview process is accessible, predictable, and fair for all candidates.

If you require any specific adjustments to the application process, or any reasonable adjustments should you progress to the interview stage, please let us know. This information will be kept strictly confidential and will not impact hiring decisions.

By applying to Deeplight, you also agree that we may share your profile, where necessary, with external clients.
