Technology Innovation Institute

Senior Data Engineer

United Arab Emirates · Full-time · Mid-Senior

Technology Innovation Institute (TII) is a publicly funded research institute, based in Abu Dhabi, United Arab Emirates. It is home to a diverse community of leading scientists, engineers, mathematicians, and researchers from across the globe, transforming problems and roadblocks into pioneering research and technology prototypes that help move society ahead.


Artificial Intelligence Cross-Center Unit

The Artificial Intelligence Cross-Center Unit is the machine learning powerhouse of TII. It works in close collaboration with our other research centers to harness the full benefits of AI across our projects and to drive innovation from new computing paradigms, designing and delivering new AI methodologies, technologies, solutions, and systems that address challenging issues across multiple sectors of the economy, from technology to healthcare, cybersecurity, and government, among others.


We incorporate core elements of intelligence (perception, sensing, planning, and language) in the ideation, design, and prototyping of next-generation systems with human-like intelligence. We build advanced AI computing and scalable AI-based software stacks and hardware systems to deliver significant enhancements in systems infrastructure.


Our AI researchers, scientists, and engineers collaborate to ensure innovative outcomes, from AI theory to AI technologies towards better intelligence.


Senior Data Engineer


Qualifications

To qualify for this position, you will need to meet the following requirements:

  • An MSc degree in Software Engineering, Analytics, Data Science, Machine Learning, or a related field, with 5+ years of experience in a related role.
  • Proven experience as a Data Engineer with a focus on large-scale data processing and analytics, with a track record of successfully delivering multiple projects.
  • Strong proficiency in one or more common languages (e.g., Python, C++, Scala, Java) with a focus on object-oriented programming.
  • Contributions to open-source projects are a plus.
  • Experienced with version control systems (e.g., Git) to manage code and data workflows, ensuring collaborative development and tracking of changes.
  • Comfortable with Linux administration.
  • Experience with data pipeline frameworks and ETL tools (e.g., Apache Spark, SQL databases, object storage).
  • Proficient in data cleaning, filtering, and deduplication techniques.
  • Experienced with cloud computing platforms such as AWS, GCP, and Azure.
  • Familiarity with processing text, images, audio, and video is a plus.
  • Skilled in web scraping, web crawling, and optimizing large-scale data processing.
  • Familiarity with large computing infrastructures and high-performance computing (HPC), including tools like SLURM, SageMaker, Google Cloud Batch, and Azure Batch.
  • Experience in AI and NLP, and knowledge of common ML tools and techniques such as RAG, LLMs, prompt engineering, and machine learning models.



Responsibilities

  • Design, develop, and maintain scalable data pipelines to support LLM training and inference.
  • Implement ETL processes to ingest, transform, and store data from various sources, including text corpora and user-generated content.
  • Develop well-structured, maintainable code and create comprehensive unit tests to ensure functionality and reliability.
  • Monitor and troubleshoot data pipeline performance and implement improvements as needed.
  • Collaborate with data scientists and ML engineers to understand data requirements and ensure data availability and quality.
  • Manage the data infrastructure, ensuring its reliability and scalability.
  • Lead by example, guiding a team of data professionals to ensure the timely and effective delivery of data solutions while promoting collaboration and innovation.
  • Design and implement CI/CD pipelines to automate the testing and deployment processes, ensuring efficient and reliable delivery.
  • Stay updated on industry trends and best practices related to data engineering and LLMs.


Soft skills

  • Strong problem-solving skills and attention to detail.
  • Eager to learn and adapt to new technologies and methodologies as needed.
  • Excellent communication skills to effectively share insights with technical and non-technical stakeholders.
  • Ability to work collaboratively in a multicultural environment.

Posted: Dec 11, 2024
Type: Full-time
Level: Mid-Senior
Location: Abu Dhabi Emirate

Industries

Research Services

Categories

Information Technology
