NTT

Develop a Data Lake Infrastructure, from zero to hero (6-month internship)

Luxembourg · Internship · Entry

Make an impact with NTT DATA

Join a company that is pushing the boundaries of what is possible. We are renowned for our technical excellence and leading innovations, and for making a difference to our clients and society. Our workplace embraces diversity and inclusion – it’s a place where you can grow, belong and thrive.

Your day at NTT DATA

The primary objective of this internship is to design, implement, and operate a data lake for storing and analyzing heterogeneous data, particularly data from monitoring and logging systems. In the first phase, the intern will design a data lake infrastructure that centralizes these data sources. In the second phase, the intern will use the collected data to generate reports for business teams.

What You'll Be Doing

Key Roles and Responsibilities:

Study and understand key concepts: grasp the fundamentals of monitoring and logging, as well as their role in collecting infrastructure and application data.

Data Lake Design

Define the architecture of the data lake from scratch, taking into account the specific characteristics of heterogeneous data.

Design a metadata management system to catalog and structure the stored information.

Select the appropriate technologies for the creation of the data lake (e.g., Hadoop, Spark, AWS S3).

Build the infrastructure for the data lake (e.g., Spark, Kafka).

Technical Implementation

Deploy the data lake infrastructure.

Design and implement the associated metadata model.

Data Ingestion Into The Lake

  • Collect and load raw primary data (monitoring, logs) into the lake.
  • Enrich the lake with documentation and descriptions of the data to facilitate future use.

Data Exploitation And Valorization

  • Implement reporting tools to generate standardized reports usable by business teams.
  • Automate data workflows and report generation where possible.

Required Skills

Final year of a Master's degree in Big Data, Data Science, or a related field

Familiarity with Linux

Familiarity with virtualization

Familiarity with building and implementing infrastructure (e.g., Spark, Kafka)

Familiarity with large-scale data processing (Big Data) and related technologies (Hadoop, Spark, etc.).

Knowledge of databases (SQL/NoSQL).

Experience or interest in data lakes and metadata technologies.

Ability to work with large volumes of heterogeneous data (logs, monitoring, etc.).

Preferred Skills

DevOps skills

Experience with containerization (e.g., Docker)

Regular use of a homelab

Familiarity with data visualization and reporting tools (Tableau, Power BI, etc.).

Workplace Type

Hybrid Working

About NTT DATA

NTT DATA is a $30+ billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. We invest over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure, and connectivity. We are also one of the leading providers of digital and AI infrastructure in the world. NTT DATA is part of NTT Group and headquartered in Tokyo.

Equal Opportunity Employer

NTT DATA is proud to be an Equal Opportunity Employer with a global culture that embraces diversity. We are committed to providing an environment free of unfair discrimination and harassment. We do not discriminate based on age, race, colour, gender, sexual orientation, religion, nationality, disability, pregnancy, marital status, veteran status, or any other protected category. Join our growing global team and accelerate your career with us. Apply today.

Posted: Oct 13, 2024
Type: Internship
Level: Entry
Location: Capellen
Company: NTT
Industries: IT Services, IT Consulting
Categories: Information Technology
