Data Scientist

Capgemini

United States · Full-time · Mid-Senior

Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you’d like, where you’ll be supported and inspired by a collaborative community of colleagues around the world, and where you’ll be able to reimagine what’s possible. Join us and help the world’s leading organizations unlock the value of technology and build a more sustainable, more inclusive world.

Job Description

As a Data Scientist, you will lead the development and implementation of advanced data engineering solutions that support the deployment and optimization of generative AI models. You will draw on your experience to design robust, scalable, and innovative data architectures that meet the specific requirements of Generative AI (GenAI) applications.

Key Responsibilities

The Machine Learning Engineer will be responsible for architectural design and planning, advanced data pipelines, model integration and optimization, scalability and performance, and research and innovation in support of production generative AI systems.
Deliver production-level ML workloads for customers on the Databricks platform, including end-to-end ML pipelines, training/inference optimization, integration with cloud-native services, and MLOps.
Build and maintain data engineering solutions on cloud platforms using hyperscaler services.
Develop production-grade cloud (AWS/Azure/GCP) infrastructure that supports the deployment of ML applications, including drift monitoring.
Design, develop, and maintain data pipelines to efficiently collect, process, and load data from various sources into data storage systems (e.g., data warehouses, data lakes).
Understand indexing and vectorization for use with generative AI prompt engineering.
Apply a strong understanding of fundamental data science concepts in NLP, including how to select and evaluate embedding models.
Use hyperscaler technologies to support data needs for expansion of Machine Learning/Data Science capabilities including generative AI.
Design, develop, and implement scalable data pipelines and ETL/ELT processes using Python, PySpark and API integrations.

Required Skills and Experience

Bachelor's degree in computer science, data engineering, or a related field with 3+ years of experience (Master's preferred).
Proven experience in data engineering, MLOps, ETL, database management, SQL, and data manipulation languages.
Experience with Azure, Python, Java, or Scala.
Experience with data warehousing platforms (e.g., Databricks, Amazon Redshift, Snowflake) and big data technologies (e.g., Hadoop, Spark).
Experience with highly scalable data stores, data lakes, data warehouses, lakehouses, and unstructured datasets.

Key Skills

Ranked by relevance

AI, cloud, Python, artificial intelligence, machine learning, data warehouse, big data, storage, Hadoop, Scala, MLOps, Java, ETL

Related Jobs

3 roles aligned with this opportunity

GenAI Developer

2026-07-08

Full-time

Not Applicable

India

IT Services

Engineering

DevOps Engineer

2026-07-08

Full-time

Mid-Senior

Canada

IT Services

Information Technology

AI Engineer (banking and finance)

2026-07-06

Full-time

Mid-Senior

Sweden

IT Services

Information Technology

Posted: Feb 24, 2026
Type: Full-time
Level: Mid-Senior
Location: Seattle
Company: Capgemini

Industries

IT Services, IT Consulting
