Software Engineer (Python, C++)

NVIDIA

United States · Full-time · Associate

About The Company

NVIDIA is a global leader in visual computing and artificial intelligence, renowned for pioneering innovative technologies that transform industries and elevate user experiences. With a strong focus on research and development, NVIDIA develops cutting-edge hardware and software solutions that power everything from gaming and entertainment to data centers and autonomous vehicles. The company's commitment to innovation, excellence, and diversity has positioned it as one of the most desirable employers in the technology sector. NVIDIA's culture fosters creativity, collaboration, and continuous learning, enabling its employees to push the boundaries of what is possible and make a meaningful impact on the world.

About The Role

We are seeking a highly experienced Principal Systems Engineer to join our team focused on NVIDIA Dynamo, a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models across multi-node distributed environments. In this strategic role, you will define the vision and roadmap for memory management and storage systems that support large language model (LLM) inference at scale. Your expertise will drive the design and implementation of a unified memory layer that integrates GPU memory, host memory, SSD tiers, and remote storage, enabling efficient deployment of cutting-edge LLM workloads. You will collaborate closely with cross-functional teams, including GPU architecture, networking, and platform engineering, to optimize low-latency data sharing and access across heterogeneous accelerators. Mentoring engineers and representing your team in technical forums will be integral to your responsibilities, ensuring the continuous evolution of our high-performance systems infrastructure.

Qualifications

Master's or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent professional experience
15+ years of experience in building large-scale distributed systems, high-performance storage, or machine learning infrastructure
Proficiency in C/C++ and Python programming languages
Deep understanding of memory hierarchies including GPU HBM, host DRAM, SSD, and remote/object storage
Experience designing multi-tier systems for performance and cost efficiency
Hands-on experience with distributed caching or key-value systems optimized for low latency and high concurrency
Strong knowledge of networked I/O technologies such as RDMA, NVMe-oF, NVLink, and related protocols
Expertise in profiling and optimizing systems across CPU, GPU, memory, and network components
Excellent communication skills and experience leading cross-disciplinary teams

Responsibilities

Design and develop a unified memory layer that spans GPU memory, host memory, SSD tiers, and remote storage to support large-scale LLM inference
Architect and implement deep integrations with leading LLM serving engines such as vLLM, SGLang, and TensorRT-LLM, focusing on KV-cache offload, reuse, and remote sharing
Co-design interfaces and protocols for disaggregated prefill, peer-to-peer KV-cache sharing, and multi-tier KV-cache storage solutions
Partner with GPU architecture, networking, and platform teams to leverage technologies like GPUDirect, RDMA, and NVLink for low-latency data sharing
Mentor senior and junior engineers, set technical direction, and ensure best practices in memory and storage subsystem development
Represent the team in internal reviews, open-source communities, conferences, and customer engagements

Benefits

Competitive salary range of $272,000 to $425,500 USD, commensurate with experience and location
Eligibility for equity options and comprehensive benefits package
Opportunities for professional growth through innovative projects and a collaborative environment
Access to cutting-edge technology and resources for research and development
Inclusive and diverse work environment that values creativity and initiative

Equal Opportunity

NVIDIA is committed to fostering a diverse and inclusive workplace. We are proud to be an equal opportunity employer and do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability, or any other characteristic protected by law. We believe that diverse perspectives and backgrounds drive innovation and excellence, and we welcome applicants from all backgrounds to join our team.

Key Skills

Ranked by relevance

storage, artificial intelligence, machine learning, Python, AI

Related Jobs

3 roles aligned with this opportunity

Sr. Software Engineer - Full Stack · 2026-07-10 · Full-time · Mid-Senior · Canada · Technology · Engineering
Full Stack Engineer, Total Media · 2026-07-10 · Full-time · Not Applicable · Canada · Technology · Engineering
Senior Software Engineer · 2026-07-10 · Full-time · Mid-Senior · Canada · Technology · Information Technology

Posted: Dec 25, 2025
Type: Full-time
Level: Associate
Location: United States
Company: NVIDIA

Industries

Technology, Information and Internet
