NVIDIA
Software Engineer (Python, C++)
NVIDIA · United States · 9 hours ago
Full-time · Remote Friendly · Information Technology
About The Company

NVIDIA is a global leader in visual computing and artificial intelligence, renowned for pioneering innovative technologies that transform industries and elevate user experiences. With a strong focus on research and development, NVIDIA develops cutting-edge hardware and software solutions that power everything from gaming and entertainment to data centers and autonomous vehicles. The company's commitment to innovation, excellence, and diversity has positioned it as one of the most desirable employers in the technology sector. NVIDIA's culture fosters creativity, collaboration, and continuous learning, enabling its employees to push the boundaries of what is possible and make a meaningful impact on the world.

About The Role

We are seeking a highly experienced Principal Systems Engineer to join our team focused on NVIDIA Dynamo, a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models across multi-node distributed environments. In this strategic role, you will define the vision and roadmap for memory management and storage systems that support large language model (LLM) inference at scale. Your expertise will drive the design and implementation of a unified memory layer that seamlessly integrates GPU memory, host memory, SSD tiers, and remote storage, enabling efficient deployment of cutting-edge LLM workloads. You will collaborate closely with cross-functional teams, including GPU architecture, networking, and platform engineering, to optimize low-latency data sharing and access across heterogeneous accelerators. Mentoring engineers and representing your team in technical forums will be integral to your responsibilities, ensuring the continuous evolution of our high-performance systems infrastructure.

Qualifications

  • Master's or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent professional experience
  • 15+ years of experience in building large-scale distributed systems, high-performance storage, or machine learning infrastructure
  • Proficiency in C/C++ and Python programming languages
  • Deep understanding of memory hierarchies including GPU HBM, host DRAM, SSD, and remote/object storage
  • Experience designing multi-tier systems for performance and cost efficiency
  • Hands-on experience with distributed caching or key-value systems optimized for low latency and high concurrency
  • Strong knowledge of networked I/O technologies such as RDMA, NVMe-oF, NVLink, and related protocols
  • Expertise in profiling and optimizing systems across CPU, GPU, memory, and network components
  • Excellent communication skills and experience leading cross-disciplinary teams

Responsibilities

  • Design and develop a unified memory layer that spans GPU memory, host memory, SSD tiers, and remote storage to support large-scale LLM inference
  • Architect and implement deep integrations with leading LLM serving engines such as vLLM, SGLang, and TensorRT-LLM, focusing on KV-cache offload, reuse, and remote sharing
  • Co-design interfaces and protocols for disaggregated prefill, peer-to-peer KV-cache sharing, and multi-tier KV-cache storage solutions
  • Partner with GPU architecture, networking, and platform teams to leverage technologies like GPUDirect, RDMA, and NVLink for low-latency data sharing
  • Mentor senior and junior engineers, set technical direction, and ensure best practices in memory and storage subsystem development
  • Represent the team in internal reviews, open-source communities, conferences, and customer engagements

Benefits

  • Competitive salary range of $272,000 to $425,500 USD, commensurate with experience and location
  • Eligibility for equity and a comprehensive benefits package
  • Opportunities for professional growth through innovative projects and a collaborative environment
  • Access to cutting-edge technology and resources for research and development
  • Inclusive and diverse work environment that values creativity and initiative

Equal Opportunity

NVIDIA is committed to fostering a diverse and inclusive workplace. We are proud to be an equal opportunity employer and do not discriminate based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability, or any other characteristic protected by law. We believe that diverse perspectives and backgrounds drive innovation and excellence, and we welcome applicants from all backgrounds to join our team.
