NVIDIA
Software Engineer (C++, Python)
NVIDIA · United States · 9 hours ago
Full-time · Remote Friendly · Information Technology

About The Company

NVIDIA is a global leader in the technology industry, renowned for its innovative graphics processing units (GPUs) and advanced computing solutions. As a pioneer in AI, deep learning, and high-performance computing, NVIDIA continuously pushes the boundaries of technology to enable the future of digital experiences. The company's commitment to research and development has established it as a key player in various sectors including gaming, professional visualization, data centers, and autonomous vehicles. NVIDIA's culture emphasizes innovation, collaboration, and diversity, fostering an environment where talented professionals can thrive and contribute to groundbreaking projects that shape the future of technology.

About The Role

NVIDIA Dynamo is a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models across multi-node distributed environments. Built in Rust for performance and extensible through Python, Dynamo orchestrates GPU shards, manages shared KV cache, and routes requests efficiently across heterogeneous clusters. As large language models (LLMs) continue to grow beyond the memory and compute capabilities of individual GPUs, this platform facilitates the scalable, resilient deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to lead the vision and development of memory management strategies for large-scale LLM and storage systems, ensuring optimal performance, scalability, and integration across diverse hardware and software components.

Qualifications

  • Master's or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience
  • 15+ years of experience in building large-scale distributed systems, high-performance storage, or ML systems infrastructure
  • Proficiency in C/C++ and Python with a proven track record of delivering production-grade services
  • Deep understanding of memory hierarchies including GPU HBM, host DRAM, SSD, and remote/object storage
  • Experience designing systems that span multiple tiers for performance and cost efficiency
  • Hands-on experience with distributed caching or key-value systems optimized for low latency and high concurrency
  • Strong skills in networked I/O, RDMA, NVMe-oF, NVLink, and related technologies
  • Expertise in profiling and system optimization across CPU, GPU, memory, and network layers
  • Excellent communication skills and experience leading cross-functional teams and initiatives

Responsibilities

  • Design and evolve a unified memory layer that integrates GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote storage to support large-scale LLM inference
  • Architect and implement deep integrations with leading LLM serving engines such as vLLM, SGLang, and TensorRT-LLM, focusing on KV-cache offload, reuse, and remote sharing
  • Co-design interfaces and protocols enabling disaggregated prefill, peer-to-peer KV-cache sharing, and multi-tier KV-cache storage for high-throughput, low-latency inference
  • Partner with GPU architecture, networking, and platform teams to leverage technologies such as GPUDirect, RDMA, and NVLink for low-latency cache access and sharing
  • Mentor senior and junior engineers, set technical direction for memory and storage subsystems, and represent the team in internal and external forums
  • Conduct performance profiling, system tuning, and validation to ensure optimal throughput and latency in distributed environments

Benefits

  • Competitive salary package aligned with experience and location
  • Equity options and comprehensive health benefits
  • Opportunities for professional growth and development in a pioneering technology environment
  • Access to cutting-edge tools and resources for research and innovation
  • Inclusive and diverse workplace culture that values creativity and collaboration

Equal Opportunity

NVIDIA is committed to fostering a diverse and inclusive work environment. We are proud to be an equal opportunity employer and do not discriminate based on race, religion, color, national origin, gender, gender identity or expression, sexual orientation, age, marital status, veteran status, disability, or any other characteristic protected by law.
