We are an innovation team on a mission to transform how enterprises harness AI. Operating with the agility of a startup and the focus of an incubator, we’re building a tight-knit group of AI and infrastructure experts driven by bold ideas and a shared goal: to rethink systems from the ground up and deliver breakthrough solutions that redefine what's possible — faster, leaner, and smarter.
We thrive in a fast-paced, experimentation-rich environment where new technologies aren’t just welcome — they’re expected. Here, you'll work side-by-side with seasoned engineers, architects, and thinkers to craft the kind of iconic products that can reshape industries and unlock entirely new models of operation for the enterprise.
If you're energized by the challenge of solving hard problems, love working at the edge of what's possible, and want to help shape the future of AI infrastructure — we'd love to meet you.
IMPACT
Cisco is seeking a forward-thinking AI Infrastructure Engineer to help design and implement next-generation AI products. This role will focus on delivering high-performance, efficient, and reliable solutions that power AI workloads across Cisco's ecosystem.
As an AI Infrastructure Engineer at Cisco, you will play a pivotal role in shaping the AI systems that enable cutting-edge innovations. Your work will directly impact:
- The performance and efficiency of AI workloads on the node.
- The reliability and availability of AI systems for Cisco’s customers.
- Advancements in AI and machine learning infrastructure, enabling better utilization and improving efficiency for applications across industries.
- Collaboration across internal teams to bring system-level innovation to different Cisco products.
Key Responsibilities
- Design and develop node-level infrastructure components to support high-performance AI workloads.
- Benchmark, analyze, and optimize the performance of AI infrastructure, including CUDA kernels and memory management for GPUs.
- Minimize downtime through a seamless configuration and upgrade architecture for software components.
- Manage the installation and deployment of AI infrastructure on Kubernetes clusters, including the use of CRDs and operators.
- Develop and deploy efficient telemetry collection systems for nodes and hardware components without impacting workload performance.
- Work with distributed system fundamentals to ensure scalability, resilience, and reliability.
- Collaborate across teams and time zones to shape the overall direction of AI infrastructure development and achieve shared goals.
Qualifications
- Proficiency in programming languages and technologies such as C/C++, Golang, Python, or eBPF.
- Strong understanding of Linux operating systems, including user space and kernel-level components.
- Experience with Linux user space development, including packaging, logging, telemetry, and process lifecycle management.
- Strong understanding of Kubernetes (K8s) and related technologies, such as custom resource definitions (CRDs).
- Strong debugging and problem-solving skills for complex system-level issues.
- Bachelor’s degree or higher and 8-12 years of relevant engineering work experience.
Preferred Qualifications
- Linux kernel and device driver hands-on expertise is a plus.
- Experience in GPU programming and optimization, including CUDA and UCX, is a plus.
- Experience with high-speed data transfer technologies such as RDMA is a plus.
- Experience with the NVIDIA GPU Operator, NVIDIA Container Toolkit, Nsight, and CUPTI is a plus.
- Familiarity with NVIDIA MIG and MPS concepts for managing GPU consumption is a plus.
#WeAreCisco, where every individual brings their unique skills and perspectives together to pursue our purpose of powering an inclusive future for all.
Our passion is connection—we celebrate our employees’ diverse set of backgrounds and focus on unlocking potential. Cisconians often experience one company, many careers where learning and development are encouraged and supported at every stage. Our technology, tools, and culture pioneered hybrid work trends, allowing all to not only give their best, but be their best.
We understand our outstanding opportunity to bring communities together and at the heart of that is our people. One-third of Cisconians collaborate in our 30 employee resource organizations, called Inclusive Communities, to connect, foster belonging, learn to be informed allies, and make a difference. Dedicated paid time off to volunteer—80 hours each year—allows us to give back to causes we are passionate about, and nearly 86% do!
Our purpose, driven by our people, is what makes us the worldwide leader in technology that powers the internet. Helping our customers reimagine their applications, secure their enterprise, transform their infrastructure, and meet their sustainability goals is what we do best. We ensure that every step we take is a step towards a more inclusive future for all. Take your next step and be you, with us!