At IBM, we’re reimagining how data centers think, reason, and optimize themselves for the age of artificial intelligence. Our Network Intelligence (INI) product brings AI-driven reasoning and automation to enterprise and telecom operations—combining time-series foundation models, agentic frameworks, and domain-aware knowledge graphs. Building on this foundation, IBM’s AI Data Centre Networking initiative applies the same intelligence to network infrastructure, enabling high-performance, self-optimizing fabrics for GPU/TPU clusters and distributed AI workloads. The result is a new class of AI-native data centers—resilient, adaptive, and designed to power the world’s most demanding AI systems.
We’re looking for an AI Data Centre Networking Subject Matter Expert (SME) to help shape and guide this transformation. You will collaborate with IBM’s engineering, research, and product teams to identify, validate, and architect solutions for next-generation networking use cases — including AI inference traffic optimization, priority-based flow control for GPU/TPU clusters, multi-tenant isolation, and fabric scalability challenges. Your insights will directly influence how IBM builds and evolves intelligent infrastructure for AI at scale.
Your Role And Responsibilities
Position Summary
The ideal candidate brings hands-on experience operating AI/ML infrastructure and understands the networking challenges of high-performance computing workloads. While deep expertise across all domains is valuable, we welcome emerging practitioners with strong foundational knowledge and genuine curiosity about solving real-world problems. You will serve as the bridge between real-world data centre operations and our product roadmap, ensuring we build solutions that address genuine infrastructure challenges faced by organizations running distributed AI workloads at scale.
You will help define next-generation networking architectures that enable efficient, scalable, and resilient AI compute fabrics across distributed environments.
Key Responsibilities
- Product Advisory & Strategy: Act as the technical voice of the customer, translating operational pain points from AI data centre environments into actionable product requirements.
- Cross-Functional Collaboration: Work with engineering and product teams to validate use cases, refine features, and align solutions with real-world AI infrastructure needs.
- Technical Validation: Evaluate networking architectures for distributed AI workloads including training, inference, and GPU/TPU communication.
- Use Case Development: Define reference architectures for high-bandwidth interconnects, congestion management, workload isolation, and multi-tenant segmentation (a simplified buffer-sizing sketch follows this list).
- Technology Assessment: Track emerging AI networking technologies, standards, and trends — focusing on performance and energy efficiency.
- Requirements Translation: Convert operational insights into clear, implementable technical specifications.
- Customer & Partner Engagement: Support customer discussions, PoC validations, and feedback loops.
- Benchmarking & Market Insight: Analyze competitor and hyperscaler AI DC architectures to guide product differentiation.
- Knowledge Sharing: Contribute to technical documentation and internal knowledge bases.
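To ground the congestion-management work above, here is a simplified, illustrative sketch of one calculation that comes up when designing lossless RoCE fabrics: sizing the per-priority PFC (IEEE 802.1Qbb) headroom buffer a switch needs so that traffic already in flight does not overflow after a pause frame is sent. The propagation constant, response time, and formula are assumptions for illustration only, not a statement of how any IBM product sizes buffers.

```python
# Simplified, illustrative estimate of PFC headroom per lossless priority.
# Constants and formula are assumptions; switch vendors publish their own sizing guidance.

def pfc_headroom_bytes(link_gbps: float, cable_m: float, mtu_bytes: int = 9216,
                       response_time_us: float = 1.0) -> int:
    """Rough upper bound on bytes that can still arrive after a PAUSE is triggered."""
    prop_delay_s = cable_m * 5e-9          # ~5 ns/m propagation in fiber (assumed)
    response_s = response_time_us * 1e-6   # sender reaction time (assumed)
    line_rate_bps = link_gbps * 1e9 / 8    # link rate in bytes per second
    # Bytes in flight both directions on the cable plus the sender's reaction
    # window, plus up to one full frame in serialization at each end.
    in_flight = (2 * prop_delay_s + response_s) * line_rate_bps
    return int(in_flight + 2 * mtu_bytes)

if __name__ == "__main__":
    # Example: 400 GbE link over a 100 m run with 9216-byte jumbo frames.
    print(f"~{pfc_headroom_bytes(400, 100) / 1024:.0f} KiB headroom per lossless priority")
```

The same style of back-of-the-envelope analysis extends to ECN thresholds and multi-tenant buffer partitioning when validating candidate fabric designs.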
Education: Master's Degree
Required Technical And Professional Expertise
- 10+ years’ experience in data centre networking, HPC systems, or AI infrastructure operations.
- Understanding of AI/ML workload behavior including distributed training, inference, and data pipelines.
- Familiarity with RDMA (RoCE/InfiniBand), VXLAN, EVPN, and Data Center Bridging (DCB).
- Experience with GPU clusters or NVIDIA DGX-class AI infrastructure preferred.
- Ability to articulate technical concepts clearly to technical and non-technical audiences.
- Demonstrated capability to influence product design and translate operational experience into actionable insights.
- Experience with Kubernetes, Kubeflow, or MLOps environments for AI workloads (see the manifest sketch after this list).
- Knowledge of telemetry, observability, and automation tools in DC environments.
- Experience in product development or solutions architecture roles.
- Familiarity with SDN, composable infrastructure, or disaggregated data centre architectures.
- Exposure to AI fabric orchestration and high-performance interconnect management.
- Contributions to AI/networking technical communities or open-source initiatives.
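As a companion illustration for the Kubernetes- and RDMA-related items above, the sketch below builds a minimal Pod manifest (expressed as JSON, which the Kubernetes API also accepts) for a distributed-training worker that requests GPUs and an RDMA-capable secondary network. The network name "roce-net" and the resource name "rdma/roce_shared" are placeholders; actual names depend on the cluster's CNI and device-plugin configuration.

```python
# Illustrative sketch only: a minimal Pod manifest for a training worker that
# consumes GPUs and an RDMA-backed fabric. Names marked "assumed" are placeholders.
import json

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {
        "name": "trainer-worker-0",
        "annotations": {
            # Multus-style attachment of a secondary RoCE network (network name assumed).
            "k8s.v1.cni.cncf.io/networks": "roce-net",
        },
    },
    "spec": {
        "containers": [{
            "name": "trainer",
            "image": "example.com/ai/trainer:latest",  # placeholder image
            "resources": {
                "limits": {
                    "nvidia.com/gpu": "8",        # GPUs via the NVIDIA device plugin
                    "rdma/roce_shared": "1",      # RDMA device (plugin-specific name, assumed)
                },
            },
        }],
    },
}

print(json.dumps(pod, indent=2))
```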
Ready to apply?
Join IBM and take your career to the next level!
Application takes less than 5 minutes

