Software Engineer - AI/HPC Specialist
Responsibilities:
- Work on collective communications stacks to optimise networking operations, leading to improved AI inference and training model performance
- Drive implementation of latency and bandwidth critical networking operations, as well as out-of-band signalling
- Debug custom and third-party multi-host, accelerator-enabled AI platforms
- Software development using C++/C and Python
- Work closely with other teams to deliver impact
- Develop and improve features and innovations
- Extend and optimize large scale learning collective operations
Qualifications:
- 3+ years of experience developing in C++/C and Python
- Experience with High Performance Computing/Networking or AI systems applications frameworks
- Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
- Specialized experience in one or more of the following machine learning/deep learning domains: Hardware accelerators, AI Infrastructure, or high performance networking
- Solid experience debugging distributed systems and working with revision control systems, testing, and CI pipelines
- Experience and understanding of AI/HPC systems
- Deep understanding of the transport stack (e.g. RDMA/RoCE, InfiniBand, TCP/IP), its constraints and performance measures, and how transport considerations enable the collective communications stack
- Experience in one or more of the following machine learning/deep learning domains: hardware accelerators, AI Infrastructure, and/or high performance computing (HPC), particularly pertaining to interconnect and collective communications stacks
- Familiarity with relevant tools, libraries, and frameworks (like PyTorch, NCCL, MPI, CUDA)
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today—beyond the constraints of screens, the limits of distance, and even the rules of physics.
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.

