Overview
Pluralis Research carries out foundational research on Protocol Learning: multi-participant training of foundation models where no single participant has, or can ever obtain, a full copy of the model. The purpose of Protocol Learning is to facilitate the creation of community-trained and community-owned frontier models with self-sustaining economics.
We're looking for Senior/Staff engineers with 5+ years of experience in distributed systems and large-scale ML training. You'll be implementing a novel substrate for distributed training of ML models that works over consumer-grade internet connections.
Responsibilities
Distributed Training Architecture & Optimization
- Design and implement large-scale distributed training systems optimized for heterogeneous hardware operating under low-bandwidth, high-latency conditions.
- Develop and optimize parallel training strategies (data, tensor, and pipeline parallelism) with custom sharding techniques that minimize communication overhead.
- Optimize GPU utilization, memory efficiency, and compute performance across distributed nodes.
- Implement robust checkpointing, state synchronization, and recovery mechanisms for long-running, fault-prone training jobs.
- Build monitoring and metrics systems to track training progress, model quality, and system bottlenecks.
- Architect resilient training systems where nodes can fail, networks can partition, and participants can dynamically join or leave.
- Design and optimize peer-to-peer topologies for decentralized coordination across non-co-located nodes.
- Implement NAT traversal, peer discovery, dynamic routing, and connection lifecycle management.
- Profile and optimize communication patterns to reduce latency and bandwidth overhead in multi-participant environments.
Requirements
- Strong experience building and operating distributed systems in production.
- Hands-on expertise with distributed training frameworks (FSDP, DeepSpeed, Megatron, or similar).
- Deep understanding of parallelism strategies for model training (data, tensor, and pipeline parallelism).
- Expert-level Python with production experience (concurrency, error handling, retry logic, clean architecture).
- Strong networking fundamentals: P2P systems, gRPC, routing, NAT traversal, distributed coordination.
- Experience optimizing GPU workloads, memory management, and large-scale compute efficiency.
Benefits
- Equity-heavy compensation with meaningful ownership in a mission-driven company
- Competitive base salary for senior engineering roles in Australia
- Visa sponsorship available for exceptional candidates
- Remote-first with optional access to our Melbourne hub
- World-class team: teammates were previously at Google, Amazon, Microsoft, and leading startups
Key Skills
Ranked by relevance
nat
deepspeed
python
grpc
Related Jobs
3 roles aligned with this opportunity
- Machine Learning Engineer - ML Training Platform (2026-02-23, Full-time, Entry, Australia, Technology, Engineering)
- Machine Learning Engineer - ML Training Platform (2026-02-23, Full-time, Entry, Australia, Technology, Engineering)
- Machine Learning Engineer (2026-02-10, Full-time, Entry, Australia, Technology, Engineering)
- Posted: Feb 23, 2026
- Type: Full-time
- Level: Entry
- Location: Melbourne
- Company: Pluralis Research
Industries
Technology
Information
Internet
Categories
Engineering
Information Technology