Imagine a future where everyone has instant, low-cost access to intelligence. We’re building a fully featured European AI cloud - with everything one needs to train, experiment with, and deploy AI models. In addition, our GPUs run on 100% renewable energy.
We’re ambitious, curious, and gutsy doers. We keep hierarchy low across the company and morale high in our teams. We’ve already achieved a lot, yet we’re only getting started. Now it’s your chance to join the ride. We offer more than just a job - we offer a career-defining opportunity to be part of building something big!
Join Verda while it’s still being built - not once it’s finished.
About The Role
We’re seeking a Senior or Staff Site Reliability Engineer (SRE) to strengthen and scale our HPC and cloud infrastructure in Europe. You’ll work closely with ML, data, and platform teams to ensure our systems remain reliable, observable, and highly performant. In this role, you’ll design and operate GPU-accelerated clusters, build automation and monitoring tooling, improve CI/CD and deployment workflows, and contribute to long-term infrastructure strategy.
Why Verda
- Generous cash + equity compensation along with various fringe benefits (e.g., healthcare, lunch, and wellbeing).
- Profitable operations, in addition to fast growth.
- A small, high-performing team of around 70 people representing 27 nationalities.
Role Details
- Work mode: Remote (EU)
- Employment type: Full-time, permanent
- Start date: As soon as possible
Responsibilities
- Ensure the reliability, scalability, and performance of HPC and cloud systems.
- Build and maintain automation, observability, and monitoring frameworks for compute clusters.
- Collaborate with ML, data, and infrastructure teams to deliver high-availability systems.
- Develop and enhance CI/CD pipelines, deployment workflows, and on-call processes.
- Participate in architecture design and long-term infrastructure strategy discussions.
- Participate in a 24/7 on-call rotation, with at least one full on-call week per month.
Requirements
- 7+ years in SRE, DevOps, or Infrastructure Engineering, preferably in HPC or large-scale distributed systems.
- Linux expertise (Ubuntu or Debian preferred).
- Strong experience with scripting and automation (Python, Go, Bash).
- Proven ability with cloud platforms (AWS, GCP, Azure, or modern HPC providers such as CoreWeave, Lambda, Nebius).
- Deep understanding of networking (DNS/TCP) and infrastructure-as-code tools (Terraform, Ansible).
- Experience managing Slurm-based HPC GPU clusters, diagnosing performance issues, and designing efficient HPC jobs.
Hiring Process
- Intro chat with our Talent Acquisition Partner - an initial online conversation to learn more about you and share details about the role.
- Technical assignment - a short task (around 15 minutes) to understand your approach and problem-solving style.
- Online technical interview with the Hiring Manager - a deeper discussion about your technical experience and ways of working.
- In-person interview with one of our team members - a chance to get to know the team and our culture.
- Final interview with our CTO & CEO - to align on vision and expectations.
Key Skills
Ranked by relevance
cloud
cicd
ai
terraform
python
devops
aws
gcp
- Posted: May 06, 2026
- Type: Full-time
- Level: Not Applicable
- Location: Helsinki
- Company: Verda
- Industries: Technology, Information, Internet
- Categories: Engineering, Information Technology