DevOps / Platform Engineer
Platform Engineer Level II — SWENG
Europe · Full-Time · GCP Primary / AWS Secondary
The Mission
We are not looking for someone to “run scripts.” We are looking for a Platform Architect who understands that in our environment, a single configuration change propagates across 98 services and 55 products. Every decision has blast radius. Every action must be preceded by assessment.
You will join a team of three Platform Engineers responsible for a massive, shared global estate across GCP and AWS. This role is about building the “paved road” for our software engineers — designing scalable, secure, and automated environments where safety is built-in, not bolted on.
The Operating Reality
SWENG operates a shared platform delivered by 5 engineers. Every engineer operates with full autonomy and full accountability from day one. There is no onboarding ramp that absorbs mistakes at this scale.
Scale
98 services · 22 environments · 55 products · 80+ edge locations across GCP and AWS
Team
5 engineers total (3 Platform Engineers). No supervisory capacity. No error correction buffer.
Autonomy
You assess context, analyze failure modes, and communicate structured decisions before touching the keyboard.
Philosophy
Process discipline is what allows us to move fast. You are a process-oriented engineer who treats infrastructure as a product.
Core Responsibilities
Architectural Ownership
Design and implement highly available, secure infrastructure on GCP (primary) and AWS. You are not just building it — you are ensuring it is cost-effective, scalable, and relevant to 55 products simultaneously.
Infrastructure as Code (IaC)
Treat the entire estate as software using Terraform. Manage complex state files and ensure modularity across all 22 environments. Every infrastructure change is code-reviewed, not clicked.
Guardrail Engineering
Build and maintain CI/CD pipelines (GitHub Actions / Jenkins) that do not just deploy code — they enforce security and governance automatically. The pipeline is the last line of defence before 98 services are affected.
Systems Thinking & Advisory
Act as a consultant to the Software Engineering team. Challenge decisions that are not scalable. Communicate tradeoffs using a structured Impact → Options → Recommendation framework. A well-reasoned advisory is as valuable as the implementation.
Observability
Build the Prometheus / Grafana / Stackdriver telemetry that predicts outages — not just reacts to them. Instrument proactively; alert meaningfully.
MLOps Scaling
Support the scaling of machine learning products (Kubeflow Pipelines) to meet global demand across all environments.
Who You Are — Requirements
Experience
5+ years in DevOps / SRE with a proven track record in Platform Engineering — managing shared infrastructure for multiple teams simultaneously.
GCP Mastery
Deep, production-level experience with Google Cloud Platform and Kubernetes (GKE).
You have operated GCP at scale — not just provisioned resources.
The “Architect” Mindset
This is the most critical requirement. You must demonstrate:
Structured communication: Problem → Impact → Options → Recommendation, without supervision.
Blast radius awareness: You do not say “it might break.” You explain how it breaks, what is affected, and what the recovery path is.
Context-first approach: Before any action, you assess what exists, what is affected, who needs to know, and what the downstream consequences are across the shared estate.
Failure mode thinking: You anticipate failure scenarios and design for graceful degradation, not just happy-path operation.
Governance-First
You understand that in a global environment with 55 products, following established processes is not overhead; it is a survival requirement.
You operate within change management frameworks and onboard others into them effectively.
You distinguish between urgency and risk — a CVSS 9.8 vulnerability requires contextual assessment (exposure, exploitability, blast radius), not a reflexive “drop everything.”
Automation Obsessed
Expert-level Python scripting and a delete-manual-tasks mentality.
You automate detection, not just remediation. Manual checking is a process gap, not a strategy.
Accountability Orientation
You take ownership of outcomes, not just task completion.
You surface risks proactively to your team lead, with structured status: what is done, what the risk is, what you need.
You do not patch silently. You communicate clearly before, during, and after changes that affect shared infrastructure.
Location
Based in Europe for time zone alignment with the team.
Nice to Have
Hands-on experience with MLOps and Data Science tooling (Kubeflow, Vertex AI).
Deep knowledge of AWS (EC2, S3, RDS, Lambda) to manage our secondary environment.
Advanced Log Management (ELK / Splunk).
Compensation: 60,000 - 70,000 EUR (B2B Contract)
Languages: Fluent English
PLEASE DON'T APPLY IF YOU USE AI DURING THE JOB INTERVIEW OR YOU ARE NOT A REAL PERSON.
Ready to apply?
Join INSUS - AI Solutions for Sustainable Transformation and take your career to the next level!

