SR2 | Socially Responsible Recruitment | Certified B Corporation™

Backend engineer

Germany · Full-time · Mid-Senior

Backend Python engineer | ML | Infrastructure | Reliability

Fully remote, ideally Europe-based.

We’re hiring a Backend Software Engineer to own and operate mission-critical Django services that orchestrate large-scale ML inference workflows in production.


About the Role

This is a hands-on, end-to-end ownership role focused on building reliable, high-throughput backend systems. It is not a research role, a pure infra role, or a ticket-driven support position.


Responsibilities

  • Design, build, and run Django services in production
  • Own high-throughput async workflows using queues, workers, and schedulers
  • Implement safe orchestration patterns: retries, idempotency, rate limiting, backpressure
  • Define and operate SLOs, error budgets, alerts, and on-call
  • Lead incident response and write postmortems that drive real improvements
  • Build end-to-end observability (metrics, logs, traces, dashboards, runbooks)
  • Improve reliability of service integrations using timeouts, circuit breakers, and fallbacks
  • Work closely with ML engineers to productionise inference pipelines
  • Own CI/CD and deployment workflows for backend services
  • Use Infrastructure as Code (Terraform) to support reliability and scale
  • Optimise performance and cost across compute, storage, databases, and external APIs



Qualifications

  • Strong experience as a Python backend engineer owning production systems
  • Hands-on experience running Django in production (ORM, migrations, performance tuning)
  • Experience building and operating asynchronous job systems (Celery, RQ, Arq, or similar)
  • Experience with workflow/orchestration systems (Temporal, Prefect, Airflow, Step Functions, etc.)
  • Solid understanding of distributed systems reliability (timeouts, retries, idempotency, rate limiting, backpressure)
  • Experience defining and operating SLOs/SLAs and participating in on-call
  • Strong Linux, networking, and debugging fundamentals
  • Experience with AWS and/or GCP
  • Practical experience using Terraform as part of a wider system


Preferred Skills

  • Experience running ML inference or training systems at scale
  • Familiarity with MLOps tooling (SageMaker, Vertex AI, Kubeflow, MLflow, Argo)
  • Experience with observability stacks (OpenTelemetry, Prometheus, Grafana, ELK/Loki)
  • Experience operating Postgres and Redis in high-throughput environments
  • Startup or greenfield system ownership experience

Key Skills

Ranked by relevance

Django, Prometheus, Terraform, Kubeflow, Grafana, Python, MLflow, Redis, MLOps, AI, Infrastructure as Code, incident response, storage, Linux, CI/CD, AWS
Posted: Feb 03, 2026
Type: Full-time
Level: Mid-Senior
Location: Germany

Industries: IT Services, IT Consulting

Categories: Engineering
