Site Reliability Engineer

Join a diverse team of approx. 2000 professionals across four continents, driving innovation and growth within intive’s technology hubs. Work alongside industry experts trusted by leading brands like Audi, BMW, Deichmann, Meta, NewsCorp, Tandem, Paramount, Vorwerk, and Warner Bros. Discovery to create pioneering, sustainable digital experiences.

At intive, agile thinking and deep industry expertise come together across Automotive & Mobility, Commerce, Financial Services, Healthcare & Life Sciences, and Technology, Media & Communication. Be part of a team that’s shaping the future of digital innovation.

We’re looking for a Site Reliability Engineer to drive the stability, scalability, and security of our digital sports streaming platform.

This Is a Hands-on Leadership Role Where You’ll

Ensure the reliability of our AWS-based infrastructure
Strengthen observability, automation, and security
Support high-performance systems that power live and on-demand video streaming

The ideal candidate combines expertise in site reliability, automation, and security with a strong background in digital video streaming. You’ll work across teams to resolve incidents, build resilient systems, and enable continuous innovation. Please note that this role follows the EST time zone.

What You Will Be Doing

Take ownership of platform reliability, performance, and security
Lead and mentor a small technical team while remaining a hands-on contributor
Build and maintain monitoring, logging, and alerting systems for visibility and rapid response
Define and enforce best practices in disaster recovery, redundancy, and failover strategies
Troubleshoot complex issues across infrastructure, APIs, video delivery, and playback
Lead incident response efforts and participate in on-call rotations during peak traffic (typically evenings EST)
Partner with Product and Engineering to guide architectural decisions around resilience, scalability, and security
Collaborate with Operations and Customer Care to resolve incidents and eliminate recurring issues
Oversee platform security practices, including IAM, secrets management, and AWS hardening
Research and adopt new tools and technologies to improve reliability
Track and optimize SLAs, SLOs, and KPIs for uptime, latency, playback quality, and security

You Are a Good Match If You Have

7+ years in SRE, DevOps, or infrastructure roles
Proven experience running and scaling production systems in AWS (CloudFront, Lambda, S3, API Gateway, CloudWatch, etc.)
AWS certification (Solutions Architect, DevOps Engineer) or equivalent hands-on expertise
Strong background in observability (Datadog, CloudWatch, Conviva, etc.)
Skilled in scripting/automation (Python, Bash) and infrastructure-as-code (Terraform, CloudFormation)
Experience leading security initiatives (IAM, token management, service hardening)
Solid understanding of video streaming technologies (HLS/DASH, CDNs, DRM, SSAI, multi-platform delivery)
Experience improving CI/CD pipelines and supporting safe production releases
Strong problem-solving skills across application, network, and video delivery layers
Excellent communication and collaboration skills, including vendor management

Nice To Have

Leadership capacity
