Senior DevOps / SRE Engineer

Tata Technologies

Sweden · Full-time · Mid-Senior

At Tata Technologies we make product development dreams a reality by designing, engineering and validating the products of tomorrow for the world’s leading manufacturers. Due to our continued growth, we are now recruiting a Senior DevOps / SRE Engineer to strengthen our team in Gothenburg.

Scope of role

We are seeking a Senior DevOps / SRE Technical Engineer to serve as a key technical owner for cloud infrastructure, observability, reliability engineering, and cloud cost optimization across AWS and GCP.

This role carries clear accountability and measurable outcomes in the following areas:

1. End-to-end observability (design → implementation → continuous improvement)

2. Systematic cloud cost optimization across AWS & GCP (FinOps)

3. Production reliability governance and risk reduction

4. Root cause analysis (RCA) and systemic improvement of major incidents

You will be expected not only to design but also to deliver, operate, and be assessed against concrete results.

Responsibilities

1) End-to-End Observability

What you will own:

Independently design and implement a comprehensive end-to-end observability system covering:

• Infrastructure (AWS/GCP, Kubernetes, network, storage)

• Platform (message queues, databases, caches, API gateways)

• Application layer (microservices, critical business flows)

• Business layer (key business metrics)

You will be expected to produce:

1. Unified Observability Architecture Document

• Overall architecture diagram (Metrics + Logs + Traces)

• Data flow diagram (collection → processing → storage → visualization)

• Tooling selection and justification (e.g., Prometheus, Datadog, OpenTelemetry)

2. Standardized Observability Data Model

• Unified metrics naming conventions

• Standardized tracing model (Trace ID, Span, sampling strategy)

• Structured logging standard (JSON schema)

3. Operational Dashboards

• Infrastructure health dashboard

• Platform services health dashboard

• Business API health and KPI dashboard

4. Alerting System

• Defined P0/P1/P2 alert levels

• Alert noise reduction strategy

• Automated alert routing by team/service

5. SLI / SLO / SLA Framework

• At least 5 critical business SLOs defined and tracked

• Clear error budget policy
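To illustrate what an error budget policy could look like in practice, here is a minimal sketch. It is not part of the role definition: the service name `checkout-api`, the 99.9% target, the 30-day window, and the 25% "restricted" threshold are all hypothetical values chosen for the example. With those assumptions, the budget works out to roughly 43.2 minutes of allowed unavailability per window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """An availability SLO; all values used below are illustrative assumptions."""
    name: str
    target: float      # e.g. 0.999 means 99.9% of minutes must be "good"
    window_days: int   # rolling evaluation window


def error_budget_minutes(slo: Slo) -> float:
    """Total minutes of unavailability the SLO tolerates over its window."""
    return (1.0 - slo.target) * slo.window_days * 24 * 60


def budget_remaining(slo: Slo, bad_minutes: float) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = error_budget_minutes(slo)
    return (budget - bad_minutes) / budget


def release_policy(remaining: float) -> str:
    """Map remaining budget to a release stance (thresholds are hypothetical)."""
    if remaining <= 0:
        return "freeze"       # budget exhausted: reliability work only
    if remaining < 0.25:
        return "restricted"   # low budget: only low-risk changes
    return "normal"


if __name__ == "__main__":
    slo = Slo(name="checkout-api", target=0.999, window_days=30)
    for bad in (10, 40, 50):
        remaining = budget_remaining(slo, bad)
        print(slo.name, bad, round(remaining, 3), release_policy(remaining))
```

The point of the sketch is that the policy becomes a mechanical decision: once the measured "bad" minutes are known, the release stance follows from agreed thresholds rather than from case-by-case negotiation.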

2) Cloud Cost Optimization – FinOps (Core Requirement)

What you will own:

Lead systematic cost optimization across AWS and GCP without compromising performance, reliability, or user experience.

You will implement:

1. Unified Cost Visibility System

• Combined AWS + GCP cost dashboards

• Cost breakdown by: Team/Product/Service/Environment (Dev/Test/Stage/Prod)

2. Actionable Cost Optimization Plan

• Compute (EKS/GKE, EC2/Compute Engine, Serverless)

• Storage (S3/GCS tiering, lifecycle policies)

• Databases (RDS/Cloud SQL sizing, connection pooling, caching)

• Network costs (egress, cross-region traffic)

3. Cost Shift-Left Mechanisms

• Cost checks integrated into CI/CD

• Mandatory resource ownership and budget limits

• Quarterly cost reviews

3) Production Reliability & Incident Governance

What you will own: Move from reactive “firefighting” to systematic reliability engineering.

Required Deliverables:

1. Incident Management Framework

• Standard P0/P1 incident response process

• RCA template and follow-up tracking mechanism

2. Reliability Governance Framework

• Error budget policy

• Standardized canary/gradual rollout process

• Automated rollback mechanisms

3. Risk Register

• Identified systemic risks and technical debt

• Prioritized remediation roadmap

4) Kubernetes & Multi-Cloud Platform Optimization

What you will deliver:

• Optimize EKS/GKE cluster architecture

• Improve stability (reduce OOMs, node instability, network issues)

• Improve resource utilization

Knowledge/Experience

Experience

• 5+ years of DevOps / SRE / Cloud Platform experience

• At least 3 years in a Staff/Principal or Tech Lead role

• Experience operating large-scale distributed systems in production

Cloud Expertise

• Deep expertise in both AWS and GCP

• Ability to design cross-cloud architectures

• Strong experience with Terraform / Pulumi / CDK

Observability Expertise

• Proven experience designing and implementing observability from scratch

• Deep hands-on experience with Prometheus/Grafana/Loki/Elastic/Kibana

Kubernetes

• Deep understanding of Kubernetes internals (Scheduler, Controllers, etcd, CNI, CRI)

• Experience managing large-scale production clusters

Programming

• Proficiency in Java, Python, or Go

Strong Plus

• Google SRE background or deep SRE practice

• Experience with Chaos Engineering

• Proven FinOps success cases

• Knowledge of eBPF and performance profiling

• Open-source contributions

• Experience designing multi-cloud disaster recovery (Active-Active or Active-Passive)

If you are passionate about bringing innovation to the projects you work on, then we would love to hear from you.

Tata Technologies: Engineering a better world.

Tata Technologies would like to thank all applicants for their interest; each application will be reviewed against the set criteria for the role. Please note that only candidates under consideration will be contacted. If you do not hear from us within 10 working days following the closing date, unfortunately your application has not been successful. We will, however, retain your details for any suitable future opportunities.

Key Skills

cloud aws gcp kubernetes storage devops incident response message queues microservices serverless prometheus terraform datadog java cicd sql sla

Posted: Feb 17, 2026
Type: Full-time
Level: Mid-Senior
Location: Gothenburg
Company: Tata Technologies

Industries

Motor Vehicle Manufacturing, Industrial Machinery Manufacturing
