About the Role
We are looking for a highly collaborative Site Reliability Engineer (SRE) to join our global technology team. This role is ideal for someone passionate about building resilient infrastructure, driving performance improvements, and enabling engineering teams to deliver at scale. You will design, build, and operate cloud platforms while ensuring security, reliability, and cost efficiency.
What You’ll Do
- Infrastructure as Code & CI/CD: Automate provisioning and deployments with Terraform and integrate pipelines (GitHub Actions, ArgoCD, etc.).
- Reliability Engineering: Define SLIs/SLOs, manage error budgets, and create dashboards and alerts to proactively monitor system health.
- Security & Compliance: Implement IAM policies, automate vulnerability scans, and maintain audit logging.
- Monitoring & Observability: Instrument services with metrics, logs, and distributed tracing for rapid troubleshooting and system insights.
- Cost Optimization: Implement tagging strategies, optimize resources, and provide data-driven recommendations for cloud spend efficiency.
- Documentation & Mentorship: Develop runbooks, standards, and best-practice guides while coaching development teams on DevOps, reliability, and security practices.
What We’re Looking For
- Strong proficiency with AWS Cloud and cloud-native best practices.
- Hands-on experience with Kubernetes (EKS, GKE) and large-scale container orchestration.
- Expertise in Terraform for provisioning and maintaining infrastructure.
- Knowledge of databases such as Redis and Postgres.
- Solid understanding of cloud networking components (VPC, VPN, Load Balancing) and web/network protocols (HTTP, REST, TLS, DNS).
- Proficiency with Git workflows and CI/CD system integrations.
Nice to Have
- Familiarity with tools such as ArgoCD, GitHub Actions, and Jenkins.
- Knowledge of Python, Golang, or Helm templating.
- Experience running scalable, resilient Node.js microservices.
- Awareness of security best practices for cloud infrastructure.
- Understanding of Terragrunt and Terraform project structures.
- Background in ensuring production readiness in fast-paced environments.
- Professional proficiency in English (written and spoken).
Join Avenue Code and take your career to the next level!