In this role, you will support the reliability and automation of large-scale, multi-tenant environments by combining software engineering practices with hands-on infrastructure expertise. You will contribute to operating and automating the full lifecycle of environments, from provisioning to ongoing maintenance. The position offers an opportunity to work on high-impact systems, solve complex operational problems, and contribute to scalable infrastructure architectures. You'll collaborate with cross-functional teams to strengthen reliability, enhance observability, and streamline delivery processes. This is an ideal opportunity for someone who thrives at the intersection of automation, cloud operations, and continuous improvement.
Accountabilities
- Support the automation of environment provisioning, configuration, and lifecycle management using Terraform, Ansible, and Kubernetes
- Assist in troubleshooting production issues, including failed deployments, pod crashes, and scheduling conflicts using tools such as kubectl
- Contribute to infrastructure-as-code modules and improve CI/CD workflows for safe and repeatable deployments
- Participate in monitoring and maintenance activities using Prometheus, ELK, and Grafana, helping improve visibility and reliability across environments
- Take part in incident response processes, triaging alerts and supporting recovery efforts with guidance from senior engineers
- Collaborate with engineering teams to improve platform reliability, scalability, and operational efficiency
Requirements
- Experience with infrastructure-as-code, particularly Terraform and Ansible, with an understanding of modules, state, and variables
- Familiarity with Kubernetes concepts such as pods, deployments, rollouts, and experience using kubectl, Helm, or Kustomize
- Basic programming abilities in languages like Go or Ruby, including modifying existing automation tools
- Exposure to multi-environment or multi-tenant operations and understanding of consistency challenges across setups
- Knowledge of observability practices and ability to identify issues using logs, dashboards, or metrics
- Strong collaboration skills and eagerness to share knowledge and learn from teammates
- Previous participation in on-call rotations and comfort responding to alerts and incidents in production systems
Benefits
- Flexible Paid Time Off
- Team Member Resource Groups
- Equity compensation and Employee Stock Purchase Plan
- Growth and development budget
- Parental leave
- Home office support
- Comprehensive health, financial, and well-being benefits
When you apply, your profile goes through our AI-powered screening process designed to identify top talent efficiently and fairly.
🔍 Our AI evaluates your CV and LinkedIn profile thoroughly, analyzing your skills, experience and achievements.
📊 It compares your profile to the job's core requirements and past success factors to determine your match score.
🎯 Based on this analysis, we automatically shortlist the 3 candidates with the highest match to the role.
🧠 When necessary, our human team may perform an additional manual review to ensure no strong profile is missed.
The process is transparent, skills-based, and free of bias — focusing solely on your fit for the role.
Once the shortlist is completed, we share it directly with the company that owns the job opening. The final decision and next steps, such as interviews or assessments, are then managed by their internal hiring team.
Thank you for your interest!

