Job Responsibilities:
Automate routine operational tasks using Shell scripting, ensuring efficiency in log analysis, batch management, and system optimization.
Maintain and optimize middleware components supporting infrastructure operations, ensuring stability and performance.
Administer and optimize Kubernetes clusters, ensuring scalability, security, and performance.
Maintain and optimize monitoring and alerting systems based on Prometheus, ensuring high availability of services.
Contribute to the development of CI/CD pipelines.
Manage cloud resources efficiently, implementing cost optimization strategies to reduce cloud expenditure.
Improve operational processes, develop automation tools, troubleshoot incidents, and enhance system stability and reliability.
Job Requirements:
Proficiency in Shell scripting for automating operational workflows and system management tasks.
Experience in Python or Go, preferably for system automation, tooling, or backend services.
At least 2 years of hands-on Kubernetes administration experience, including expertise in CSI, CNI, and managing clusters with 20+ nodes in production.
Experience with Prometheus for monitoring and alerting in an enterprise environment.
Familiarity with CI/CD deployment processes, with knowledge of GitOps principles. Hands-on experience with GitOps is a plus.
Experience managing cloud platforms using Infrastructure as Code (IaC) tools like Terraform/OpenTofu. Azure experience is a plus.
Strong problem-solving skills, a proactive approach to troubleshooting, and a commitment to improving operational efficiency and system reliability.
Bonus Points: Experience managing large-scale distributed systems and microservices architectures. Background in Site Reliability Engineering (SRE) best practices.
Join Astra Tech and take your career to the next level!

