Director - Cloud, DevOps

Exela Technologies | India | 5 days ago

Full-time | Sales

Director of Cloud, DevOps, and SRE: Emphasis on Execution

We are looking for a Director of Cloud, DevOps, and Site Reliability Engineering (SRE) who will be a hands-on, execution-focused leader responsible for driving the technical strategy, implementation, and continuous operation of our cloud infrastructure and services. This role demands a pragmatic leader capable of translating strategic vision into tangible, high-quality, and scalable results.

Key Responsibilities and Execution Focus

The primary responsibility of the Director is to execute on the cloud, DevOps, and SRE strategy, ensuring immediate and long-term operational excellence.

1. Delivery and Implementation (Execution)

Lead the migration and deployment of core business applications and services to cloud platforms (e.g., AWS, Azure, GCP), ensuring projects are delivered on time, within budget, and meet defined non-functional requirements (security, scalability, performance).
Direct the implementation of Continuous Integration/Continuous Delivery (CI/CD) pipelines across all engineering teams, focusing on fully automated, reliable, and repeatable deployments.
Drive Infrastructure as Code (IaC) adoption (e.g., Terraform, Ansible), establishing a 100% code-driven infrastructure environment with clear governance and review processes.
Establish and enforce Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for all critical services, immediately implementing monitoring and alerting to measure against these targets.

2. Operational Excellence and Reliability (SRE Execution)

Direct the SRE function to minimize operational toil by developing and deploying automation tools and services for routine tasks, incident response, and capacity management.
Lead major incident response and post-mortem processes, ensuring effective root cause analysis and implementing immediate, execution-driven solutions to prevent recurrence.
Execute a robust cost management strategy for cloud resources, implementing FinOps practices to optimize spending without compromising reliability or performance.
Own the security posture of the cloud environment, working hands-on with security teams to implement and automate compliance and security controls (DevSecOps).

3. Team Leadership and Mentorship (Pragmatic Leadership)

Recruit, develop, and mentor a high-performing team of Cloud Engineers, DevOps Engineers, and SREs, setting clear, execution-focused goals and metrics.
Foster a culture of ownership, accountability, and execution within the team, emphasizing rapid iteration, collaboration, and bias for action.
Act as a hands-on leader by actively participating in design reviews, critical deployments, and troubleshooting efforts.

Qualifications and Requirements

Required Skills & Experience (Execution-Driven)

Minimum of 10 years of progressive experience in infrastructure, operations, or software engineering, with at least 3 years in a Director or Senior Management role overseeing Cloud, DevOps, or SRE teams.
Deep, demonstrable expertise in a major cloud provider (AWS, Azure, or GCP), including advanced networking, security services, and serverless architectures. Certification at the Professional/Specialty level is a plus.
Extensive experience implementing and scaling IaC and configuration management tools (e.g., Terraform, Ansible, SaltStack) in a production environment.
Proven track record of establishing and running SRE practices (SLOs, error budgets, toil reduction) with tangible results in improving service reliability and availability.
Proficiency in modern scripting/programming languages (e.g., Python, Go, Bash) for automation and tool development.

Education

Bachelor’s degree in Computer Science, Engineering, or a related field; equivalent practical experience is accepted.
