We are hiring a Lead DevOps Engineer to ensure the reliability and growth of our Ads Organization data infrastructure at scale. In this role you will upgrade platforms, troubleshoot production issues, and optimize CI/CD pipelines while partnering closely with stakeholders. Apply now to build dependable delivery and monitoring practices.
Responsibilities
- Drive and optimize data processing operations across Airflow/MWAA, Spark, and Flink
- Architect and maintain AWS cloud infrastructure using Kubernetes and Terraform
- Coordinate with stakeholders to elicit requirements and provide visibility into infrastructure changes
- Perform upgrades, routine maintenance, and root-cause troubleshooting on data platforms, using Datadog for monitoring and performance insights
- Strengthen CI/CD automation by enhancing Spinnaker and Jenkins pipelines for reliable releases
Requirements
- Extensive experience in DevOps engineering, including 5+ years in similar roles
- Hands-on leadership exposure with 1+ year guiding a team or owning delivery outcomes
- Strong expertise in Amazon Web Services (AWS) for cloud infrastructure deployment and operations
- Practical experience using Apache Airflow to orchestrate and schedule data pipelines
- Expert-level knowledge of Kubernetes for managing and scaling containerized applications
- Solid proficiency with Terraform for automating infrastructure and managing configuration
- English proficiency at B2 (Upper-Intermediate) or above to collaborate effectively and report clearly
Nice to have
- Background with Apache Flink for real-time data stream processing
- Working knowledge of Apache NiFi for data flow automation and control
- Familiarity with Databricks for advanced analytics and machine learning workloads
- Hands-on use of Datadog for monitoring infrastructure and resolving issues
We offer
- International projects with top brands
- Work with global teams of highly skilled, diverse peers
- Healthcare benefits
- Employee financial programs
- Paid time off and sick leave
- Upskilling, reskilling and certification courses
- Unlimited access to the LinkedIn Learning library and 22,000+ courses
- Global career opportunities
- Volunteer and community involvement opportunities
- EPAM Employee Groups
- Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn

