AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
WHY JOIN US
If you're looking for a place to grow, make an impact, and work with people who care, we'd love to meet you!
ABOUT THE ROLE
We are looking for a Cloud Operations Engineer to own production support across a full AWS-native technology stack supporting multiple platforms and hundreds of terabytes of data. You will monitor systems, triage and escalate incidents, execute operational playbooks, and build automation to reduce manual toil across ECS, RDS, Glue, Lambda, and observability tooling. The role operates as a standalone support function in a fintech environment with on-call responsibilities.
WHAT YOU WILL DO
- Monitor production systems and respond to alerts across the full stack;
- Perform first-level triage on incidents and support requests, escalating to developers with thorough context and diagnostics;
- Execute patching, operational tasks, and documented playbooks;
- Contribute to improving documentation, monitoring coverage, reporting, and automation of operational capabilities;
- Follow and contribute to improving incident management procedures, participate in post-incident reviews, and feed lessons back into runbooks;
- Contribute to root cause analysis and identify recurring issues;
- Support SLA adherence and highlight risks;
- Implement and enhance automation and monitoring based on existing frameworks, agentic workflows, and scripting to minimize manual toil;
- Collaborate with help desk and deskside support partners for production tasks affecting employees;
- Support security incident handling in coordination with internal processes.
MUST HAVES
- 3+ years of experience in production support, SRE, NOC, or operations engineering supporting system uptime and incident resolution;
- Hands-on AWS operations experience across compute, networking, and security services;
- Operational proficiency with PostgreSQL and Amazon RDS;
- Full-stack triage ability across infrastructure, data pipelines, and application layers;
- Experience working within incident response frameworks such as ITIL or NIST;
- Experience working in SLA-driven environments and meeting performance targets;
- Experience implementing automation and using AI/ML tools to streamline operations;
- Strong communication and coordination skills for cross-functional work with developers, security partners, and support providers;
- Upper-intermediate English level.
NICE TO HAVES
- Experience with AWS data services such as Glue, S3, Athena, and EventBridge, and ETL pipeline operations;
- Familiarity with Datadog, Metaplane, or comparable observability and data quality platforms;
- Infrastructure-as-code proficiency with SAM, CloudFormation, or Terraform;
- Background in financial services or environments with regulatory and compliance requirements;
- AWS certifications such as Solutions Architect or SysOps Administrator.
PERKS AND BENEFITS
- Professional growth: Mentorship, TechTalks, and personalized growth roadmaps.
- Competitive compensation: USD-based pay with education, fitness, and team activity budgets.
- Exciting projects: Modern solutions with Fortune 500 and top product companies.
- Flextime: Flexible schedule with remote and office options.
Meet Our Recruitment Process
Application → Coding Challenge → Video Interview → Technical Interview or Hiring Manager Interview
Each step helps us understand your skills and overall fit.
If it’s a match, you’ll receive an offer.
Key Skills
Ranked by relevance
aws
sla
incident response
cloudformation
datadog
etl
ecs
sam
s3
- Posted: May 12, 2026
- Type: Full-time
- Level: Mid-Senior
- Location: Ig
- Company: AgileEngine
Industries
IT Services
IT Consulting
Categories
Business Development