About the Company
For over 20 years, MultiBank Group has operated as one of the world’s leading regulated financial groups. With global licenses, a strong technology backbone, and billions in monthly trading volume, the Group is now undergoing a large-scale modernization of its trading infrastructure, platforms, and cloud environments.
We are expanding our engineering teams to support this next phase of growth — building fast, resilient, scalable systems used by traders worldwide.
Position Overview
The Senior SRE Engineer ensures the reliability, scalability, and performance of mission-critical infrastructure and services. This role drives improvements across automation, observability, incident response, and cloud optimization — shaping the standards that keep our global trading environment stable and high-performing.
Key Responsibilities
1. System Reliability & Performance
- Maintain and enhance the reliability, uptime, and performance of production systems.
- Monitor overall system health and proactively identify bottlenecks.
- Conduct RCAs and contribute to post-incident reviews to prevent recurrence.
2. Incident Response & Operations
- Participate in on-call rotations and respond to incidents swiftly.
- Support triage, mitigation, documentation, and recovery procedures.
- Develop runbooks and automation scripts to streamline operations.
3. Automation & Infrastructure Optimization
- Implement and maintain IaC using Terraform, Ansible, or CloudFormation.
- Enhance CI/CD pipelines for seamless, reliable deployments.
- Automate recurring operational tasks to increase efficiency.
- Optimize cloud resource utilization for performance and cost.
4. Monitoring & Observability
- Build and maintain monitoring, alerting, and observability systems (Prometheus, Grafana, ELK, Datadog, etc.).
- Ensure alerting is meaningful and actionable.
- Embed observability into service design and deployment with engineering teams.
5. Cross-Functional Collaboration
- Work closely with engineering, QA, and DevOps teams to integrate reliability into the development lifecycle.
- Support stakeholders with reliable deployments and delivery processes.
- Maintain documentation, playbooks, and process improvements.
6. Continuous Improvement & Innovation
- Identify gaps in systems, processes, and automation frameworks.
- Evaluate and implement emerging technologies to enhance reliability and scale.
- Participate in post-mortems and long-term reliability initiatives.
7. Security & Compliance
- Apply security best practices across configurations, monitoring, and access control.
- Work with security teams to meet compliance and hardening requirements.
- Assist in vulnerability management and deploy timely patches.
8. Reporting & Metrics
- Track and report system reliability trends and incident insights.
- Use metrics-driven analysis to improve reliability and operational excellence.
Education & Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, or equivalent experience.
- Preferred (not mandatory):
  - AWS Certified Cloud Engineer
  - AWS Machine Learning Ops Engineer

