Site Reliability Engineer, Events (Enterprise Messaging)
Your role
We are a group of professionals who enjoy identifying areas for improvement and engineering better solutions. As a crew, we do our best to create a supportive environment where each of us feels appreciated and has a chance to develop professionally.
We are looking for candidates to take the role of Site Reliability Engineer. The technologies involved are:
- Main technologies: IBM MQ for distributed environments and mainframe knowledge
- Additional technologies: TIBCO EMS or Confluent Kafka
In this role, you will help us to:
- provide third-level/SRE support for IBM MQ on distributed environments and mainframe MQ, specifically planning, configuring, building, migrating, and administering MQ managers
- build end-to-end MQ configurations, mainly for Payments applications, and carry out integration and decommissioning work with the Blue side (CS) and the Red side
- maintain a risk-aware and secure environment
- build up support for cloud technologies and Kafka
- work with payments application teams to deliver their middleware requirements on time
- work on automating processes
- spot problems, toil, areas for improvement, and performance bottlenecks
- improve reliability, quality, and time-to-market of software solutions
- measure and optimize system performance
The key responsibilities are:
- determine the reliability of our digital products, technology services, and the infrastructure that underpins them
- minimize the risk and impact of failures by engineering operational improvements, such as predictive monitoring, auto-scaling, or self-healing
- respond to production incidents to gain first-hand experience of operational hotspots and to identify the root causes of problems
- collect and analyze operational data, define and monitor key metrics to identify and communicate areas for improvement
- apply a broad range of engineering practices with a focus on reliability, from instrumentation, performance analysis, and log analytics to automated testing, deployment, and operations
- ensure the quality, security, reliability, and compliance of our solutions by applying our digital principles and implementing both functional and non-functional requirements
Join Thrive IT Systems and take your career to the next level!

