Thrive IT Systems
Site Reliability Engineer
Thrive IT Systems · Switzerland · 6 hours ago
Contract · Information Technology

Site Reliability Engineer, Events (Enterprise Messaging)


Your role

We are a group of professionals who enjoy identifying areas for improvement and engineering better solutions. As a crew, we do our best to create a supportive environment where each of us feels appreciated and has a chance to develop professionally.


We are looking for candidates to take the role of Site Reliability Engineer and help us to:


  • Main technologies: IBM MQ for distributed environments and mainframe knowledge
  • Additional technologies: TIBCO EMS or Confluent Kafka
  • provide 3rd-level/SRE support for IBM MQ in distributed environments and for mainframe MQ, specifically planning, configuring, building, migrating, and administering MQ queue managers
  • mainly build end-to-end MQ configuration for Payments applications, including integration and decommissioning work with the Blue side (CS) and the Red side
  • maintain a risk-aware and secure environment
  • build up support for cloud technologies and Kafka
  • work with payments application teams to deliver their middleware requirements on time
  • work on automating processes
  • spot problems, toil, areas for improvement, and performance bottlenecks
  • improve reliability, quality, and time-to-market of software solutions
  • measure and optimize system performance


The key responsibilities are:

  • determine the reliability of our digital products, technology services, and the infrastructure that underpins them
  • minimize the risk and impact of failures by engineering operational improvements, such as predictive monitoring, auto-scaling, or self-healing
  • respond to production incidents to gain first-hand experience of operational hotspots and to identify the root causes of problems
  • collect and analyze operational data, define and monitor key metrics to identify and communicate areas for improvement
  • apply a broad range of engineering practices with a focus on reliability, from instrumentation, performance analysis, and log analytics to automated testing, deployment, and operations
  • ensure the quality, security, reliability, and compliance of our solutions by applying our digital principles and implementing both functional and non-functional requirements

Key Skills
