Intellect Design Arena Ltd

Site Reliability Engineer

Canada · Full-time · Mid-Senior

Position Details

Job Title: Site Reliability Engineer

Location: Vancouver, Canada

Job Type: Full Time


Job Overview

At Intellect Design Arena, our Digital Experience Platform is redefining how people shop and bank, delivering seamless, secure, and innovative experiences through a microservice-based SaaS platform on Azure AKS. Certified for SOC2 Level 2 and ISO27001, we’re all about pushing boundaries while keeping reliability strong. As our Site Reliability Engineer (SRE), you’ll safeguard our platform’s uptime, using Azure DevOps, Prometheus, Grafana, and Azure Monitor to keep performance rock-solid. Reporting to our Platform Management Lead, you’ll drive reliability in our GitOps-powered, high-compliance environment.


Responsibilities

  • Champion Reliability: Define and monitor SLIs/SLOs to ensure our Digital Experience Platform delivers flawless retail and banking experiences, keeping users happy and businesses thriving across Canada.
  • Master Incident Response: Lead incident response, perform root cause analysis (RCA), and implement fixes to keep our platform running 24/7, minimizing disruptions for retailers and bankers.
  • Build Observability: Set up and optimize monitoring with Prometheus, Grafana, and Azure Monitor, configuring alerts and dashboards to catch issues before they impact users.
  • Automate Reliability: Collaborate with CI/CD & Automation Engineers to integrate reliability checks into Azure DevOps pipelines, ensuring GitOps-driven deployments are stable and secure.
  • Fortify Security & Compliance: Work with our overseas Cloud Security team to embed SOC2/ISO27001 controls, ensuring monitoring and incident processes meet compliance standards for secure digital experiences.
  • Optimize Performance: Partner with Platform Engineering to tune AKS clusters and microservices, ensuring scalability and low latency for Canada-wide users.
  • Tackle Technical Debt: Identify and prioritize technical debt in monitoring, automation, and infrastructure, keeping our platform as clean as a freshly provisioned AKS node.
  • Collaborate with Visionaries: Team up with Network Engineering, Azure Cloud Engineering, and overseas Cloud Architecture to build a platform that redefines digital experiences.
  • Document the Reliability Magic: Maintain runbooks, document incident RCAs, and create service desk integrations to keep our team aligned and compliance audits breezy.


Qualifications

  • Diploma / degree in computer science
  • 2+ years of experience in a Developer or System Administrator role
  • Reliability Expertise: Proven experience defining SLIs/SLOs, managing incidents, and ensuring high availability in cloud-native environments like Azure AKS.
  • Monitoring Mastery: Deep knowledge of Prometheus, Grafana, and Azure Monitor for building observability and alerting systems.
  • Automation Skills: Familiarity with Azure DevOps and scripting (Python, Bash, PowerShell) to automate reliability tasks and integrate with GitOps workflows.
  • Security Savvy: Experience embedding SOC2/ISO27001 compliance into monitoring and incident processes, ensuring secure digital experiences for retail and banking.
  • Problem-Solving Superpowers: A knack for debugging complex issues, performing RCAs, and implementing fixes to keep systems humming.
  • Team Player Energy: Strong collaboration skills to work with cross-functional teams, including overseas Cloud Security and Architecture, while reporting to our Platform Management Lead.
  • Bonus Points: Experience with ArgoCD or Argo Workflows, microservices, or high-compliance SaaS platforms. A GitHub repo with automation scripts or a passion for disrupting retail and banking is a huge win.

Key Skills

Ranked by relevance

cloud, microservices, SaaS, incident response, high availability, cloud security, PowerShell, Python, Bash
Posted: Aug 21, 2025
Type: Full-time
Level: Mid-Senior
Location: Vancouver

Industries

Software Development, IT Services, IT Consulting, Financial Services

Categories

Administrative
