This position is posted by Jobgether on behalf of a partner company. We are currently looking for an SRE/DevOps Engineer in Canada.
This role sits at the frontline of enterprise platform reliability, ensuring the stability, availability, and performance of large-scale cloud and hybrid systems. You will act as the first line of response for incidents across modern infrastructure environments, including Kubernetes, APIs, databases, and cloud-native services. Working in a highly operational and collaborative setting, you will monitor systems, execute runbooks, and support rapid incident resolution to minimize downtime. The position combines hands-on technical troubleshooting with structured operational processes, where precision and communication are critical. You will contribute directly to service reliability by identifying issues, escalating intelligently, improving documentation, and spotting opportunities for automation. This is a high-impact role for professionals who thrive in fast-paced, incident-driven environments and enjoy keeping complex systems running smoothly.
Accountabilities
- Monitor system health across cloud and on-prem environments using observability tools such as dashboards, logs, and alerting systems.
- Perform first-line incident triage, identify system anomalies, and execute standardized runbooks for resolution or escalation.
- Troubleshoot application and infrastructure issues across Kubernetes, APIs, databases, and cloud services to isolate root causes.
- Communicate incident status clearly and effectively to stakeholders, ensuring timely updates and accurate reporting.
- Support deployment operations and routine tasks by following predefined operational procedures and workflows.
- Document incidents, identify gaps in runbooks, and contribute to continuous improvement of operational knowledge bases.
- Assist in onboarding new applications into operational monitoring and support frameworks.
- Collaborate with engineering and L2/L3 teams to ensure smooth escalation and resolution of complex issues.
Requirements
- 2-5 years of experience in IT operations, NOC, SRE, or DevOps-related roles.
- Strong understanding of Linux, Kubernetes basics, and networking fundamentals.
- Experience working with observability tools such as Prometheus, Grafana, Splunk, ELK, or similar platforms.
- Ability to follow structured operational workflows, including runbooks and incident management procedures.
- Basic scripting knowledge in Python, Bash, or PowerShell for minor automation or script adjustments.
- Familiarity with cloud platforms such as AWS, Azure, or GCP is a strong plus.
- Understanding of troubleshooting techniques (DNS, logs, connectivity checks, networking tools).
- Strong analytical and problem-solving mindset with a focus on incident resolution and root cause identification.
- Effective communication skills for incident reporting and stakeholder updates.
- Nice to have: exposure to ServiceNow, Jira, xMatters, SQL/NoSQL basics, or AI-assisted operational tools.
Benefits
- Competitive compensation aligned with experience and technical expertise.
- Flexible working arrangements depending on role and location.
- Comprehensive health and wellness support programs.
- Opportunities for continuous learning, upskilling, and career development.
- Exposure to large-scale cloud-native and enterprise systems.
- Inclusive and diverse work environment focused on collaboration and innovation.
- Strong emphasis on work-life balance and employee well-being.
- Access to modern tools, platforms, and automation-driven operations practices.
Why Apply Through Jobgether?
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Key Skills
Ranked by relevance
cloud
kubernetes
ai
artificial intelligence
prometheus
powershell
ai tools
grafana
python
devops
splunk
linux
bash
gdpr
jira
aws
gcp
elk
dns
- Posted: May 12, 2026
- Type: Full-time
- Level: Associate
- Location: Canada
- Company: Jobgether
- Industries: Internet Marketplace Platforms
- Categories: Engineering, Information Technology