Hybrid: 3 days work from office; face-to-face interview required
Skills:
• Production experience in SRE / infrastructure / operations for large-scale systems
• Strong programming/scripting skills (Python, Go, Java, or equivalent)
• Deep experience with containerization (Docker), orchestration (Kubernetes, etc.)
• Infrastructure-as-code (Terraform, Helm, CloudFormation, Ansible, etc.)
• Familiarity with GPU / AI compute clusters, high-performance data storage, and distributed architectures
• Experience with monitoring / observability / logging / alerting tools (Prometheus, Grafana, ELK / EFK, Datadog, etc.)
• Networking & systems engineering knowledge (TCP/IP, DNS, routing, load balancing, distributed storage)
• Solid experience in capacity planning, performance tuning, scaling, and incident response
• Demonstrated ability to lead RCAs, deploy fixes, and drive reliability improvements
• Experience in regulated environments (financial services, compliance, audit, security) is a strong plus
• Excellent communication, documentation, and cross-team collaboration skills
• Proven track record of reducing operational toil via automation
Posted: Feb 24, 2026
Type: Contract
Level: Mid-Senior
Location: Montreal
Company: Tekgence Inc