Amaris Consulting

DevOps Engineer

Canada · Full-time · Mid-Senior

We are looking for a motivated MLOps Engineer to join our team, working remotely from Canada (Western time zones only: Pacific or Mountain). As an MLOps Engineer, you will bridge the gap between data science and operations, ensuring seamless integration, deployment, and management of machine learning models in production environments. Your mission will be to automate, scale, and monitor the entire ML lifecycle, leveraging your expertise in cloud infrastructure, DevOps practices, and scripting to deliver efficient, reliable, and secure data-driven solutions that support business innovation.


Key Responsibilities

- Architect, provision, and automate infrastructure on both hyperscaler CSPs and NCPs for AI/ML workloads.

- Build, optimize, and maintain end-to-end machine learning pipelines (CI/CD/CT) for continuous integration, delivery, and training in high-throughput, GPU-driven environments.

- Advance Infrastructure as Code (IaC) methods with tools such as Terraform, Ansible, and proprietary SDKs/APIs.

- Manage the deployment and orchestration of large-scale clusters, GPU scheduling, VM automation, and data, storage, and networking across multi-cloud environments.

- Containerize, serve, and monitor ML models using Slurm, Docker, and Kubernetes (including Helm and advanced GPU scheduling).

- Implement comprehensive monitoring, model/data drift detection, and operational analytics tailored to high-performance compute platforms (OTEL, DCGM).

- Ensure robust security, compliance, identity management, and audit readiness in mixed cloud environments (SOC 2).

- Collaborate across engineering, AI research, and operations, producing clear technical documentation and operational runbooks.


Main Requirements

- 6+ years of infrastructure, cloud, or MLOps experience, with at least 1 year in NCP platforms (e.g., CoreWeave, Nebius, Lambda Labs, Yotta).

- Expertise in CSPs (AWS, Azure, GCP) and NCPs (specialized GPU/AI clouds).

- Strong proficiency in IaC (Terraform, Ansible, Pulumi) and DevOps principles.

- Deep hands-on experience orchestrating and monitoring GPU-accelerated workloads and large-scale Slurm- or Kubernetes-based environments.

- Strong Go/Python (or comparable scripting language) skills and solid Linux/Unix administration experience.

- Proven experience in ML pipeline and model deployment in heterogeneous or multi-cloud AI setups.

- Excellent teamwork, stakeholder management, and communication for cross-disciplinary project delivery.


Preferred Skills

- Familiarity with GPU-as-a-Service, job orchestration, MLflow/W&B, and advanced monitoring (OTEL, ELK, LGTM, DCGM).

- Industry certifications in major clouds (AWS/GCP/Azure).

- Experience supporting enterprise-grade business continuity, disaster recovery, and compliance in mixed cloud environments.

Key Skills

Ranked by relevance

cloud, machine learning, kubernetes, terraform, ansible, devops, mlops, ai, continuous integration, infrastructure as code, docker, aws, gcp, elk, vm
Posted: Sep 09, 2025
Type: Full-time
Level: Mid-Senior
Location: Canada

Industries

IT Services, IT Consulting

Categories

Consulting
