MLOps Engineer (AI Platform)

Omilia · Spain · 3 days ago

Full-time · Other

Are you ready to move beyond maintaining legacy systems and build something truly new? What if your next role gave you the keys to architect an entire AI platform from the ground up, powering systems that serve millions of users? We're looking for a foundational MLOps leader to design, build, and own the infrastructure that will define the future of our AI capabilities.

This is a hands-on technical leadership role for an engineer who wants to make a career-defining impact. You won't just be joining a team; you'll be setting the standard for how we build, deploy, and operate machine learning at scale.

Your Mission: Architecting the Future of AI Operations

As our first dedicated MLOps Architect, you will have the autonomy and resources to build our ML platform from scratch. You will be responsible for the entire lifecycle of our production AI systems, ensuring they are reliable, secure, and automated. You will:

Architect and build the automated, scalable infrastructure that powers our entire suite of AI models—from Agentic AI and NLU to Voice Biometrics and ASR—ensuring they operate flawlessly and securely for millions of users
Make key technical decisions, establishing the patterns, tools, and best practices that will guide our machine learning operations for years to come
Collaborate closely with world-class researchers, data scientists, ML engineers, and cloud architects to translate cutting-edge research into robust, production-grade products
Champion a culture of automation, governance, and performance across all our AI/ML initiatives

A Glimpse into Your Impact: Key Challenges You'll Own

Infrastructure as Code (IaC) Foundation: You will design and implement our entire MLOps infrastructure on AWS from the ground up using Terraform, establishing best practices for security, scalability, and cost-efficiency
CI/CD for Machine Learning: You will build and own the end-to-end CI/CD pipelines using GitLab and Jenkins, automating everything from model training and validation to canary deployments and production rollbacks
Containerization & Orchestration at Scale: You will lead the productization of our complex ML models, containerizing them with Docker and deploying them on a robust Kubernetes platform that you will help architect, build, and manage with Helm
Proactive Observability: You will establish a culture of deep system insight by implementing and managing a comprehensive observability stack (e.g., Prometheus and Grafana), ensuring our models meet stringent performance, reliability, and security SLAs

What You'll Bring To The Team

We are looking for an experienced engineer with a builder's mindset and a passion for creating elegant, scalable systems. You have a proven track record of operating critical infrastructure at scale and thrive in an environment where you can take ownership and drive technical strategy.

Requirements

5+ years in a Senior DevOps, SRE, or MLOps role with a focus on production systems
Deep expertise in architecting and managing Kubernetes clusters in a production environment
Proven mastery of at least one major IaC tool (Terraform is strongly preferred)
Strong proficiency in a scripting or systems programming language (e.g., Python, Go)
A track record of building and maintaining CI/CD pipelines for critical production services
Direct experience deploying and managing specific ML models (e.g., Agentic AI, NLU, ASR, TTS)
Experience with dedicated ML workflow orchestration tools (e.g., Kubeflow, Apache Airflow)
Familiarity with ML experiment tracking and model registry tools (e.g., MLflow, SageMaker Model Registry)
Experience deploying models on specialized hardware (e.g., GPUs, Inferentia, Trainium)

Benefits

Fixed compensation;
Long-term employment with vacation counted in working days;
Professional development (courses, training, etc.);
Being part of successful cutting-edge technology products that are making a global impact in the service industry;
Proficient and fun-to-work-with colleagues;
Apple gear.

Omilia is proud to be an equal opportunity employer and is dedicated to fostering a diverse and inclusive workplace. We believe that embracing diversity in all its forms enriches our workplace and drives our collective success. We are committed to creating an environment where everyone feels welcomed, valued, and empowered to contribute their unique perspectives. All eligible candidates will be given consideration for employment without regard to factors such as race, color, religion, gender, gender identity or expression, sexual orientation, national origin, heredity, disability, age, or veteran status.
