Project: Accelerate ML platform alignment, infrastructure automation, and deployment repeatability. The goal is to support large-scale AI customer engagements in regulated environments by improving model release velocity, architectural consistency, and the operational stability of AI/ML systems.
What You’ll Do
- Design, build, and scale MLOps infrastructure and internal platforms using Python 3 to support production-grade AI services.
- Architect and optimize automated ML workflows, managing dependencies and environments through the Poetry build system.
- Engineer robust CI/CD/CT (Continuous Training) pipelines specifically for AI/ML using CloudBees/Jenkins.
- Operationalize and monitor LLM lifecycles using frameworks like LiteLLM and LangFuse to ensure model reliability and performance tracking.
- Deploy and manage AI Agent architectures (CrewAI, OpenAI Agents) as scalable system components, ensuring they meet enterprise reliability standards.
- Lead Configuration Management across hybrid environments to ensure strict system consistency and drift detection.
- Implement MLOps best practices, including model versioning, automated testing, and observability (logging/metrics/tracing) for AI systems.
- Collaborate with Data Scientists and DevOps to bridge the gap between model prototyping and production-scale deployment.
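To give a concrete sense of the Poetry-based environment management mentioned above, here is a minimal `pyproject.toml` sketch. The project name, versions, and dependency choices are illustrative assumptions, not the actual project's manifest:

```toml
[tool.poetry]
name = "ml-platform"                 # hypothetical project name
version = "0.1.0"
description = "Internal MLOps platform (illustrative only)"

[tool.poetry.dependencies]
python = "^3.11"
litellm = "^1.0"                     # LLM gateway, as referenced in the posting
langfuse = "^2.0"                    # LLM observability
crewai = "^0.1"                      # agent framework

[tool.poetry.group.dev.dependencies]
pytest = "^8.0"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
```

Running `poetry lock` pins every transitive dependency in `poetry.lock`, which is what makes the same environment reproducible across developer machines, CI, and production.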
Required Experience
- 4+ years of professional experience in Python 3 with a focus on backend systems or infrastructure.
- Proven experience in MLOps or DevOps supporting AI/ML production environments.
- Hands-on expertise with Poetry for reproducible environment management.
- Deep experience building CI/CD pipelines for ML (CloudBees, Jenkins, or GitHub Actions).
Minimum Qualifications
- Bachelor’s degree in Computer Science, Systems Engineering, or equivalent practical experience.
- 2-5 years of experience in Software Engineering, DevOps, or MLOps roles.
- Demonstrated ability to automate complex technical workflows.
- Experience working in agile teams with a "systems thinking" approach to AI challenges.
Nice to Have
- Cloud Infrastructure expertise (AWS, GCP, or Azure) focused on ML services (e.g., SageMaker, Vertex AI).
- Container Orchestration: strong familiarity with Docker and Kubernetes (EKS/GKE) for scaling ML workloads.
- Observability for ML: experience with Prometheus, Grafana, or specialized ML monitoring tools.
- Experience in regulated or FedRAMP environments, ensuring security and compliance in AI deployments.
- Knowledge of Vector Databases (Chroma, Pinecone, Weaviate) and RAG (Retrieval-Augmented Generation) infrastructure.
- Experience with LLM Operations (LLMOps) tools such as LiteLLM, LangFuse, or Arize Phoenix.
- Direct experience deploying or managing AI agent frameworks (CrewAI, OpenAI Agents, or similar) at scale.
- Solid mastery of Configuration Management (Ansible, Terraform, or similar) and Infrastructure as Code (IaC).
- Strong English communication skills for technical collaboration in global environments.
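As a small illustration of the logging side of the observability work described above, the sketch below emits structured JSON log lines for inference events using only the Python standard library. The field names (`model_version`, `latency_ms`, `status`) are illustrative assumptions, not a schema prescribed by the role:

```python
import json
import logging
import time

# Minimal sketch of structured logging for an ML inference service.
# Field names below are illustrative assumptions, not a required schema.

def prediction_record(model_version: str, latency_ms: float, status: str) -> str:
    """Serialize one inference event as a single JSON log line."""
    return json.dumps(
        {
            "ts": time.time(),            # event timestamp (epoch seconds)
            "model_version": model_version,
            "latency_ms": round(latency_ms, 2),
            "status": status,
        },
        sort_keys=True,
    )

logger = logging.getLogger("inference")
logging.basicConfig(level=logging.INFO, format="%(message)s")

# Log one example inference event as a machine-parseable line.
logger.info(prediction_record("v1.3.0", 12.5, "ok"))
```

Because each line is valid JSON, a collector such as Prometheus exporters or a log pipeline can parse these events without custom regexes, which is one common way to back the metrics and tracing practices the posting asks for.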
Join Concentrix and take your career to the next level!