ThoughtBot
Machine Learning Engineer
ThoughtBot, France
Contract, Information Technology

This profile defines the personnel requirements for a Senior MLOps Engineer position supporting a managed private cloud infrastructure with GPU-as-a-Service capabilities. The role is critical to delivering MLOps services up to the production layer for government defence sector clients requiring the highest security standards.


MANDATORY REQUIREMENTS

Security & Legal

• Security Clearance: Active Confidential Défense C3, or ability to obtain it within 6 months

• Location: Must be based in metropolitan France

• Background: Clean security background check with no foreign influence concerns

• Availability: Able to work on-site at secure facilities when required


Core Technical Competencies

• Minimum 6 years hands-on experience in cloud infrastructure and containerization

• Minimum 4 years specific experience with RedHat OpenShift/Kubernetes in production environments

• Minimum 3 years MLOps pipeline design and implementation

• Proven experience with GPU cluster management (NVIDIA A100/H100 preferred)

TECHNICAL SKILL REQUIREMENTS


Essential Technical Stack (Must Have)

Technology Area | Required Skills | Proficiency Level

Container Orchestration | Kubernetes administration, cluster management | Expert

Managed Kubernetes | RedHat OpenShift | Intermediate

Storage Solutions | Persistent storage solutions for containers, such as Dell PowerScale | Advanced

Database Management | PostgreSQL with AI extensions (pgvector), Pinecone, etc. | Intermediate

Authentication | Keycloak SSO, LDAP/AD integration | Intermediate

MLOps Frameworks and Tools | RedHat Open AI, Mistral AI, ZenML, ClearML, TensorFlow, PyTorch, DVC, MLflow, Apache Airflow, Kubeflow Pipelines, Prefect, Dagster | Advanced

DevOps Tools | GitLab CI/CD, automation, IaC | Expert

GPU Management | NVIDIA GPU scheduling, resource allocation | Advanced


Highly Desirable (Preferred)

• Experience with French government/defence sector projects

• ANSSI security framework knowledge

• Multi-cluster Kubernetes management

• Service mesh technologies (Istio, Linkerd)

• Monitoring and observability (Prometheus, Grafana)


EXPERIENCE PROFILE

Professional Background

• Total Experience: 6-10 years in cloud infrastructure and DevOps

• Leadership Experience: 2+ years leading technical teams (3-8 members)

• Project Scale: Experience managing infrastructure supporting 100+ concurrent users

• Industry Experience: Government, defence, or highly regulated industries preferred


Specific Project Experience Required

• Deployed and managed Kubernetes clusters (500+ nodes)

• Implemented enterprise-grade storage solutions for data-intensive workloads

• Built end-to-end MLOps pipelines from development to production

• Integrated authentication systems in secure, multi-tenant environments

• Managed GPU resources for AI/ML workloads at scale


COMPETENCY ASSESSMENT CRITERIA

Technical Evaluation (Weight: 60%)

1. Architecture Design - Ability to design scalable, secure MLOps architectures

2. Programming Skills - Proficiency in a major programming language such as Python or Go, with hands-on development experience

3. Implementation Skills - Hands-on experience with the required technology stack

4. Problem Solving - Troubleshooting complex distributed systems issues

5. Security Awareness - Understanding of defence-grade security requirements


Professional Qualities (Weight: 40%)

1. Communication - Ability to explain technical concepts to non-technical stakeholders

2. Project Management - Experience managing technical deliverables and timelines

3. Collaboration - Working effectively in cross-functional teams

4. Adaptability - Learning new technologies and adapting to changing requirements


EDUCATION & CERTIFICATION REQUIREMENTS

Minimum Education

• Bachelor’s degree in Computer Science, Engineering, or a related technical field

• Master's degree preferred, but not mandatory for candidates with equivalent experience


Required Certifications (at least 2 of the following)

• Certified Kubernetes Administrator (CKA) or equivalent

• Cloud platform certifications (AWS, Azure, GCP)

• GitLab Certified DevOps Professional

• NVIDIA GPU computing certifications

• Security-related certifications (CISSP, CISM, or equivalent)


ROLE RESPONSIBILITIES

Primary Duties

• Design and implement MLOps infrastructure using the specified technology stack

• Manage GPU-as-a-Service platform for client AI/ML workloads

• Ensure compliance with French government security standards (ANSSI)

• Collaborate with solution architects on client requirement analysis

• Participate in the 24/7 support rotation for critical infrastructure


SUCCESS PROFILE

Ideal Candidate Characteristics

• Technical Depth: Strong foundation in distributed systems and cloud architecture

• Security Mindset: Understands and embraces a security-first approach

• Client Focus: Experience working directly with government or enterprise clients

• Cultural Fit: Aligns with French business culture and government sector expectations

• Growth Potential: Demonstrates ability to learn and adapt to emerging technologies
