Principal ML Engineer (Infra/hardware)

Neurons Lab

Finland · Full-time · Mid-Senior

About The Project

We're looking for an experienced ML Infrastructure Engineer with a track record of delivering large-scale ML infrastructure optimization projects. The primary focus is migrating and optimizing computer vision models from NVIDIA GPU-based infrastructure to AWS Inferentia/Trainium, improving performance while reducing cost.

Current Infrastructure:

ML Models: RetinaFace, OpenPose, CLIP, and other CV models
Hardware: A10/T4 GPUs on EKS
Serving: Triton Inference Server
Orchestration: Mix of Kubernetes and Ray

Stage: Presale and Delivery

Duration: 2 months (preliminary)

Capacity: part-time (20h/week)

Areas of Responsibility

Technical Leadership:

Lead the architecture design for ML infrastructure modernization
Define compilation and optimization strategies for model migration
Establish a performance benchmarking framework
Set up monitoring and alerting for the new infrastructure

Performance Optimization:

Implement efficient model compilation pipelines for Inferentia2
Optimize batch processing and memory layouts
Fine-tune model serving configurations
Ensure latency requirements are met across all services

Cost Optimization:

Analyze and optimize infrastructure costs
Implement efficient resource allocation strategies
Set up cost monitoring and reporting
Achieve target cost reduction while maintaining performance

Skills

Proven track record of ML infrastructure optimization projects
Hands-on experience with AWS Neuron SDK and Inferentia/Trainium deployment
Deep expertise in PyTorch model optimization and compilation
Experience with high-throughput computer vision model serving
Production experience with both Kubernetes and Ray for ML workloads

Knowledge

Model Optimization Expertise:

Deep understanding of ML model architecture optimization
Experience with model compilation techniques for specialized hardware (Inferentia/Trainium)
Proficiency in optimizing computer vision models (CNN architectures)
Knowledge of model serving optimization patterns

Performance Optimization:

Advanced understanding of ML model inference optimization
Expertise in batch processing strategies
Memory layout optimization for vision models
Experience with pipeline parallelism implementation
Proficiency in latency/throughput optimization techniques

Hardware Acceleration:

Deep knowledge of ML accelerator architectures
Understanding of hardware-specific optimizations
Experience with model compilation for specialized chips
Proficiency in memory access pattern optimization

Performance Analysis:

Proficiency in ML model profiling tools
Experience with performance bottleneck identification
Knowledge of performance monitoring techniques
Ability to analyze and optimize inference patterns

Nice to Have:

Experience with Ray architecture for ML serving
Knowledge of distributed ML systems
Understanding of ML pipeline optimization
Experience with model quantization techniques

Experience

Model Optimization (4+ years):

Proven track record of optimizing large-scale ML inference systems
Successfully implemented hardware-specific model optimizations
Demonstrated experience with computer vision model optimization
Led projects achieving significant performance improvements

Proven Results (Examples):

Successfully optimized computer vision models similar to RetinaFace/CLIP
Achieved significant cost reduction while maintaining performance
Implemented efficient batch processing strategies
Developed performance monitoring and optimization frameworks

Key Skills

Ranked by relevance

Computer vision, Kubernetes, AWS, PyTorch

Posted: Jan 23, 2025
Type: Full-time
Level: Mid-Senior
Location: Finland
Company: Neurons Lab

Industries

IT Services, IT Consulting
