Neurons Lab

Principal ML Engineer (Infra/hardware)

Finland · Full-time · Mid-Senior

About The Project

We're looking for an experienced ML Infrastructure Engineer who has successfully delivered large-scale ML infrastructure optimization projects. The primary focus is migrating computer vision models from NVIDIA GPU-based infrastructure to AWS Inferentia/Trainium and optimizing them there, with the goal of improving performance and reducing cost.

Current Infrastructure:

  • ML Models: RetinaFace, OpenPose, CLIP, and other CV models
  • Hardware: A10/T4 GPUs on EKS
  • Serving: Triton Inference Server
  • Orchestration: Mix of Kubernetes and Ray

Stage: Presale and Delivery

Duration: 2 months (preliminary)

Capacity: part-time (20h/week)

Areas of Responsibility

  • Technical Leadership:
    • Lead the architecture design for ML infrastructure modernization
    • Define compilation and optimization strategies for model migration
    • Establish a performance benchmarking framework
    • Set up monitoring and alerting for the new infrastructure
  • Performance Optimization:
    • Implement efficient model compilation pipelines for Inferentia2
    • Optimize batch processing and memory layouts
    • Fine-tune model serving configurations
    • Ensure latency requirements are met across all services
  • Cost Optimization:
    • Analyze and optimize infrastructure costs
    • Implement efficient resource allocation strategies
    • Set up cost monitoring and reporting
    • Achieve target cost reduction while maintaining performance
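
As an illustration of the benchmarking and cost-tracking responsibilities above, here is a minimal sketch of how latency, throughput, and cost per million inferences could be compared between a GPU baseline and an Inferentia2 deployment. The function names `benchmark` and `cost_per_million` are hypothetical, the `infer` callable stands in for whatever client actually calls the model server, and any hourly price passed in is a placeholder rather than a quoted AWS rate.

```python
import statistics
import time
from typing import Callable, Dict, Sequence


def benchmark(infer: Callable[[Sequence], object], batch: Sequence,
              warmup: int = 5, iters: int = 50) -> Dict[str, float]:
    """Time repeated calls to `infer` and summarize latency and throughput."""
    for _ in range(warmup):  # warm-up calls are excluded from the timings
        infer(batch)

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        infer(batch)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
    mean_s = statistics.fmean(latencies_ms) / 1000.0
    return {
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
        "throughput_per_s": len(batch) / mean_s,  # items served per second
    }


def cost_per_million(hourly_price_usd: float, throughput_per_s: float) -> float:
    """Instance cost in USD to serve one million inferences at this throughput."""
    return hourly_price_usd / (throughput_per_s * 3600.0) * 1_000_000


if __name__ == "__main__":
    # Stand-in workload; a real run would call the deployed model endpoint.
    stats = benchmark(lambda b: sum(x * x for x in b), list(range(1000)))
    print(stats)
    # The hourly price below is a placeholder, not a real instance price.
    print(cost_per_million(1.0, stats["throughput_per_s"]))
```

Running the same harness against both deployments with identical batches gives a like-for-like view of whether the cost target is met without violating the latency requirement.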

Skills

  • Proven track record of ML infrastructure optimization projects
  • Hands-on experience with AWS Neuron SDK and Inferentia/Trainium deployment
  • Deep expertise in PyTorch model optimization and compilation
  • Experience with high-throughput computer vision model serving
  • Production experience with both Kubernetes and Ray for ML workloads

Knowledge

  • Model Optimization Expertise:
    • Deep understanding of ML model architecture optimization
    • Experience with model compilation techniques for specialized hardware (Inferentia/Trainium)
    • Proficiency in optimizing computer vision models (CNN architectures)
    • Knowledge of model serving optimization patterns
  • Performance Optimization:
    • Advanced understanding of ML model inference optimization
    • Expertise in batch processing strategies
    • Memory layout optimization for vision models
    • Experience with pipeline parallelism implementation
    • Proficiency in latency/throughput optimization techniques
  • Hardware Acceleration:
    • Deep knowledge of ML accelerator architectures
    • Understanding of hardware-specific optimizations
    • Experience with model compilation for specialized chips
    • Proficiency in memory access pattern optimization
  • Performance Analysis:
    • Proficiency in ML model profiling tools
    • Experience with performance bottleneck identification
    • Knowledge of performance monitoring techniques
    • Ability to analyze and optimize inference patterns

Nice to Have:

  • Experience with Ray architecture for ML serving
  • Knowledge of distributed ML systems
  • Understanding of ML pipeline optimization
  • Experience with model quantization techniques

Experience

  • Model Optimization (4+ years):
    • Proven track record of optimizing large-scale ML inference systems
    • Successfully implemented hardware-specific model optimizations
    • Demonstrated experience with computer vision model optimization
    • Led projects achieving significant performance improvements
  • Proven Results (Examples):
    • Successfully optimized computer vision models similar to RetinaFace/CLIP
    • Achieved significant cost reduction while maintaining performance
    • Implemented efficient batch processing strategies
    • Developed performance monitoring and optimization frameworks

Key Skills

computer vision, Kubernetes, AWS, PyTorch
Posted: Jan 23, 2025

Industries

IT Services IT Consulting

Categories

Engineering Information Technology
