Multiple positions for AI Engineers - Building the Future of Secure AI Infrastructure
A stealth-stage AI infrastructure company is developing technology that enables enterprises to deploy and run large language models securely with their own confidential data.
Its Kubernetes-based AI platform provides ready-to-use endpoints for retrieval-augmented generation (RAG), intelligent agents, and model serving, all through standard APIs. This allows organizations to leverage cutting-edge AI while maintaining complete control over privacy, compliance, and data security.
Applied Research Scientist / Engineer (Agents & Retrieval)
You’ll collaborate directly with the Head of AI to develop advanced retrieval-augmented generation systems and intelligent agent architectures. The ideal candidate combines research depth with engineering rigor, building production-ready AI systems inspired by the latest research.
Responsibilities:
- Architect and optimize RAG pipelines, including document processing, embedding models, and retrieval strategies
- Build agentic systems with reasoning, planning, and tool orchestration capabilities
- Design evaluation frameworks for retrieval, generation quality, and agentic task completion
- Track emerging research and incorporate relevant innovations into production systems
- Collaborate with users and internal teams to translate real-world needs into scalable AI solutions
Requirements:
- Have several years of experience in information retrieval, NLP, or applied ML
- Understand RAG systems end-to-end
- Value rigorous quantitative evaluation
- Balance research curiosity with real-world engineering pragmatism
- Write clean, reproducible, and well-tested code
Nice to have:
- Publications in top ML/NLP venues (e.g., NeurIPS, ICLR, EMNLP, ACL)
- Experience fine-tuning encoders or generating synthetic evaluation data
- Contributions to open-source projects
AI / ML Engineer (Inference)
As the first dedicated Inference Engineer, you’ll own the model serving layer of our AI platform, ensuring that LLMs and other models run efficiently, reliably, and securely. You’ll optimize inference performance, extend support to new model architectures, and build robust deployment pipelines.
Responsibilities:
- Design and optimize inference infrastructure using vLLM, SGLang, and custom serving engines
- Improve latency, throughput, and GPU utilization across diverse hardware environments
- Expand model coverage to include diffusion, ASR/TTS, and multimodal architectures
- Develop robust deployment and monitoring pipelines on Kubernetes
- Collaborate on GPU resource management, autoscaling, and isolation in secure environments
Requirements:
- Have 5+ years of experience in ML systems, inference, or distributed engineering
- Are fluent in PyTorch, CUDA, and GPU/HPC optimization (NCCL, NVLink, InfiniBand, etc.)
- Enjoy deep debugging and performance tuning
- Understand transformer architectures and inference optimization
- Thrive in small, high-autonomy teams
Nice to have:
- Experience with model compression (e.g., GPTQ, AWQ)
- Open-source contributions
- Experience with Kubernetes or multimodal models
✉ Email: [email protected]
☎ Contact Number: +44(0)1915949744 (1587) / +358 753 266586
- Posted: Oct 20, 2025
- Type: Full-time
- Level: Director
- Location: Helsinki
- Company: Tenth Revolution Group