European Tech Recruit
Senior Researcher - LLM Systems
Switzerland · 5 days ago
Full-time · Research, Engineering

An internationally recognized technology research organization is expanding its advanced AI systems team in Zurich and is looking for a Senior Researcher in LLM Systems & Inference Architecture.


Based in one of Europe’s most dynamic technology hubs, the Zurich research centre brings together world-class scientists and engineers working on next-generation AI infrastructure. The team focuses on building the underlying systems that enable large-scale AI models to run faster, more efficiently, and at global scale across modern heterogeneous hardware platforms.


This is a rare opportunity to work at the cutting edge of LLM infrastructure, contributing to innovations that directly influence the future of generative AI systems.


Why This Role Is Interesting


Large language models are transforming how software is built and how people interact with technology. However, scaling them efficiently remains one of the most challenging problems in modern computing.


In this role, you will tackle these challenges head-on, working on inference efficiency, system architecture, and hardware–software co-design to enable faster, more scalable AI systems. You’ll collaborate with leading researchers and engineers, contribute to open research, and help shape the infrastructure powering the next generation of AI.


The Zurich research environment offers a strong academic connection, access to cutting-edge hardware platforms, and the freedom to pursue impactful research while building production-grade systems.


The Role


LLM Inference Engine Development


  • Design and implement high-performance inference engines for large language models.
  • Optimize existing open-source and internal inference frameworks.
  • Develop advanced model optimization techniques, including:
      • Quantization
      • Sparse attention
      • Key-value cache reuse and memory optimization
  • Improve throughput and latency for transformer and generative AI workloads.
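To illustrate the kind of optimization listed above, here is a minimal sketch of symmetric int8 weight quantization. It is purely illustrative and does not reflect any specific framework's or this team's implementation; all names and values are hypothetical.

```python
# Illustrative sketch: symmetric int8 quantization of a weight vector.
# Floats are mapped to integers in [-127, 127] via a single scale factor.

def quantize_int8(weights):
    """Map float weights to int8 values with one symmetric scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)       # q == [50, -127, 3, 100]
restored = dequantize(q, scale)         # close to the original weights
```

Real inference engines apply this per-channel or per-group and fold the scales into fused kernels, but the core float-to-integer mapping is the same.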


Hardware–Software Co-Design


  • Develop optimized compute kernels targeting heterogeneous accelerator platforms.
  • Enable efficient execution of LLM workloads across specialized AI hardware.
  • Profile end-to-end AI pipelines to identify performance bottlenecks across frameworks, runtime layers, and hardware.
  • Bridge Python-based ML frameworks with high-performance accelerator backends.
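The profiling work above can be sketched in miniature: time each stage of a pipeline to locate the bottleneck. The stage names and functions below are hypothetical stand-ins; real work would use a framework's tracer or a hardware profiler rather than wall-clock timing.

```python
# Toy sketch of end-to-end pipeline profiling: run each stage in order
# and record how long it takes, then report the slowest stage.
import time

def profile_pipeline(stages, inputs):
    """Run (name, fn) stages in sequence, timing each one."""
    timings = {}
    x = inputs
    for name, fn in stages:
        start = time.perf_counter()
        x = fn(x)
        timings[name] = time.perf_counter() - start
    return x, timings

# Hypothetical stages standing in for tokenization, model forward, decoding.
stages = [
    ("tokenize", lambda s: [ord(c) for c in s]),
    ("forward",  lambda ids: [i * 2 for i in ids]),
    ("decode",   lambda ids: "".join(chr(i // 2) for i in ids)),
]
out, timings = profile_pipeline(stages, "hi")
bottleneck = max(timings, key=timings.get)
```

In practice the interesting bottlenecks sit below this level (kernel launches, memory transfers, runtime overheads), which is why the role spans frameworks, runtime layers, and hardware.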


Performance Engineering & System Optimization


  • Analyze and improve large-scale AI inference pipelines.
  • Optimize system performance with a focus on:
      • Memory bandwidth utilization
      • Compute efficiency
      • Execution scheduling
  • Improve end-to-end system performance across distributed and heterogeneous computing environments.
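One standard way to reason about the memory-bandwidth-versus-compute trade-off above is a roofline-style check: an operation is memory-bound when its arithmetic intensity (FLOPs per byte moved) falls below the machine balance (peak FLOP/s divided by peak bandwidth). The accelerator numbers below are illustrative, not any particular device's specifications.

```python
# Back-of-envelope roofline check for a decode-time GEMV (y = W @ x).

def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved

def is_memory_bound(intensity, peak_flops, peak_bandwidth):
    """True when intensity is below the machine balance (FLOPs/byte)."""
    machine_balance = peak_flops / peak_bandwidth
    return intensity < machine_balance

n = 4096
flops = 2 * n * n        # one multiply-add per weight
bytes_moved = 2 * n * n  # each fp16 weight (2 bytes) read once
ai = arithmetic_intensity(flops, bytes_moved)     # 1.0 FLOP/byte

# Hypothetical accelerator: 100 TFLOP/s compute, 1 TB/s memory bandwidth.
memory_bound = is_memory_bound(ai, 100e12, 1e12)  # True
```

This is why single-stream LLM decoding is dominated by memory bandwidth, and why techniques like quantization and KV-cache optimization pay off: they reduce bytes moved rather than FLOPs.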


Research & Ecosystem Contribution


  • Publish research in leading machine learning and systems conferences (e.g., ISCA, ASPLOS, MLSys).
  • Contribute to open-source AI infrastructure and tooling ecosystems.
  • Support adoption of advanced AI infrastructure through developer tooling, documentation, and collaboration with external research partners.


What We’re Looking For


Required


  • PhD or Master’s degree in Computer Science, Computer Engineering, or a related discipline.
  • Strong programming skills in Python and C/C++ for system-level development.
  • Hands-on experience with machine learning frameworks such as PyTorch or TensorFlow.
  • Deep understanding of optimization techniques for large-scale neural networks.
  • Experience profiling and optimizing system performance in large AI workloads.


Preferred


  • Experience with heterogeneous computing or accelerator programming (GPU, NPU, or other xPU architectures).
  • Background in kernel development, compiler optimizations, or custom operator implementation.
  • Experience working with multimodal or vision-language models.
  • Publications in top-tier ML or systems conferences or contributions to open-source AI infrastructure.


Personal Profile


  • Strong systems thinker who can bridge AI models with underlying hardware platforms.
  • Research-driven mindset combined with strong engineering execution.
  • Ability to work independently while collaborating with multidisciplinary teams.
  • Clear and effective technical communication skills.


If you are passionate about building the next generation of high-performance AI systems and want to work in a world-class research environment in Zurich, we would love to hear from you.

Qualified candidates are encouraged to apply now or email [email protected].


By applying to this role you understand that we may collect your personal data and store and process it on our systems. For more information please see our Privacy Notice (https://eu-recruit.com/about-us/privacy-notice/)
