HPC Infrastructure Engineering

KBC Technologies Group | Sweden | 5 days ago

Contract | Analyst

Role Description and Key Deliverables

We are seeking a passionate HPC engineer. The ideal candidate will have extensive hands-on experience making an impact with HPC technology and delivering high-quality HPC services, and will be able to relate to the scientific community and work closely with users to help them make the best use of research computing services.

The HPC landscape is continually evolving. You will need the skills to help build and operate industry-leading capabilities, including application build frameworks, containerised applications and cloud software-as-a-service. Automated deployment is a key feature, and you will need to be comfortable with DevOps processes and delivering consistency through automation and infrastructure-as-code.

Key Responsibilities

Design, implement, and maintain robust platform infrastructure using Infrastructure as Code (IaC) tools such as Terraform, ensuring secure and scalable environments in our private cloud ecosystem.
Develop, deliver and operate research computing services and applications.
Take a Site Reliability Engineering approach to HPC services, managing development, deployment, monitoring and incident response end-to-end.
Solve complex technical problems, both in SCP services and in how users work with them.

Essential Knowledge, Skills, and Experience

10+ years of hands-on experience operating, crafting or engineering large-scale computing environments, such as HPC, HTC or BC
Ability to drive innovative computational solutions and exploit emerging technologies
Experience administering large-scale cluster and server computing and related software (e.g. Slurm, LSF, Grid Engine)
Hands-on experience working in a DevOps team and using agile methodologies
Operating and consuming virtualized private cloud resources (e.g. OpenStack)
Understanding of Linux system administration, the TCP/IP stack, and storage subsystems
Experience in implementing and administering large-scale parallel filesystems (e.g. Weka, GPFS, Lustre)
Proven experience using configuration management (e.g. Ansible, Salt, Puppet) and technology frameworks in IT operations
Experience of developing and managing relationships with 3rd party suppliers
Scripting and tool development for HPC and DevOps-style platform operations using Bash and Python

Desirable Skills and Knowledge

Scientific degree, and/or experience in computationally intensive analysis of scientific data
Previous experience in high performance computing (HPC) environments, especially at large scales (>10,000 cores)
Operation and configuration of public cloud computing infrastructure (e.g. AWS, Azure, GCP) is a plus
Managing a virtualized private cloud environment (e.g. OpenStack) is a plus
Container technology (e.g. LXD, Singularity, Docker, Kubernetes) is a plus
Demonstrated development experience with a variety of programming languages, tools, and technologies (Java/C++, Python/Ruby/Perl, SQL, AWS) is a plus
Experience with HashiCorp tools such as Terraform, Vault, Consul and Nomad is a plus
Working experience with high-speed networks (e.g. InfiniBand)
