5Blue Software

Senior DevOps Engineer

Ukraine · Full-time · Mid-Senior

Hi there,

We are looking for a Senior DevOps Engineer to help design, automate, and manage large-scale AI infrastructure.

You will work closely with network engineers, IT teams, and AI/ML specialists to build secure, resilient, and scalable solutions across GPU, CPU, storage, and networking layers.

About the project:

Our client is shaping the future of Artificial Intelligence by providing high-performance, scalable infrastructure. Their mission is to drive innovation across industries by delivering reliable platforms for training and deploying AI models, including LLMs, computer vision, and generative AI.

They design and operate AI-optimized data centers with advanced GPU infrastructure, secure networking, and enterprise-level reliability, helping their customers move smoothly from AI concepts to full-scale production.

Qualifications:

  • 6 to 7+ years of experience in DevOps, Infrastructure, or Site Reliability Engineering.
  • Expertise in Kubernetes, Docker, Terraform, and infrastructure-as-code (IaC) methodologies.
  • Strong scripting and automation skills in Python, Bash, or Go.
  • Deep knowledge of high-performance networking, including InfiniBand, RoCE, and out-of-band (OOB) management.
  • Hands-on experience with firewall security (FortiGate preferred), routing policies, and network segmentation.
  • Experience in distributed computing, high-performance computing (HPC), or AI-driven environments is a strong plus.


Key Responsibilities:

  • Infrastructure Automation: Design and automate infrastructure provisioning, configuration, and management using Terraform, Ansible, or similar tools.
  • Containerized Environments: Develop and maintain high-availability AI workloads using Docker and Kubernetes.
  • GPU/CPU Management: Automate provisioning, backup, and recovery processes for GPU clusters and CPU infrastructure.
  • Security & Networking: Implement and manage network segmentation, IP access controls, and secure API authentication workflows.
  • Firewall & Routing: Collaborate with the Networking team to integrate automated firewall configurations, routing logic, and multi-ISP resilience strategies.
  • Monitoring & Optimization: Monitor system performance, set up alerts for anomalies in GPU, CPU, and storage usage, and optimize resource utilization.
  • API & Automation: Write and manage internal API calls and integrations to automate access, provisioning, and scaling operations.
  • Remote Access & OOB Management: Support the setup of out-of-band management systems and remote access controls for data center infrastructure.
  • Disaster Recovery: Contribute to disaster recovery planning, implement automated rollback testing, and ensure failover strategies are in place.
  • System Monitoring Tools: Utilize tools like IPMI and Redfish to automate system monitoring and data collection.
  • Documentation & Best Practices: Maintain clear, comprehensive documentation for workflows, procedures, and infrastructure design.


What we offer:

  • Flexible Work Environment: Opportunity to work remotely or in our safe office in Kyiv.
  • Premium Medical Insurance: Comprehensive health insurance to ensure your well-being.
  • 1:1 English Classes: Individual English language training to enhance your communication skills.
  • Great Team: Work with a supportive, collaborative, and dynamic international team.
  • Equipment Provided: All necessary equipment supplied for efficient job performance.
  • Annual Vacation: 18 days of paid vacation and 7 days of paid sick leave.
  • Commitment to Hiring Ukrainians: We are dedicated to hiring Ukrainian talent and promoting Ukraine as a fantastic place to work.
  • Flexible Payment System: Withdraw funds in one click, with about twenty withdrawal options.


Key Skills: storage, computer vision, firewall, Docker, DevOps, Bash, AI

Posted: Apr 08, 2025
Industry: Software Development
Categories: Engineering, Information Technology
