Company Overview:
Open Innovation AI is a global technology company that specializes in developing advanced
solutions for managing AI workloads. Its flagship product, the Open Innovation Cluster
Manager (OICM), orchestrates complex AI tasks efficiently across diverse infrastructures. The
platform is hardware-agnostic, optimized for various GPUs and accelerator hardware, and
facilitates seamless integration and scalability for enterprise AI applications. Open Innovation
AI focuses on optimizing and simplifying AI workload management and making AI
technologies accessible to organizations of all sizes. With its innovative solutions, companies
can reduce operational costs, accelerate time to value, and maximize their return on investment,
ensuring that their AI strategies contribute directly to enhanced business outcomes.
Role Overview:
The Storage Support Engineer – L2 is responsible for supporting, monitoring, and maintaining the storage infrastructure across large on-prem deployments. This includes high-performance parallel filesystems used in HPC environments, distributed storage platforms such as Ceph, object storage and S3-compatible systems, NFS and file-sharing services, block storage, vSAN, and other storage components that support compute, Kubernetes, and GPU/AI workloads.
The Storage Support Engineer works collaboratively with the Service Desk and Systems Support and Engineering teams to support incident handling, change activities, and overall operational stability of the storage environment. The role focuses on ensuring data integrity, storage availability, and consistent performance through proactive monitoring, structured troubleshooting, and execution of approved operational tasks.
Roles & Responsibilities
- Monitor storage platforms—including HPC parallel filesystems, Ceph clusters, object/S3 systems, NFS services, block storage, and vSAN—for health, performance, capacity, and fault conditions.
- Respond to storage-related incidents, perform diagnostics, and troubleshoot issues across filesystem layers, storage nodes, clusters, disks, metadata services, and client paths.
- Conduct routine health checks on storage components such as nodes, OSDs, disks, controllers, pools, mounts, and service endpoints to ensure stable and reliable operation.
- Execute approved configuration changes, capacity expansions, firmware updates, and maintenance tasks following established Change Management procedures.
- Work closely with the Service Desk for accurate triage, timely ticket updates, and structured communication during incident response.
- Collaborate with Systems Support and Engineering for cross-domain investigations or deeper technical analysis when required.
- Collect and interpret logs, metrics, telemetry, and diagnostic outputs from storage platforms to support incident resolution and root-cause investigations.
- Support backup, replication, data protection, and recovery workflows as applicable to the deployed storage technologies.
- Maintain updated documentation including storage diagrams, configuration baselines, operational runbooks, and tuning parameters.
- Participate in validation of upgrades, hardware replacements, cluster expansions, and storage system reconfigurations in production or staging environments.
- Identify recurring storage issues and contribute to Problem Management by providing technical insights and recommending corrective or preventive actions.
- Follow approved access controls, export rules, S3 policies, quotas, and storage security standards to maintain secure and compliant operations.
- Participate in on-call rotations as required to support 24/7 operational continuity for storage services.
- Contribute to continuous improvement by enhancing monitoring, refining SOPs, improving documentation, and suggesting operational optimizations.
Required Qualifications, Experience, Competencies, and Certifications:
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field.
- 3–6 years of experience in storage operations, storage support, HPC storage environments, or L2 infrastructure roles.
- Experience with high-performance parallel filesystems, distributed storage platforms, object/S3 storage systems, NFS services, block storage, and vSAN.
- Strong understanding of storage performance concepts including throughput, latency, IOPS, and metadata operations.
- Hands-on experience diagnosing cluster health issues, disk or node failures, replication or rebalancing problems, and client access issues.
- Familiarity with networking fundamentals relevant to storage (MTU, bonding/LACP, multi-pathing, IB/RoCE considerations).
- Experience working within ITIL-aligned operational processes, including Incident, Change, and Problem Management.
- Ability to analyze logs, cluster metrics, health outputs, and telemetry for troubleshooting and incident resolution.
- Strong documentation skills, including the ability to maintain runbooks, SOPs, baselines, and configuration records.
- Relevant certifications, such as RHCSA/RHCE, Ceph certifications, VMware VCP, or vendor storage training, are preferred.
- Excellent communication skills and ability to collaborate with Service Desk, Systems Support, and Engineering teams.
Reporting To: Manager – Technical Operations

