Position Overview
Seeking a meticulous Engineer - Site Reliability who will support the Presight delivery model that empowers product & technology teams to develop & deliver high-quality products, improve platform infrastructure and strengthen the reliability of products and solutions.
You will play a key role in defining and establishing the delivery model used to develop cutting-edge, next-gen analytics solutions and services at Presight.
Key Responsibilities
- Manage the infrastructure required to run our solutions deployed to public or private cloud (air-gapped).
- Analyze service performance, identify bottlenecks, and provide measurable improvement plans.
- Maintain the environment’s health by continuously monitoring technical and business metrics, configuring alerts for potential issues, and proactively addressing risks to prevent disruptions.
- Deploy application updates with minimal disruption to services.
- Identify, evaluate, and conduct proof-of-concepts for new technologies.
- Contribute to the knowledge base.
- Review and modify CI/CD principles and service maturity iteratively, striving for continuous improvement.
Requirements
- 5+ years of experience in managing Kubernetes clusters.
- 5+ years of experience in configuring and using monitoring/observability platforms.
- Familiarity with at least one type of database.
Experience
- 5+ years in an SRE/DevOps/Sysadmin/Platform Engineer role
Mandatory skills:
- Strong background in Linux/Unix Administration
- Solid hands-on experience deploying and operating Kubernetes or OpenShift clusters
- Experience configuring and maintaining monitoring and observability solutions
- Ability to troubleshoot and resolve complex production issues efficiently, including performing root cause analysis and restoring services quickly during high-pressure incidents or critical outages
- Experience in backing up and restoring various systems
- Working together with project managers and solution architects while serving as a subject matter expert
- Implementing basic network security (e.g. configuring VPCs, firewalls/security groups, etc.)
- Understand the dependencies of various GPU cards, and upgrade container images as needed in order to ensure compatibility
- Deploy and operate products provided by third-party providers
- Creating releases together with the development team and deploying release packages to all required environments
Bonus Skills:
- Good understanding of typical system architecture and interaction between its components
- Experience automating tasks using infrastructure-as-code tools, e.g. Ansible, Terraform
- Thorough understanding of a company's systems, including auxiliary components like caching systems (e.g., Redis, Memcached) and message queues (e.g., RabbitMQ, Kafka)
- Good understanding of databases, e.g. PostgreSQL, Elasticsearch, ClickHouse
- Basic scripting
- Working knowledge of OAuth 2.0, OpenID/OpenID Connect, SAML 2.0, Kerberos, LDAP
Join us at Presight, where we offer a culture of innovation, outstanding career growth opportunities, and competitive rewards. If you're eager to conquer new frontiers in AI and thrive in a dynamic environment, we welcome you to our community.

