About Presight
Presight is an ADX-listed public company with Abu Dhabi based G42 as its majority shareholder and is the region’s leading big data analytics company powered by GenAI. It combines big data, analytics, and AI expertise to serve every sector, of every scale, to create business and positive societal impact. Presight excels at all-source data interpretation to support insight-driven decision-making that shapes policy and creates safer, healthier, happier, and more sustainable societies. Today, through its range of GenAI-driven products and solutions, Presight is bringing Applied AI to the private and public sector, enabling them to realize their AI strategy and ambitions faster.
Position Overview
We are seeking a meticulous Engineer - Site Reliability to support the Presight delivery model, which empowers product and technology teams to develop and deliver high-quality products, improve platform infrastructure, and strengthen the reliability of products and solutions.
You will play a key role in defining and establishing the delivery model used to develop cutting-edge, next-generation analytics solutions and services at Presight.
Key Responsibilities
- Manage the infrastructure required to run our solutions deployed to public or private cloud (air-gapped).
- Analyze service performance, identify bottlenecks, and provide measurable improvement plans.
- Maintain the environment’s health by continuously monitoring technical and business metrics, configuring alerts for potential issues, and proactively addressing risks to prevent disruptions.
- Deploy application updates with minimal disruption to services.
- Identify, evaluate, and conduct proof-of-concepts for new technologies.
- Contribute to the knowledge base.
- Review and modify CI/CD principles and service maturity iteratively, striving for continuous improvement.
Requirements
- 5+ years of experience in managing Kubernetes clusters.
- 5+ years of experience in configuring and using monitoring/observability platforms.
- Familiarity with at least one type of database.
Experience
- 5+ years in an SRE/DevOps/Sysadmin/Platform Engineer role
Skills
Mandatory skills:
- Strong background in Linux/Unix Administration
- Solid hands-on experience deploying and operating Kubernetes or OpenShift clusters
- Experience configuring and maintaining monitoring and observability solutions
- Ability to troubleshoot and resolve complex production issues efficiently, including performing root cause analysis and restoring services quickly during high-pressure incidents or critical outages
- Experience in backing up and restoring various systems
- Working together with project managers and solution architects while serving as subject matter experts
- Implementing basic network security (e.g. configuring VPCs, firewalls/security groups, etc.)
- Understand the dependencies of various GPU cards, and upgrade container images as needed in order to ensure compatibility
- Deploy and operate products provided by third party providers
- Creating releases together with the development team and deploying release packages to all required environments
Bonus Skills:
- Good understanding of typical system architecture and interaction between its components
- Experience automating tasks using infrastructure-as-code tools, e.g. Ansible, Terraform
- Thorough understanding of a company's systems, including auxiliary components like caching systems (e.g., Redis, Memcached) and message queues (e.g., RabbitMQ, Kafka)
- Good understanding of databases, e.g. Postgres, Elasticsearch, ClickHouse
- Basic scripting
- Working knowledge of OAuth 2.0, OpenID/OpenID Connect, SAML 2.0, Kerberos, LDAP
Join us at Presight, where we offer a culture of innovation, outstanding career growth opportunities, and competitive rewards. If you're eager to conquer new frontiers in AI and thrive in a dynamic environment, we welcome you to our community.
What working at Presight offers
Culture: An open, diverse and inclusive environment with a global vision that encourages personal growth and focuses on ground-breaking, industry-first innovations.
Career: Accelerate your career through high-impact projects and access to resources for continuous growth and learning opportunities.
Rewards: A competitive remuneration package with a host of perks including healthcare, education support, leave benefits and more.