Shory
Machine Learning Engineer
Shory · United Arab Emirates · 1 day ago
Full-time · Information Technology

Shory is the soft revolution in the Insurtech market. Welcome to a new age where insurance actually empowers its customers. We use technology to serve our customers and create peace of mind and trust around insurance needs. With Shory, a new era has begun.


We are seeking a seasoned Machine Learning Engineer to join our AI team in Abu Dhabi, UAE, and help design and build the technical backbone of our intelligent products. You will develop scalable, cloud-native systems supporting machine learning workflows. The ideal candidate has a strong foundation in software engineering, with experience in one or more of the following: GenAI integration (e.g., OpenAI, AWS Bedrock) or LLMOps. You will work alongside specialists in ML and data to deliver robust, production-grade AI capabilities.


Responsibilities


  • Architect and maintain cloud-native ML/LLM pipelines for training, evaluation, deployment, model registry, and continuous monitoring.
  • Build automated CI/CD workflows for ML and LLM systems, including prompt pipelines, model updates, container builds, and infrastructure deployments.
  • Design and deploy scalable ML and GenAI services using containerized and serverless compute (e.g., Cloud Run, GKE, Kubernetes, Functions).
  • Productionize LLMs through the full lifecycle: fine-tuning, distillation, evaluation, inference optimization, monitoring, and governance.
  • Collaborate with Data Engineering to develop feature stores, data pipelines, RAG pipelines, and vector databases for LLM-powered applications.
  • Implement observability frameworks for LLMs, including model and data drift detection, hallucination detection, latency and cost monitoring, and prompt performance and quality metrics.

  • Integrate and evaluate open-source tools and frameworks (MLflow, Ray, LangChain, KServe, Kubeflow, Weights & Biases).
  • Partner with Data Scientists to convert prototypes into reliable, fault-tolerant, enterprise-grade AI services.
  • Implement cloud-level security standards including IAM, secrets management, data encryption, and protected inference pathways.
  • Ensure LLM systems comply with internal AI governance, ethical AI, privacy, and compliance requirements.
  • Maintain transparent documentation, including model cards, audit logs, and deployment traceability.
  • Act as a bridge between experimentation and production, ensuring models and LLM workflows become scalable, observable, and maintainable services.
  • Mentor junior engineers and contribute to cloud and AI engineering standards across the organization.
  • Create detailed architecture diagrams, design documents, runbooks, and troubleshooting guides.


Requirements:


  • Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field.
  • 5+ years in ML engineering, with strong exposure to LLMOps and ML systems architecture.
  • 2+ years of cloud experience, preferably GCP (Cloud Run, GKE, Vertex AI, BigQuery, Cloud Functions).
  • Deep understanding of DevOps/MLOps practices, CI/CD, and infrastructure automation.
  • Proficiency with ML/LLM platforms such as MLflow, Vertex AI, Kubeflow, BentoML, Ray, or similar.
  • Hands-on experience fine-tuning, deploying, and operating large language models in production.
  • Strong skills with orchestration systems (Airflow, Argo), IaC tools (Terraform, Ansible), and Kubernetes.
  • Expert knowledge in Python, PyTorch/TensorFlow, and LLM frameworks (HuggingFace Transformers, vLLM).
  • Solid understanding of distributed computing, scalable inference, and model and prompt versioning & reproducibility.
  • Strong API and microservice design skills.
  • Excellent analytical, problem-solving, and communication skills.
