Humai
Senior Machine Learning Engineer
United Arab Emirates · 2 days ago
Full-time · Remote Friendly · Engineering, Information Technology

Senior Machine Learning Engineer - AI Systems


About Humai


At Humai, we're pioneering the future of AI-powered products that seamlessly integrate into people's daily lives. We believe in building intelligent systems that not only push the boundaries of what's technically possible but also create meaningful value for our users. Our team combines deep technical expertise with a product-first mindset, focusing on shipping features that users love while maintaining the highest standards of AI safety and reliability.


We're a fast-moving, engineering-driven company that values innovation, ownership, and impact. Our culture emphasizes rapid iteration, data-driven decision making, and collaborative problem-solving.


About the Role


We're looking for a Senior Machine Learning Engineer to lead end-to-end model engineering for production AI features. You'll design and optimize LLM systems, leverage cutting-edge frameworks like DSPy and LangGraph, and drive continuous improvements in latency, quality, and cost while shipping real products that users love.

This is a high-impact role where you'll work directly with our founding team to shape our AI architecture and establish best practices that will scale with our growth. You'll have the autonomy to make critical technical decisions and the resources to implement them at scale.


What You'll Build


LLM Systems & Architecture

  • Design and implement RAG systems, autonomous agents, and intelligent planners
  • Build robust tool-use capabilities with proper grounding and guardrails
  • Create comprehensive evaluation frameworks for model performance
  • Develop multi-modal capabilities (vision-language models, document processing)


Advanced Frameworks & Optimization

  • Leverage DSPy for declarative prompting with auto-tuning capabilities
  • Build complex stateful workflows using LangGraph for multi-step agent behaviors
  • Implement fine-tuning strategies (LoRA/QLoRA), model distillation, and quantization (AWQ/GPTQ)
  • Optimize inference performance through intelligent batching, caching, and streaming
  • Apply speculative decoding, continuous batching, and KV-cache optimization techniques


Production ML Engineering

  • Deploy and maintain reliable inference services with comprehensive monitoring
  • Implement quality controls, drift detection, and hallucination safeguards
  • Build traditional ML components (classifiers, regressors, rankers) for routing and scoring
  • Design robust evaluation pipelines with metrics like EM/F1/ROUGE/BLEU and human evaluation
  • Establish MLOps best practices including blue-green deployments, canary releases, and A/B testing infrastructure


Data & Experimentation

  • Curate high-quality training and evaluation datasets
  • Design and execute A/B tests to measure real-world impact
  • Close feedback loops using production telemetry and user interaction data
  • Implement synthetic data generation strategies for training
  • Ensure GDPR/CCPA compliance and data lineage tracking


Advanced RAG & Knowledge Systems

  • Build graph RAG and knowledge graph integration systems
  • Implement hierarchical retrieval and re-ranking systems
  • Optimize dynamic chunking and context window strategies
  • Develop multi-hop reasoning and chain-of-thought retrieval capabilities


Production AI Systems at Scale

  • Design prompt injection defense and adversarial robustness measures
  • Architect multi-tenancy solutions for SaaS AI products
  • Implement real-time streaming and WebSocket interfaces for chat systems
  • Drive cost optimization through model routing, caching strategies, and request batching


What We're Looking For


Required Experience

  • Experience shipping AI products end-to-end with demonstrable user impact
  • Deep hands-on experience with LLMs (OpenAI/Anthropic/open-source), DSPy, and LangGraph
  • Strong computer science fundamentals: algorithms, data structures, and mathematical optimization
  • Proven expertise in fine-tuning (LoRA/QLoRA), model distillation, and inference optimization
  • Production-grade Python and PyTorch skills with emphasis on code quality and testing
  • Experience building evaluation frameworks and making informed quality/latency/cost trade-offs
  • Practical knowledge of AI safety, privacy, and compliance requirements


Bonus Points

  • Deep experience with Graph RAG systems: Neo4j, knowledge graph construction, entity resolution, graph embeddings, and GraphML frameworks
  • Advanced algorithm implementation: approximate algorithms (LSH, MinHash), streaming algorithms (Count-Min Sketch, HyperLogLog), and probabilistic data structures
  • Experience with cross-modal matching algorithms for vision-language alignment and multi-modal retrieval
  • Experience with retrieval systems: embedding models, chunking strategies, hybrid search, vector databases (pgvector, Pinecone, Qdrant, Weaviate)
  • Knowledge of agent frameworks, function calling, and multi-actor graph systems
  • MLOps experience with observability tools (W&B/MLflow), model registries, and feature stores
  • High-performance serving expertise (Triton, vLLM, Text Generation Inference, Ray Serve)
  • Hardware optimization knowledge (A100/H100 GPUs, cost-efficient CPU inference, Apple MLX for on-device inference)
  • Cloud platform experience (GCP/Azure): serverless ML, container orchestration (GKE/AKS), auto-scaling endpoints
  • Experience with disaster recovery, fallback strategies, and model versioning infrastructure
  • Knowledge of federated learning concepts and custom tokenization for domain-specific use cases


Ready to build the future of AI? Join us in creating the next generation of AI-powered products that make a real difference for users.
