Thri5
AI / Machine Learning Engineer
Thri5 · Canada · 13 days ago
Full-time · Remote Friendly · Engineering, Information Technology

About Thri5

Thri5 is the AI-powered System of Actions for the modern retailer.


Despite massive investments in planning, forecasting, and analytics, retailers still face the same operational issues—out-of-stocks, bad master data, margin leakage, and inconsistent execution across stores and channels. The gap isn’t in intelligence; it’s in execution.


Thri5 continually scans data across the business, detects and prioritizes opportunities, evaluates impact, and orchestrates execution through both humans and AI agents. From store managers and DC leaders to category and supply chain teams, Thri5 routes the right actions to the right owners—with clear context, recommendations, and workflows—closing the gap between plan and real-world performance.


Our vision is to become the trusted AI operating layer for retail execution, making every operator 10x more effective and freeing them to focus on what matters most: serving customers and growing the business.


Founded by a team with deep retail and retail-technology experience, Thri5 is backed by some of Canada’s most prominent venture capital and angel investors.


Your Role

As an AI / Machine Learning Engineer at Thri5, you’ll help build the agent layer that powers our System of Actions. You’ll design and implement multi-agent Co-pilot systems that orchestrate complex workflows, call tools and APIs, and automate operational tasks at scale. You’ll also develop deterministic, data-driven detection models to reliably identify operational issues and opportunities—and then layer LLM-based capabilities on top to generate high-quality alerts, recommended actions, and explanations grounded in real retail data.

You’ll work closely with the founding team to turn messy, real-world retail problems into robust, production agent workflows that operators actually trust and use every day.

Key Responsibilities


Agent Framework & Orchestration

  • Design and build the core frameworks that power Thri5’s AI agents: task decomposition, routing, tool calling, multi-step workflows, and human-in-the-loop escalation.
  • Implement agents that coordinate across operators (store, DC, category, supply chain) and systems to drive real actions, not just insights.


LLM-Driven Intelligence

  • Develop and fine-tune LLM-based components to detect anomalies and opportunities that impact commercial and operational performance.
  • Build prompt, retrieval, and grounding patterns that produce reliable behaviour on noisy, real-world data.
  • Combine deterministic signals with LLMs to produce contextual narratives, explanations, and recommended actions.


Deterministic Detection & Scoring

  • Design and implement deterministic and semi-deterministic detection models (e.g., statistical anomaly detection, rules + ML hybrids, scoring systems) to identify out-of-stocks, bad master data, and execution gaps.
  • Build evaluation frameworks (precision/recall, false positive control, business impact, backtests) to ensure detections are trustworthy and stable in production.
  • Collaborate with product and domain experts to translate heuristics and business rules into robust, maintainable detection logic.


Data & Recommendation Pipelines

  • Build and optimize pipelines that leverage real-time and batch customer data (transactions, inventory, operations) to power agent decisions and recommendations.
  • Own end-to-end ML workflows—data preprocessing, feature engineering, training, evaluation, and production inference.


MLOps / LLMOps & Reliability

  • Implement robust MLOps practices for CI/CD, experimentation, and monitoring of models and agents.
  • Instrument and monitor agent behaviour (latency, cost, quality, safety) and continuously iterate to improve performance, accuracy, and scalability.


Collaboration & Product

  • Partner with product and engineering to translate customer problems into concrete agent capabilities and use cases.
  • Contribute to technical decision-making and architecture as we scale the Thri5 platform.

Requirements

  • AI Fluency: 5+ years of software development experience with deep exposure to modern AI/ML, spanning classical ML / data science, LLMs (GPT-style models), and agent/tool-calling ecosystems.


  • ML / Data Science Proficiency: Strong background in supervised/unsupervised learning and anomaly detection, with hands-on experience designing deterministic or semi-deterministic detection systems (statistical models, rules + ML, scoring). Comfortable with model evaluation, experimentation, and translating business heuristics into data-driven logic.


  • Programming & Frameworks: Proficient in Python and familiar with ML frameworks such as PyTorch or TensorFlow. Experience with GenAI tooling (e.g., LangChain, LlamaIndex, custom agent frameworks) and vector databases is an asset.


  • Data Handling: Comfortable working with large-scale datasets, complex schemas, and event-driven data. Strong SQL skills and experience building data pipelines into production systems.


  • Startup Mindset: Thrive in a fast-paced, ambiguous environment; able to bring structure to open-ended problems. Enjoy high accountability and end-to-end ownership from idea to production impact.


  • Teamwork: Collaborative, low-ego, and comfortable working across a small, high-performing team (founders, engineers, product, and customers).


  • Domain Experience (Nice to Have): Experience in retail, supply chain, predictive analytics, time-series modeling, or operational optimization.


  • Education: Bachelor’s, Master’s, or Ph.D. in Computer Science, Data Science, Machine Learning, or a related field (or equivalent practical experience).
