Wiraa

Software Engineer

Canada · Full-time · Associate

About The Company

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, retrieval-augmented generation (RAG), and autonomous agents. We believe that our work is instrumental to the widespread adoption of artificial intelligence, enabling innovative solutions across various industries. Our commitment to advancing AI technology is reflected in our collaborative environment, cutting-edge research, and focus on impactful applications. We strive to foster a culture of inclusivity, diversity, and continuous learning, ensuring our team members are empowered to contribute their best and drive meaningful change in the AI landscape.

About The Role

Are you energized by building high-performance, scalable, and reliable machine learning systems? Do you want to help define and build the next generation of AI platforms powering advanced NLP applications? We are seeking Members of Technical Staff to join our Model Serving team at Cohere. In this role, you will be responsible for developing, deploying, and operating our AI platform that delivers Cohere’s large language models through user-friendly API endpoints. You will collaborate closely with cross-functional teams to deploy optimized NLP models into production environments that demand low latency, high throughput, and high availability. Additionally, you will have the opportunity to interface directly with customers, creating customized deployments to meet their specific needs. This position offers a unique chance to work at the forefront of AI technology, solve complex technical challenges, and contribute to the evolution of scalable AI infrastructure.

Qualifications

  • 5+ years of engineering experience managing production infrastructure at a large scale
  • Proficiency in designing large, highly available distributed systems using Kubernetes
  • Experience with GPU workloads on Kubernetes clusters
  • Hands-on experience with Kubernetes development, deployment, and support in production environments
  • Familiarity with cloud platforms such as GCP, Azure, AWS, and OCI, and with multi-cloud, on-premises, and hybrid environments
  • Strong background in designing, deploying, supporting, and troubleshooting complex Linux-based computing environments
  • Knowledge of compute, storage, and network resource management, and of cost optimization
  • Excellent collaboration and troubleshooting skills for building mission-critical systems
  • Ability to adapt and solve evolving complex technical challenges
  • Understanding of the computational characteristics of accelerators such as GPUs, TPUs, or custom accelerators, and their impact on latency and throughput
  • Working experience with distributed systems architecture and implementation
  • Proficiency in high-performance programming languages such as Golang, C++, or similar

Responsibilities

  • Develop, deploy, and maintain scalable AI infrastructure to support large language models
  • Design and implement highly available, low-latency distributed systems using Kubernetes and cloud technologies
  • Support GPU and accelerator workloads, optimizing performance and resource utilization
  • Collaborate with cross-functional teams to integrate models into production environments and ensure operational excellence
  • Troubleshoot and resolve complex system issues, ensuring high system reliability and uptime
  • Optimize compute, storage, and network resources to balance performance and cost efficiency
  • Interface with customers to understand their deployment needs and deliver customized solutions
  • Contribute to the continuous improvement of deployment pipelines, automation, and system robustness
  • Stay updated with the latest advancements in AI infrastructure and incorporate best practices

Benefits

  • Inclusive and collaborative work environment fostering innovation
  • Opportunity to work with cutting-edge AI research and technology
  • Weekly lunch stipend, in-office lunches, and snacks
  • Comprehensive health and dental benefits, including mental health support
  • 100% parental leave top-up for up to six months
  • Personal enrichment benefits covering arts, culture, fitness, well-being, and workspace improvements
  • Remote-flexible work arrangements with offices in Toronto, New York, San Francisco, London, and Paris, plus co-working stipends
  • Six weeks of vacation (30 working days) to promote work-life balance

Equal Opportunity

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. If you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Key Skills

Ranked by relevance

AI, Kubernetes, storage, cloud, artificial intelligence, high availability, machine learning, Golang, Linux, AWS, GCP, san, c
Posted: Feb 21, 2026
Type: Full-time
Level: Associate
Location: Canada
Company: Wiraa

Industries

Technology, Information and Internet

Categories

Information Technology
