Software Engineer

Wiraa

Canada · Full-time · Associate

About The Company

Our mission is to scale intelligence to serve humanity. We train and deploy frontier models for developers and enterprises building AI systems that power experiences such as content generation, semantic search, retrieval-augmented generation (RAG), and intelligent agents. We believe this work plays a crucial role in the widespread adoption of artificial intelligence, making it accessible and beneficial across industries. Our team is committed to pushing the boundaries of AI technology and delivering solutions that change how people interact with digital content and services.

About The Role

We are seeking a highly skilled and motivated Member of Technical Staff to join our Model Serving team at Cohere. In this role, you will develop, deploy, and maintain the AI platform that delivers Cohere's large language models through easy-to-use API endpoints. You will work closely with cross-functional teams to optimize NLP models for production environments that demand low latency, high throughput, and high availability. You will also work directly with customers to understand their deployment needs and build customized solutions that meet them. Your expertise will be instrumental in ensuring the reliability, scalability, and performance of our AI systems, so that advanced NLP capabilities integrate smoothly into real-world applications.

Qualifications

5+ years of engineering experience managing production infrastructure at large scale
Proficiency in designing large, highly available distributed systems with Kubernetes
Experience working with GPU workloads within Kubernetes clusters
Hands-on experience with Kubernetes development, deployment, and support
Familiarity with cloud platforms such as GCP, Azure, AWS, OCI, and multi-cloud or hybrid environments
Strong background in Linux-based computing environments, including deployment, support, and troubleshooting
Experience with compute, storage, and network resource management, and with cost optimization
Excellent collaboration, troubleshooting, and problem-solving skills for mission-critical systems
Grit and adaptability to solve evolving technical challenges
Knowledge of accelerators such as GPUs, TPUs, or custom accelerators, and of their impact on latency and throughput
Strong understanding of, or experience with, distributed systems architecture
Proficiency in programming languages such as Golang, C++, or other high-performance server-side languages

Responsibilities

Develop, deploy, and support scalable NLP models and AI platforms
Design and implement distributed systems that ensure high availability and low latency
Optimize GPU and accelerator workloads for inference performance
Collaborate with cross-functional teams to create customized deployment solutions for clients
Monitor system performance, troubleshoot issues, and implement improvements
Support multi-cloud and hybrid deployment architectures
Manage compute, storage, and network resources efficiently to control costs
Contribute to the development of best practices for deployment, scaling, and maintenance of machine learning systems
Engage with customers to understand their needs and deliver tailored AI solutions
Stay current with the latest advancements in AI infrastructure and incorporate new techniques into our platform

Benefits

An open and inclusive culture fostering innovation and collaboration
Opportunity to work alongside a team at the forefront of AI research and development
Weekly lunch stipend, in-office lunches, and snacks
Comprehensive health and dental benefits, including mental health support
100% parental leave top-up for up to six months
Personal enrichment benefits covering arts, culture, fitness, well-being, and workspace improvements
Remote-flexible work arrangements with offices in Toronto, New York, San Francisco, London, and Paris, plus co-working stipends
Six weeks of vacation (30 working days) to promote work-life balance

Equal Opportunity

We value and celebrate diversity and are committed to creating an inclusive work environment for all employees. We welcome applicants from all backgrounds and are dedicated to providing equal employment opportunities. If you require accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work with you to meet your needs.

Key Skills

Ranked by relevance

AI, cloud, high availability, Kubernetes, storage, artificial intelligence, machine learning, Golang, server, Linux, AWS, GCP, san, c

Related Jobs

3 roles aligned with this opportunity

Fullstack Software Engineer · 2026-03-22 · Full-time · Associate · Canada · Technology · Information Technology
Frontend Developer · 2026-03-15 · Full-time · Associate · Canada · Technology · Information Technology
Full-Stack Engineer · 2026-05-29 · Full-time · Not Applicable · Germany · Technology · Engineering

Posted: Feb 24, 2026
Type: Full-time
Level: Associate
Location: Canada
Company: Wiraa

Industries

Technology Information Internet
