Wiraa

Software Engineer

Canada · Full-time · Associate

About The Company

Our mission is to scale intelligence to serve humanity. We’re training and deploying frontier models for developers and enterprises who are building AI systems to power magical experiences like content generation, semantic search, Retrieval-Augmented Generation (RAG), and autonomous agents. We believe that our work is instrumental to the widespread adoption of artificial intelligence, driving innovation across various industries. Our company is committed to advancing AI technology responsibly and ethically, fostering an inclusive environment where talent from diverse backgrounds can thrive. We leverage cutting-edge research and state-of-the-art infrastructure to develop scalable, reliable, and efficient AI solutions that meet the needs of our clients and partners worldwide.

About The Role

Are you energized by building high-performance, scalable, and reliable machine learning systems? Do you aspire to define and build the next generation of AI platforms powering advanced natural language processing (NLP) applications? We are seeking dedicated Members of Technical Staff to join our Model Serving team at Cohere. In this role, you will be responsible for developing, deploying, and operating our AI platform that delivers Cohere’s large language models via user-friendly API endpoints. You will work closely with cross-functional teams to deploy optimized NLP models in low-latency, high-throughput, high-availability environments. You will also work directly with customers to create deployments tailored to their specific needs, ensuring seamless integration and performance. This position offers a unique opportunity to influence the future of AI infrastructure and contribute to innovative solutions that impact a global user base.

Qualifications

  • 5+ years of engineering experience managing production infrastructure at large scale
  • Proficiency in designing large, highly available distributed systems with Kubernetes and GPU workloads
  • Hands-on experience with Kubernetes development, deployment, and support in production environments
  • Familiarity with cloud platforms such as GCP, Azure, AWS, OCI, as well as multi-cloud, on-premises, or hybrid environments
  • Strong background in Linux-based computing environments, including deployment, support, and troubleshooting
  • Experience in compute, storage, network resource management, and cost optimization
  • Excellent collaboration and troubleshooting skills for building mission-critical systems
  • Grit and adaptability to solve complex technical challenges that evolve daily
  • Understanding of the computational characteristics of accelerators (GPUs, TPUs, custom accelerators) and their impact on latency and throughput
  • Working knowledge of distributed systems architecture and implementation
  • Proficiency in programming languages such as Golang, C++, or other high-performance server languages
  • Desire to learn and grow, even if some qualifications are not fully met

Responsibilities

  • Design, develop, and maintain scalable, reliable, and high-performance distributed systems for AI model serving
  • Deploy and optimize large language models and NLP solutions in production environments
  • Manage and support Kubernetes clusters, ensuring high availability and efficient resource utilization
  • Collaborate with research and engineering teams to integrate new models and features into the platform
  • Interface with customers to understand their deployment needs and create customized solutions
  • Monitor system performance, troubleshoot issues, and implement improvements to enhance reliability and throughput
  • Support multi-cloud and hybrid deployment strategies, ensuring seamless operation across platforms
  • Optimize compute, storage, and network resources to balance performance and cost
  • Contribute to documentation, best practices, and development standards for the team
  • Stay current with emerging technologies in AI infrastructure, accelerators, and distributed systems

Benefits

  • An open and inclusive culture fostering innovation and collaboration
  • Opportunity to work closely with a team at the forefront of AI research
  • Weekly lunch stipend, in-office lunches, and snacks
  • Comprehensive health and dental benefits, including mental health support
  • 100% parental leave top-up for up to six months
  • Personal enrichment benefits supporting arts, culture, fitness, well-being, and workspace improvement
  • Remote‑flexible work arrangements with offices in Toronto, New York, San Francisco, London, and Paris, along with a co‑working stipend
  • Six weeks of vacation (30 working days) to promote work-life balance

Equal Opportunity

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants from all backgrounds and are committed to providing equal opportunities. If you require any accommodations during the recruitment process, please submit an Accommodations Request Form, and we will work together to meet your needs.

Key Skills

ai, kubernetes, cloud, high availability, storage, natural language processing, artificial intelligence, machine learning, golang, server, linux, aws, gcp, san, c
Posted: Feb 20, 2026
Type: Full-time
Level: Associate
Location: Canada
Company: Wiraa

Industries

Technology, Information and Internet

Categories

Information Technology
