You will be responsible for creating the bridge between the R&D team, who train models, and the applications that consume them. This means developing robust APIs, deploying and optimising models on Triton Inference Server (or similar frameworks), and ensuring real-time, scalable inference.
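To make that bridge concrete, here is a minimal sketch of a FastAPI endpoint forwarding a request to a Triton server. The server URL, model name, and tensor names are assumptions for illustration only, not part of the actual stack:

```python
# A minimal sketch of the "bridge": a FastAPI endpoint forwarding a request
# to Triton. "example_model" and the tensor names "INPUT0"/"OUTPUT0" are
# hypothetical placeholders for illustration.
import numpy as np
import tritonclient.http as httpclient
from fastapi import FastAPI

app = FastAPI()
triton = httpclient.InferenceServerClient(url="localhost:8000")

@app.post("/v1/infer")
def infer(features: list[float]):
    # Pack the JSON payload into the tensor shape the model expects.
    arr = np.asarray(features, dtype=np.float32).reshape(1, -1)
    inp = httpclient.InferInput("INPUT0", list(arr.shape), "FP32")
    inp.set_data_from_numpy(arr)
    # One synchronous round-trip to the model server; return the output tensor.
    result = triton.infer(model_name="example_model", inputs=[inp])
    return {"output": result.as_numpy("OUTPUT0").tolist()}
```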
Responsibilities
API Development
- Design, build, and maintain production-ready APIs for speech, language, and other AI models.
- Provide SDKs and documentation to enable easy developer adoption.
Model Deployment & Optimisation
- Deploy models (ASR, LLM, and others) using Triton Inference Server or similar systems.
- Optimise inference pipelines for low-latency, high-throughput workloads (see the streaming sketch after this list).
Scalable Infrastructure
- Architect infrastructure to handle large-scale, concurrent inference requests.
- Implement monitoring, logging, and auto-scaling for deployed services.
Collaboration
- Work with research teams to productionise new models.
- Partner with application teams to deliver AI functionality seamlessly through APIs.
MLOps & Automation
- Automate CI/CD pipelines for models and APIs.
- Manage GPU-based infrastructure in cloud or hybrid environments.
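For the real-time side of these responsibilities, a minimal sketch of a WebSocket streaming endpoint; the `transcribe_chunk` helper is a hypothetical stand-in for the actual call into the model server:

```python
# A minimal streaming sketch. `transcribe_chunk` is a hypothetical stand-in
# for the real inference call (e.g. a streaming gRPC request to Triton).
import asyncio

from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

async def transcribe_chunk(audio: bytes) -> str:
    # Placeholder: in a real service this would hit the model server.
    await asyncio.sleep(0)
    return "<partial transcript>"

@app.websocket("/v1/stream")
async def stream(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            chunk = await ws.receive_bytes()       # client streams audio frames
            text = await transcribe_chunk(chunk)   # per-chunk inference
            await ws.send_json({"partial": text})  # push partial results back
    except WebSocketDisconnect:
        pass  # client closed the stream
```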
Core Skills
- Strong programming experience in Python (FastAPI, Flask) and/or Go/Node.js for API services.
- Hands-on experience with model deployment using Triton Inference Server, TorchServe, or similar.
- Familiarity with both ASR frameworks and LLM frameworks (Hugging Face Transformers, TensorRT-LLM, vLLM, etc.).
- Experience with Docker, Kubernetes, and managing GPU-accelerated workloads.
- Deep knowledge of real-time inference systems (REST, gRPC, WebSockets, streaming).
- Cloud experience (AWS, GCP, Azure).
- Experience with model optimisation (quantisation, distillation, TensorRT, ONNX); a minimal export sketch follows this list.
- Exposure to MLOps tools for deployment and monitoring.
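As one concrete instance of the optimisation techniques named above, a minimal sketch of exporting a PyTorch model to ONNX with a dynamic batch axis, so a server such as Triton or ONNX Runtime can batch requests; the toy model and file name are purely illustrative:

```python
# Minimal sketch: export a toy PyTorch model to ONNX with a dynamic batch
# dimension. The two-layer model and "model.onnx" path are illustrative.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 4)
).eval()
dummy = torch.randn(1, 16)  # example input fixing all non-dynamic shapes

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=17,
)
```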
Ready to apply?
Join T-Pro and take your career to the next level!
The application takes less than 5 minutes.