Billie Onsite
Senior Platform Engineer (AI Infrastructure Focus)
Billie Onsite | Australia | 13 days ago
Full-time | Engineering, Information Technology

About us

We are a fast-growing team building innovative AI solutions for the real world. We specialise in developing sophisticated AI engines that analyse unstructured real-world data and transform it into structured enterprise insights, powering workflow automations, data analytics, and custom reports. These AI engines have been battle-tested on our enterprise SaaS solution, Billie Onsite.


We are now seeking a Senior Platform Engineer (AI Infrastructure Focus) to further commercialise our AI engines as an Infrastructure-as-a-Service offering. You will architect the Control Plane that lets us sell this infrastructure to numerous technology companies and teams without one customer’s data ever touching another’s, from automated provisioning the moment a customer signs up to metering pipelines that drive billing. Your designs will turn isolation and scale into our competitive edge.


What You’ll Do

• Integrate with existing AI APIs (via REST/HTTP clients or SDKs) to automate model deployments into isolated, data-siloed environments.

• Provision and scale AWS resources (ECS/Fargate, Lambda) for tenant-specific workloads, ensuring high-traffic users stay ring-fenced with zero cross-contamination.

• Containerize (Docker) and deploy AI models in coordination with the team, enforcing strict isolation from signup to runtime.

• Implement RBAC and auth layers in API Gateway to secure tenant boundaries and maintain compliance for sensitive AI data across a large customer base.

• Set up observability pipelines (CloudWatch, Datadog) to monitor performance, trace issues, and alert on anomalies before they hit users.

• Automate infrastructure provisioning (Terraform/CDK) for instant on-demand scaling and cost controls (e.g., spot instances) right at customer signup.

• Build metering and rate-limiting logic to track usage end-to-end, feeding seamless billing pipelines without over-provisioning.

• Debug and resolve production incidents, collaborating with the team to iterate on reliability.

Must-Have Skills (Extensive Proven Experience)

• SaaS-Grade AWS/Cloud: Hands-on with ECS/Fargate, Lambda, and API Gateway for external-facing multi-tenant setups, not just internal tools. You’ve built VPCs for customers, not just teams.

• Container Orchestration (K8s/ECS): Deep experience with multi-tenant clusters, setting resource quotas and limits so one tenant doesn’t hog the farm.

• Infrastructure as Code (Terraform/CDK): Write reusable modules where “New Customer” is just a variable, with no copy-paste drudgery.

• API Integration Experience: Comfortable consuming and wrapping services (e.g., via curl, SDKs, or lightweight scripts) in distributed systems.

• Scaling Track Record: Handled production-scale multi-tenancy with mission-critical uptime (e.g., canary rollouts).


Nice-to-Haves

• FinOps Experience: Tools/scripts to track “Cost Per Tenant” (e.g., spotting when Customer X’s GPU compute eats into margins).

• Backend Languages: Python/Go for glue scripts (e.g., hooking Stripe webhooks to provisioning).

• Security/compliance tools (e.g., IAM policies for AI governance).

• Monitoring suites (e.g., Grafana for dashboards).


Key qualifications 

• Bachelor’s degree or above in Computer Science or a related field.

• At least 5 years’ relevant engineering experience.


What you can expect from this role

• Competitive salary and compensation package.

• Young and dynamic environment with a strong culture.

• Close collaboration with product and business teams to deliver a common product vision.

• International exposure, given our user base in Australia, Hong Kong, and the United States.


Application method

Please send your CV to [email protected]
