Artificial Intelligence Engineer

Tribus | Australia | 12 hours ago

Full-time | Remote Friendly | Engineering, Information Technology

Software Engineer - AI

LLM | Python | AWS

We’re partnering with a fast-growing software company building AI-driven products used in high-stakes, real-world workflows.

The focus is on production-quality AI: systems that must be reliable, measurable, and safe at scale.

They’re looking for a Software Engineer with AI experience to join a team responsible for the core AI platform, with a particular emphasis on LLM evaluation, observability, and reliability.

This is a hands-on engineering role, sitting close to product and domain experts, where your work directly influences how AI quality is defined, measured, and enforced in production.

What you’ll work on

Building and operating LLM evaluation pipelines that assess model quality, robustness, and safety
Defining test sets, metrics, and evaluation workflows, including human-in-the-loop processes where required
Translating product and domain constraints into concrete, testable evaluation criteria
Running and orchestrating distributed evaluation workloads on AWS, including monitoring compute usage
Analysing evaluation results, identifying failure modes, and collaborating on mitigations (prompt changes, data updates, model selection or fine-tuning)
Integrating and assessing open-source and vendor evaluation frameworks, writing glue code where needed
Contributing to the evolution of the AI evaluation and platform architecture

What they’re looking for

Strong Python engineering skills
Experience monitoring and evaluating LLM-based applications
Hands-on exposure to LLM evaluation tools, benchmarks, and metrics
Understanding of common LLM failure modes (e.g. hallucination, bias, toxicity, prompt injection)
Experience with cloud ML infrastructure, ideally AWS
Familiarity with distributed workloads (e.g. Ray, AWS Lambda, or similar)
Comfort working with an evolving LLM observability and evaluation stack
Ability to work with non-ML stakeholders and convert qualitative requirements into quantitative tests

Working environment & benefits

Flexible hybrid setup, with twice-weekly collaboration in a modern CBD office
Strong learning and career development opportunities in a scaling business
Wellness focus including additional leave and gym membership
Collaborative team culture with regular social events
Pool table, snacks, and a genuinely supportive environment

This role is well suited to engineers who care about AI reliability and correctness, and who want to work on systems where evaluation and safeguards genuinely matter.

Must be based in Sydney with full working rights. Remote working or sponsorship is not available for this role.
