Software Engineer - AI
LLM | Python | AWS
We’re partnering with a fast-growing software company building AI-driven products used in high-stakes, real-world workflows.
The focus is on production-quality AI: systems that must be reliable, measurable, and safe at scale.
They’re looking for a Software Engineer with AI experience to join a team responsible for the core AI platform, with a particular emphasis on LLM evaluation, observability, and reliability.
This is a hands-on engineering role, sitting close to product and domain experts, where your work directly influences how AI quality is defined, measured, and enforced in production.
What you’ll work on
- Building and operating LLM evaluation pipelines that assess model quality, robustness, and safety
- Defining test sets, metrics, and evaluation workflows, including human-in-the-loop processes where required
- Translating product and domain constraints into concrete, testable evaluation criteria
- Running and orchestrating distributed evaluation workloads on AWS, including monitoring compute usage
- Analysing evaluation results, identifying failure modes, and collaborating on mitigations (prompt changes, data updates, model selection or fine-tuning)
- Integrating and assessing open-source and vendor evaluation frameworks, writing glue code where needed
- Contributing to the evolution of the AI evaluation and platform architecture
What they’re looking for
- Strong Python engineering skills
- Experience monitoring and evaluating LLM-based applications
- Hands-on exposure to LLM evaluation tools, benchmarks, and metrics
- Understanding of common LLM failure modes (e.g. hallucination, bias, toxicity, prompt injection)
- Experience with cloud ML infrastructure, ideally AWS
- Familiarity with distributed workloads (e.g. Ray, AWS Lambda, or similar)
- Comfort working with an evolving LLM observability and evaluation stack
- Ability to work with non-ML stakeholders and convert qualitative requirements into quantitative tests
Working environment & benefits
- Flexible hybrid setup, with twice-weekly collaboration in a modern CBD office
- Strong learning and career development opportunities in a scaling business
- Wellness focus including additional leave and gym membership
- Collaborative team culture with regular social events
- Pool table, snacks, and a genuinely supportive environment
This role is well suited to engineers who care about AI reliability and correctness, and who want to work on systems where evaluation and safeguards genuinely matter.
You must be based in Sydney and hold full working rights. Remote working and sponsorship are not available for this role.
Related Jobs
3 roles aligned with this opportunity
Artificial Intelligence Engineer
2025-11-25
Site Reliability Engineer
2026-04-10
Senior/Lead Back End Engineer
2026-04-09
- Posted: Dec 28, 2025
- Type: Full-time
- Level: Mid-Senior
- Location: Sydney
- Company: Tribus