What you'll do
- Build evaluation infrastructure: Develop scalable Django Ninja APIs and Next.js/React interfaces that power structured experiments and automated evaluations.
- Design experiment management: Ship intuitive flows for user inputs, evaluation data management, rating criteria, and multi-provider runs.
- Deliver analytics that matter: Create dashboards for result summaries, rating distributions, regression tracking, and compliance reports to drive data-informed decisions.
- Advance our Python SDK: Extend the client library to create experiments, generate responses, and retrieve results—fitting naturally into modern AI dev workflows.
- Optimize for scale: Own async task processing, query optimization, API performance, and containerized deployments to run thousands of evaluations reliably.
- Shape the UX: Design for batch testing, scheduling, and collaborative reviews so rigorous evaluation is accessible to whole teams.
- Raise the bar on platform quality: Improve CI/CD, containerization, code health, and reviews—establishing best practices across the codebase.
- Integrate AI pragmatically: Explore sensible, cutting-edge integrations that improve how teams build, test, and ship LLM apps.
What we're looking for
Must-haves
- Experience with modern frontend (React / Next.js or similar) and backend (Django or FastAPI), shipping production features end-to-end.
- Solid software engineering fundamentals: API design, data modeling, testing, and performance.
- Comfort with Docker/Kubernetes and CI/CD workflows.
- Enthusiasm about AI and its possible applications in software development.
- On-site collaboration ≥3 days/week in Berlin or Bremen. Travel to our Bremen HQ during onboarding.
- Fluency in English (at least B2).
- Valid EU work authorization.
Nice-to-haves
- Hands-on work with LLM apps, evaluation frameworks, prompt/versioning workflows, or developer tooling/SDKs.
- Experience with data-heavy dashboards and analytics; familiarity with async workers (e.g., Celery) and PostgreSQL.
- German language skills.
- Exposure to privacy-sensitive or on-prem deployments.
We prioritize demonstrated excellence in your projects and career. If you’re motivated to build and optimize AI solutions, we want to hear from you—even if you don’t meet every single criterion.
Diversity & inclusion
Different perspectives make us stronger. We welcome applicants from all backgrounds and encourage you to apply.
Why us?
- Shape the future of AI development: You’ll have significant influence on our product and technology direction while building critical infrastructure that every serious AI team needs.
- Technical excellence meets cutting-edge innovation: Work with a modern, well-architected stack (Django Ninja + Next.js + Python SDK) on complex challenges like distributed systems, multi-LLM integrations, and real-time experiment tracking, without legacy baggage holding you back.
- Career-defining opportunity: You’ll be building essential AI evaluation infrastructure during a massive market transformation. As systematic testing becomes fundamental to AI development, you’ll be at the center of this shift, working on technology that’s becoming as critical as version control.
- Ownership and impact: Get full end-to-end ownership of features, direct collaboration with AI researchers and ML engineers, and immediate feedback on how your code helps teams ship better AI products. Your engineering decisions directly shape how thousands of developers work.
- Competitive package with upside: In addition to a competitive salary, we offer a VSOP (Virtual Stock Option Program) to give you a real stake in the company’s success as we grow this essential AI infrastructure.
- Best-in-class development experience: Fast and streamlined access to all AI technologies that make your life (and development work) easier, plus the latest tools and platforms to maximize your productivity.
- Work environment: Our Bremen office features stunning waterfront views, complimentary beverages, smoothies, and a boat. We’re opening our Berlin office at the end of 2025, giving you flexibility as we expand.
- Grow with transformative technology: Build deep expertise in AI evaluation and LLM infrastructure alongside our expanding team, mastering the technologies that are reshaping software development while helping define industry standards.
About us
We are a cash-flow-positive, Germany-based AI startup building elluminate, the enterprise platform that turns AI evaluation from ad-hoc experiments into rigorous, repeatable workflows. Teams use elluminate to design test suites, benchmark models, track regressions, and ship reliable AI with clear, measurable quality gates. We pair elluminate with custom large-language-model solutions and full on-prem deployment options. Our products have already earned the trust of renowned clients such as Deutsche Telekom, the German Federal Government, and leading health insurers like hkk.
Rooted in Bremen and collaborating with leading organizations, our team has a track record in advanced model and dataset development. We like owning problems end-to-end and shipping pragmatically. We contribute to the open-source community through initiatives like OpenEuroLLM and regularly publish models and tools to accelerate the broader ecosystem.
Compensation Range: €60K - €100K
Ready to apply?
Join ellamind and take your career to the next level!
Application takes less than 5 minutes

