The Role
Own and evolve the core “brain” service that powers Qu. Design, build, and operate multi-agent LLM systems that communicate in real time over text and voice. Ship fast Python services with FastAPI while keeping latency low, quality high, and evaluation continuous.
What You’ll Do
- Own Qu’s brain service end to end: architecture, SLAs, latency budgets, error modes, rollouts.
- Low-latency comms: streaming text and voice, VAD, barge-in, turn-taking, interruption handling. WebRTC, SIP, and LiveKit experience is a strong plus.
- Multi-agent orchestration: planner–executor–critic patterns, role routing, shared memory, tool routers, coordination protocols and evaluation.
- Reasoning & optimization: ReAct, Chain-of-Thought, plus Tree-/Graph-of-Thoughts when useful.
- Programmatic prompt optimization: DSPy for prompt/program compilation; integrate MIPRO and GEPA for iterative prompt evolution under eval constraints.
- RAG engineering: high-signal retrieval (chunking, hybrid search, re-ranking), query rewriting, compression, caching, freshness, and strong grounding; evaluate faithfulness, context precision/recall, and answer relevancy.
- Evaluation & observability: pre-call, validate inputs, enforce safety, and verify retrieval quality for RAG; in-call, trace prompts, tool calls, and token/latency/cost, and enforce streaming guardrails; post-call, run automated task evals (faithfulness, relevancy, hallucination, safety), regressions, red-teaming, and CI/CD gates. Instrument with structured logs and OpenTelemetry, surface dashboards and alerts, and feed live traffic slices into shadow evals for drift detection.
Minimum Qualifications
- 5+ years in ML or backend engineering in product environments; recent focus on LLM systems.
- Expert Python. Strong FastAPI, asyncio, pydantic, and production observability.
- Real-time systems: you’ve built or integrated low-latency text/voice pipelines and have used LiveKit, Pipecat, or similar tech.
- Working knowledge of agent patterns and eval-driven development.
- Hands-on with ReAct and CoT; pragmatic with ToT/GoT tradeoffs.
- Prior startup experience.
Nice To Have
- DSPy for compilation and self-improving workflows; MIPRO/GEPA integration.
- Experience with evaluation tooling and LLM-as-judge setups.
- WebRTC/SRTP, jitter buffers, SIP basics; LiveKit a plus.
- LiveKit Agents, SIP–WebRTC gateways, TURN/SFU tuning.
- GCP: Cloud Run/GKE, Pub/Sub, Vertex AI, GCS, Secret Manager, Cloud Logging/Trace.
- Healthcare data familiarity.
Example Problems You’ll Tackle
- Push median voice round-trip under 2 seconds while preserving turn-taking and barge-in.
- Set up OTEL-first tracing for the agent graph with automated eval triggers on production traffic slices.
- Improve our RAG pipeline with hybrid retrieval and re-ranking, then prove gains via faithfulness and context metrics with regression harnesses.
- Turn EHR integrations into LLM tools.
Tech Stack
Python, FastAPI, pydantic, asyncio, Redis, Postgres, vector stores, WebRTC stacks, LiveKit, SIP gateways, STT/TTS, Docker, Terraform, K8s, OTEL, DeepEval.
What You Get
- Work on cutting-edge real-time agent tech with a best-in-class team in healthtech.
- Fun off-sites in Barcelona.
- High-tech laptop and solid dev ergonomics.
- Flexibility: work from home or hybrid in Barcelona/London.
Join Quadrivia AI and take your career to the next level!

