Caffeine
Machine Learning Engineer
Caffeine | Switzerland | 10 days ago
Full-time | Remote Friendly | Engineering

At Caffeine.ai, we are building the world’s first platform to create full-stack, on-chain applications through natural language. Our mission is to make building software as simple as a conversation — transforming ideas into live applications instantly.

We are a cross-functional team of engineers and researchers building the AI that powers this new paradigm. To do this, we need world-class product engineers who can design beautiful, reliable, and performant experiences across the stack.


About the Role

We are looking for a Machine Learning Engineer to help build the next generation of our agentic AI system. You’ll work directly on the intelligence behind specification generation, dependency planning, backend & frontend code generation, error-repair loops, E2E evaluation, dataset quality, and model integrations.


You will play a key role in designing and evolving an agent-based architecture for app creation — enabling modularity, versioning, observability, and continuous improvement across all stages of the pipeline, from user intent to deployed application.


You’ll join the AI team and collaborate across teams to ship reliable, scalable, and innovative AI features into production.


What You’ll Do

  • Contribute to the design and implementation of multi-agent LLM architectures that power Caffeine’s full-stack code generation pipeline.
  • Build and maintain training datasets, including synthetic generation, filtering, validation, and alignment mechanisms.
  • Develop post-training and fine-tuning pipelines (SFT, RLHF/RLVR, preference modeling) to improve reasoning, reliability, and code quality.
  • Design and extend evaluation harnesses, including error-reproduction, regression detection, benchmarks, and E2E “AI-generated app” evaluations.
  • Implement and refine RAG / context engineering pipelines to provide the right context for each agent stage.
  • Collaborate with the Integration, App, SRE, and Infra teams to ensure reliability, observability, and maintainability of the AI system in production.
  • Participate in rapid experimentation, benchmarking, ablations, and architectural improvements.


Who You Are

  • Bachelor’s or Master’s in Computer Science, Machine Learning, Engineering, or a related field.
  • Experience with full-stack TypeScript (e.g., React/Next.js, Node.js/NestJS/Fastify) and monorepo tooling (e.g., pnpm, Turborepo).
  • Experience training, fine-tuning, or evaluating LLMs (SFT, RLHF, RAG, or similar).
  • Experience with LLM-based code generation, compilers, or static analysis.
  • Comfortable working in a TypeScript-heavy product codebase, with a genuine interest in the intersection of ML and software engineering.
  • Comfortable working in fast-moving, highly technical environments.
  • Background in multi-agent architectures or structured reasoning.


Bonus

  • Publications or prior work involving LLMs, agents, program synthesis, or automated code generation.
  • Experience with large-scale training pipelines (e.g., distributed training, DeepSpeed).
  • Strong interest in designing modular, versioned agentic systems.
  • Experience with prompt engineering for complex multi-step workflows.


*This is a hybrid role based in our Zurich office, requiring at least 3 days in the office per week.
