About A1
A1 is a self-funded, independent AI group focused on building a new consumer AI application with global impact. We’re assembling a small, elite team of ML, engineering, and product builders who want to work on meaningful, high-impact problems.
About The Role
You will shape A1’s core technical direction: model selection, training strategy, infrastructure, and long-term architecture. This is a founding technical role; your decisions will define our model stack, our data strategy, and our product capabilities for years to come.
You won’t just fine-tune models; you’ll design systems: training pipelines, evaluation frameworks, inference stacks, and scalable deployment architectures. You will have full autonomy to experiment with frontier models (LLaMA, Mistral, Qwen, Claude-compatible architectures) and to build new approaches where existing ones fall short.
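For a concrete flavor of this kind of fine-tuning work, here is a minimal LoRA sketch assuming the Hugging Face transformers, peft, and datasets libraries; the base model, dataset, and hyperparameters are illustrative placeholders rather than details from this posting.

```python
# Minimal LoRA fine-tuning sketch (assumed stack: transformers, peft, datasets;
# model name, dataset, and hyperparameters are illustrative, not from this posting).
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base_model = "mistralai/Mistral-7B-v0.1"  # placeholder open-weights base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections; only these weights train.
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()

# Tiny placeholder corpus; a real pipeline would plug in curated or synthetic data here.
data = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
data = data.filter(lambda row: len(row["text"].strip()) > 0)
data = data.map(lambda rows: tokenizer(rows["text"], truncation=True, max_length=512),
                batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", per_device_train_batch_size=4,
                           num_train_epochs=1, learning_rate=2e-4, bf16=True,
                           logging_steps=50),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves only the small adapter weights
```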
What You’ll Be Doing
- Build end-to-end training pipelines: data → training → eval → inference
- Design new model architectures or adapt open-source frontier models
- Fine-tune models using state-of-the-art methods (LoRA/QLoRA, SFT, DPO, distillation)
- Architect scalable inference systems using vLLM / TensorRT-LLM / DeepSpeed (see the inference sketch after this list)
- Build data systems for high-quality synthetic and real-world training data
- Develop alignment, safety, and guardrail strategies
- Design evaluation frameworks across performance, robustness, safety, and bias
- Own deployment: GPU optimization, latency reduction, scaling policies
- Shape early product direction, experiment with new use cases, and build AI-powered experiences from zero
- Explore frontier techniques: retrieval-augmented training, mixture-of-experts, distillation, multi-agent orchestration, multimodal models
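As a rough illustration of the inference-stack work above, here is a minimal offline-batching sketch with vLLM; the model name, GPU count, and sampling settings are assumptions made for the example, not requirements from the posting.

```python
# Minimal vLLM offline-inference sketch (assumed library: vllm; the model name,
# GPU count, and sampling settings are placeholders, not requirements from the posting).
from vllm import LLM, SamplingParams

# The engine handles continuous batching and paged KV-cache management internally;
# here we only configure tensor parallelism and memory headroom.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct",   # placeholder open-weights model
          tensor_parallel_size=2,             # shard weights across 2 GPUs
          gpu_memory_utilization=0.90)

sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = [
    "Summarize the trade-offs between LoRA and full fine-tuning.",
    "Explain KV-cache paging in one short paragraph.",
]
for request in llm.generate(prompts, sampling):
    print(request.prompt, "->", request.outputs[0].text)
```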
What You’ll Need
- Strong background in deep learning and transformer architectures
- Hands-on experience training or fine-tuning large models (LLMs or vision models)
- Proficiency with PyTorch, JAX, or TensorFlow
- Experience with distributed training frameworks (DeepSpeed, FSDP, Megatron, ZeRO, Ray)
- Strong software engineering skills — writing robust, production-grade systems
- Experience with GPU optimization: memory efficiency, quantization, mixed precision (see the sketch after this list)
- Comfortable owning ambiguous, zero-to-one technical problems end-to-end
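To make the mixed-precision point above concrete, here is a minimal PyTorch training-loop sketch using autocast and gradient scaling; the toy model and random tensors are placeholders for illustration only.

```python
# Minimal mixed-precision training-loop sketch in plain PyTorch; the toy model and
# random tensors are placeholders used only to show autocast + gradient scaling.
import torch
from torch import nn

device = "cuda"
model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()  # rescales gradients to avoid fp16 underflow

for step in range(100):
    x = torch.randn(32, 1024, device=device)
    target = torch.randn(32, 1024, device=device)

    optimizer.zero_grad(set_to_none=True)
    # Forward pass runs in float16 where numerically safe; the loss is computed in the
    # autocast context and scaled before backward.
    with torch.cuda.amp.autocast(dtype=torch.float16):
        loss = nn.functional.mse_loss(model(x), target)

    scaler.scale(loss).backward()   # scale up before backward
    scaler.step(optimizer)          # unscale, skip the step on inf/nan gradients
    scaler.update()
```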
Nice to Have
- Experience with LLM inference frameworks (vLLM, TensorRT-LLM, FasterTransformer)
- Contributions to open-source ML libraries
- Background in scientific computing, compilers, or GPU kernels
- Experience with RLHF pipelines (PPO, DPO, ORPO)
- Experience training or deploying multimodal or diffusion models
- Experience in large-scale data processing (Apache Arrow, Spark, Ray)
- Prior work in a research lab (Google Brain, DeepMind, FAIR, Anthropic, OpenAI)
Ready to apply?
Join A1 and take your career to the next level!
Applying takes less than 5 minutes.

