The Role
You'll be the architect and owner of Neo's AI infrastructure. This means training custom models for our unique use cases, building production ML pipelines, and creating the reasoning systems that make Neo intelligent. You'll work across the full ML lifecycle - from data pipelines to model training to production deployment.
What You'll Own
1. Custom Model Development & Training
Build specialized models that foundation models can't provide. Train speaker diarization for Indian accents, fine-tune embedding models for conversational memory, develop custom NER for Hindi-English code-mixing, and optimize models for edge deployment.
Key Challenges:
● Train speaker diarization models on Indian multi-speaker conversations with code-mixing
● Fine-tune embedding models for semantic search across temporal context
● Build custom NER/entity linking for Hindi-English mixed conversations
● Optimize transformer models for mobile deployment with <100ms latency
● Handle class imbalance in emotion detection and intent classification
Tech Stack: PyTorch/TensorFlow for model training, Hugging Face for fine-tuning, ONNX/TensorRT for optimization
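To make the <100ms edge-latency target concrete, here is a minimal sketch of the kind of latency gate such work gets measured against. The callable and payloads are placeholders for a real optimized model and its inputs; only the timing harness is real.

```python
import time

def p95_latency_ms(infer, payloads, warmup=3):
    """Measure p95 wall-clock latency (ms) of an inference callable.

    `infer` and `payloads` stand in for a real model call and a batch of
    inputs; this harness just times whatever callable it is given.
    """
    for p in payloads[:warmup]:          # warm caches before timing
        infer(p)
    samples = []
    for p in payloads:
        start = time.perf_counter()
        infer(p)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    idx = max(0, int(round(0.95 * len(samples))) - 1)
    return samples[idx]

def within_budget(p95_ms, budget_ms=100.0):
    """The <100ms mobile-deployment target above, as a hard gate."""
    return p95_ms < budget_ms
```

In practice a gate like this would sit in CI after each quantization or pruning experiment, so a regression past the budget fails the build rather than shipping.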
2. Memory Architecture & ML Pipeline
Build the brain that remembers everything. Design temporal knowledge graphs that ingest conversations, extract entities and relationships using custom-trained models, and enable longitudinal pattern detection. Own the full ML pipeline from data ingestion to model inference to graph updates.
Key Challenges:
● Bi-temporal data models with real-time updates
● Entity linking across noisy conversational transcripts
● Relationship extraction using fine-tuned sequence models
● Pattern detection with unsupervised learning (clustering, anomaly detection)
● Privacy-preserving embeddings and federated learning
Tech Stack: PyTorch for custom models, Neo4j/graph databases, vector databases (Qdrant), streaming pipelines
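As a flavor of the bi-temporal modeling mentioned above, here is a minimal sketch of the "valid time vs. transaction time" distinction. All names (`Fact`, `as_of`) are illustrative, not Neo's actual schema; times are plain integers to keep the example small.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    # Bi-temporal: valid time (when it was true in the world) vs.
    # transaction time (when the system recorded it).
    entity: str
    value: str
    valid_from: int
    valid_to: Optional[int]       # None = still true
    recorded_at: int

def as_of(facts, entity, valid_at, known_at):
    """Return the value of `entity` at world-time `valid_at`,
    using only facts the system had recorded by `known_at`."""
    candidates = [
        f for f in facts
        if f.entity == entity
        and f.recorded_at <= known_at
        and f.valid_from <= valid_at
        and (f.valid_to is None or valid_at < f.valid_to)
    ]
    if not candidates:
        return None
    # Later-recorded facts win, so corrections override earlier records.
    return max(candidates, key=lambda f: f.recorded_at).value
```

The payoff of the two time axes: the system can answer both "who was my manager in March?" and "what did we *believe* in March?", which is exactly what longitudinal pattern detection over conversations needs.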
3. Audio Processing & Speech ML
Own the end-to-end speech pipeline. Train/fine-tune ASR models for Indian languages, build speaker diarization systems, develop audio quality assessment models, and optimize for edge deployment. Handle the unique challenges of Indian conversational speech.
Key Challenges:
● Fine-tune Whisper/wav2vec2 for 15+ Indian languages with code-mixing
● Train speaker diarization models handling overlapping speech
● Build voice activity detection for noisy environments
● Develop audio quality assessment using CNNs
● Optimize models for real-time mobile inference (quantization, pruning)
Tech Stack: PyTorch, TorchAudio, Kaldi, ESPnet, model compression techniques
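To ground the voice-activity-detection challenge above: the classic baseline a trained VAD model has to beat in noisy environments is a simple per-frame energy gate. This sketch is that baseline only (the frame length and threshold are illustrative defaults), not a production detector.

```python
import math

def frame_energy_db(frame):
    """RMS energy of one audio frame in dBFS (samples in [-1, 1])."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return -120.0 if rms == 0 else 20.0 * math.log10(rms)

def simple_vad(samples, frame_len=160, threshold_db=-40.0):
    """Mark each frame as speech (True) or silence (False).

    A trained model replaces the fixed threshold with learned features;
    this energy gate is the baseline it must outperform in noise.
    """
    flags = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        flags.append(frame_energy_db(frame) > threshold_db)
    return flags
```

The weakness is obvious once you run it on real audio: keyboard clicks and traffic noise carry plenty of energy, which is why the role calls for a learned detector rather than a threshold.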
4. Intelligence & Reasoning Layer
Create the query understanding and reasoning system. Build hybrid retrieval combining dense embeddings with graph traversal, train ranking models for result quality, develop proactive insight detection, and fine-tune LLMs for conversational queries.
Key Challenges:
● Train re-ranking models for temporal query results
● Fine-tune LLMs for Hindi-English conversational queries
● Build classification models for query intent and temporal scope
● Develop anomaly detection for proactive insights
● Handle distribution shift as user behavior evolves
Tech Stack: PyTorch, sentence-transformers, LLM fine-tuning (LoRA, QLoRA), scikit-learn
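A minimal sketch of the query intent / temporal-scope classification described above, as a keyword-rule baseline. The intent labels and patterns are invented for illustration; the trained classifiers this role builds would replace the lookup tables.

```python
import re

# Rule-based baseline; a fine-tuned classifier replaces these tables.
INTENT_PATTERNS = {
    "recall":  r"\b(what did|who said|when did|remind me what)\b",
    "search":  r"\b(find|show|look up)\b",
    "summary": r"\b(summari[sz]e|recap|overview)\b",
}
TEMPORAL_PATTERNS = {
    "recent": r"\b(today|yesterday|this week)\b",
    "past":   r"\b(last (week|month|year)|months? ago|ago)\b",
}

def classify_query(text):
    """Return (intent, temporal_scope) for a conversational query."""
    lower = text.lower()
    intent = next(
        (k for k, p in INTENT_PATTERNS.items() if re.search(p, lower)),
        "other",
    )
    scope = next(
        (k for k, p in TEMPORAL_PATTERNS.items() if re.search(p, lower)),
        "any",
    )
    return intent, scope
```

Even this crude baseline shows why the two labels matter: intent routes the query to the right retrieval strategy, while temporal scope constrains which slice of the knowledge graph gets searched.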
5. Multi-Agent Systems & Orchestration
Design agent orchestration where specialized AI agents collaborate. Train classifier models for routing queries, build reward models for agent evaluation, develop action prediction models, and create meta-learning systems that improve over time.
Key Challenges:
● Train intent classification for agent routing
● Build RL-based systems for multi-step action planning
● Develop evaluation models for agent output quality
● Create meta-learning pipelines for continuous improvement
● Handle conflicting agent recommendations with trained arbitration models
Tech Stack: PyTorch, Ray for distributed training, custom RL implementations
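To illustrate the arbitration problem above in its simplest form: when agents disagree, one baseline is confidence-weighted voting over their proposed actions. The agent names and actions here are hypothetical, and in the real system a trained model would produce (and calibrate) the confidences.

```python
from collections import defaultdict

def arbitrate(recommendations):
    """Resolve conflicting agent recommendations by weighted voting.

    `recommendations` is a list of (agent, action, confidence) tuples.
    A trained arbitration model would supply calibrated confidences;
    here they are assumed given.
    """
    scores = defaultdict(float)
    for agent, action, confidence in recommendations:
        scores[action] += confidence
    # Deterministic tie-break on action name keeps behavior reproducible.
    return max(sorted(scores), key=lambda a: scores[a])
```

Weighted voting fails when one agent is systematically overconfident, which is precisely the gap a learned arbitration model closes.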
6. NeoCore SDK & ML Infrastructure
Build enterprise ML APIs with custom model serving. Design multi-tenant architecture with model versioning, build A/B testing infrastructure, implement model monitoring and drift detection, and create auto-scaling inference pipelines.
Key Challenges:
● Sub-100ms inference at scale with model optimization
● Multi-tenant model serving with resource isolation
● A/B testing infrastructure for model experiments
● Automated retraining pipelines on concept drift
● Custom domain fine-tuning for enterprise clients
Tech Stack: FastAPI, model serving (TorchServe, TensorFlow Serving), MLOps tools, Docker/K8s
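The "automated retraining on concept drift" challenge above usually starts with a drift metric. Here is a minimal sketch using the Population Stability Index (PSI) over one feature; the 0.2 trigger is a common rule of thumb, not Neo's actual threshold, and a real pipeline would monitor many features plus model outputs.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a training-time feature
    sample (`expected`) and a live sample (`actual`)."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    total = 0.0
    for e, a in zip(proportions(expected), proportions(actual)):
        e, a = max(e, eps), max(a, eps)   # avoid log(0) on empty bins
        total += (a - e) * math.log(a / e)
    return total

def should_retrain(expected, actual, threshold=0.2):
    """PSI > 0.2 is a common retraining trigger; tune per feature."""
    return psi(expected, actual) > threshold
```

Wired into the monitoring stack, a check like this turns drift from a pager alert into an automated retraining kickoff.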
Technical Stack You'll Master
ML/DL Frameworks: PyTorch (primary), TensorFlow/Keras, JAX
Model Training: Distributed training, mixed precision, gradient accumulation, hyperparameter tuning
Model Optimization: Quantization, pruning, distillation, ONNX, TensorRT
MLOps: Experiment tracking (Weights & Biases, MLflow), model versioning, CI/CD for ML
Speech/NLP: Transformers, wav2vec2, Whisper, BERT variants, custom architectures
Traditional ML: Scikit-learn, XGBoost, clustering, dimensionality reduction
Infrastructure: Python async, distributed systems, GPU optimization, streaming pipelines
Data: Graph databases, vector databases, real-time analytics
What Success Looks Like
3 Months:
● Custom speaker diarization model in production with >85% accuracy
● Fine-tuned embedding model powering memory search
● ML pipeline processing 10K+ conversations daily with <500ms latency
● First enterprise deployments live
6 Months:
● Edge-optimized models reducing cloud inference costs by 60%
● Proactive insight detection using unsupervised learning
● Multi-agent workflows with trained routing and arbitration
● A/B testing infrastructure validating model improvements
12 Months:
● Automated retraining pipelines maintaining model quality
● You've built an ML engineering team
● Core AI systems are defensible competitive moats
● Models outperform generic foundation models on domain tasks
Who You Are
Must-Have:
● 2-5 years building and deploying ML/DL models in production serving real users at scale
● Strong PyTorch or TensorFlow expertise: training, optimization, debugging, deployment
● End-to-end ML ownership: data pipeline → model training → production → monitoring → iteration
● Deep learning fundamentals: architectures (CNNs, RNNs, Transformers), optimization, regularization
● Production ML systems: model serving, A/B testing, monitoring, retraining pipelines
● Python expert: async programming, optimization, profiling, debugging
● System design: distributed systems, high throughput, low latency, GPU optimization
● Pragmatic builder: ship fast, validate with data, iterate based on metrics
Strong Plus:
● Speech processing (ASR, diarization, TTS) or NLP (NER, embeddings, generation)
● Knowledge graphs and graph neural networks
● Model compression and edge deployment (quantization, pruning, distillation)
● LLM fine-tuning (LoRA, RLHF, prompt engineering)
● Multi-agent systems and reinforcement learning
● Indian language experience (Hindi, Tamil, Telugu, etc.)
● Open-source ML contributions or research publications
● Experience with Hugging Face ecosystem
Why This Role is Special
Greenfield ML Problems: Train models for problems that don't have pre-trained solutions - Indian accent diarization, Hindi-English entity linking, temporal conversation understanding. Build from first principles.
Own the Full Stack: Not just calling APIs. Train models, build data pipelines, optimize for edge, deploy at scale, monitor quality, iterate based on metrics.
Founding Team Equity: Meaningful equity in a fast-growing startup defining a new category.
Exceptional Team: Work with technical founders (IIT Madras AI background) who understand ML deeply. Small team, high autonomy, first-principles thinking.
Real Impact: Your models power how families stay connected, professionals manage relationships, and enterprises build conversation intelligence.
Market Timing: Ambient computing is nascent. The models you build will set standards for conversational AI infrastructure.
What We Offer
Location: Bangalore (Onsite - we ship hardware, need to be hands-on)
Culture: High autonomy, ship-focused, weekly demos, direct feedback
Perks: Learning budget, conference passes, MacBook Pro + GPU workstation, full ML experimentation budget
Equity: Meaningful ownership in a fast-growing startup
How We Work
Ship weekly: Models reach production every week, not quarters
First principles: Question assumptions, validate with ablation studies
Deep work: Protected focus blocks for training runs, batched meetings
Direct communication: No corporate BS, honest technical feedback
AI-assisted development: Leverage Claude/Copilot for 3-4x productivity
Experiment rigorously: Track everything, A/B test model changes, data-driven decisions
Interview Stages
1. Initial Screening (30 min): Chat about your ML background and approach to a real Neo problem
2. Technical Deep Dive (2 hours):
○ ML fundamentals discussion (architectures, optimization, debugging)
○ System design for ML at scale
○ Coding: implement a model component in PyTorch
○ Live model debugging/optimization exercise
3. Founder Chat (1 hour): Team meet, vision alignment, compensation discussion
Real Problems You'll Solve (Examples)
1. Train a speaker diarization model that handles 4+ speakers in Hindi-English code-mixed conversations with background noise
2. Fine-tune an embedding model for semantic search where "What did Sarah say about the budget?" retrieves conversations from 3 months ago
3. Build a temporal NER system that links "my manager" mentioned today to "Priya" from last week's conversation
4. Optimize a Transformer model from 200ms to <50ms latency for mobile deployment without accuracy loss
5. Design an RL system where agents learn to proactively remind users of forgotten commitments
These aren't interview questions. These are Tuesday problems.

