The Role
You'll be the architect and owner of Neo's AI infrastructure. This means training custom models for our unique use cases, building production ML pipelines, and creating the reasoning systems that make Neo intelligent. You'll work across the full ML lifecycle - from data pipelines to model training to production deployment.
What You'll Own
1. Custom Model Development & Training
Build specialized models that foundation models can't provide. Train speaker diarization for Indian accents, fine-tune embedding models for conversational memory, develop custom NER for Hindi-English code-mixing, and optimize models for edge deployment.
Key Challenges:
● Train speaker diarization models on Indian multi-speaker conversations with code-mixing
● Fine-tune embedding models for semantic search across temporal context
● Build custom NER/entity linking for Hindi-English mixed conversations
● Optimize transformer models for mobile deployment with <100ms latency
● Handle class imbalance in emotion detection and intent classification
Tech Stack: PyTorch/TensorFlow for model training, Hugging Face for fine-tuning, ONNX/TensorRT for optimization
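To make the <100ms edge-latency target concrete, here is a minimal sketch of the kind of latency gate such work gets measured against. The callable and payloads are placeholders for a real optimized model and its inputs; only the timing harness is real.

```python
import time

def p95_latency_ms(infer, payloads, warmup=3):
    """Measure p95 wall-clock latency (ms) of an inference callable.

    `infer` and `payloads` stand in for a real model call and a batch of
    inputs; this harness just times whatever callable it is given.
    """
    for p in payloads[:warmup]:          # warm caches before timing
        infer(p)
    samples = []
    for p in payloads:
        start = time.perf_counter()
        infer(p)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    idx = max(0, int(round(0.95 * len(samples))) - 1)
    return samples[idx]

def within_budget(p95_ms, budget_ms=100.0):
    """The <100ms mobile-deployment target above, as a hard gate."""
    return p95_ms < budget_ms
```

In practice a gate like this would sit in CI after each quantization or pruning experiment, so a regression past the budget fails the build rather than shipping.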
2. Memory Architecture & ML Pipeline
Build the brain that remembers everything. Design temporal knowledge graphs that ingest conversations, extract entities and relationships using custom-trained models, and enable longitudinal pattern detection. Own the full ML pipeline from data ingestion to model inference to graph updates.
Key Challenges:
● Bi-temporal data models with real-time updates
● Entity linking across noisy conversational transcripts
● Relationship extraction using fine-tuned sequence models
● Pattern detection with unsupervised learning (clustering, anomaly detection)
● Privacy-preserving embeddings and federated learning
Tech Stack: PyTorch for custom models, Neo4j/graph databases, vector databases (Qdrant), streaming pipelines
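As a flavor of the bi-temporal modeling mentioned above, here is a minimal sketch of the "valid time vs. transaction time" distinction. All names (`Fact`, `as_of`) are illustrative, not Neo's actual schema; times are plain integers to keep the example small.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    # Bi-temporal: valid time (when it was true in the world) vs.
    # transaction time (when the system recorded it).
    entity: str
    value: str
    valid_from: int
    valid_to: Optional[int]       # None = still true
    recorded_at: int

def as_of(facts, entity, valid_at, known_at):
    """Return the value of `entity` at world-time `valid_at`,
    using only facts the system had recorded by `known_at`."""
    candidates = [
        f for f in facts
        if f.entity == entity
        and f.recorded_at <= known_at
        and f.valid_from <= valid_at
        and (f.valid_to is None or valid_at < f.valid_to)
    ]
    if not candidates:
        return None
    # Later-recorded facts win, so corrections override earlier records.
    return max(candidates, key=lambda f: f.recorded_at).value
```

The payoff of the two time axes: the system can answer both "who was my manager in March?" and "what did we *believe* in March?", which is exactly what longitudinal pattern detection over conversations needs.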
3. Audio Processing & Speech ML
Own the end-to-end speech pipeline. Train/fine-tune ASR models for Indian languages, build speaker diarization systems, develop audio quality assessment models, and optimize for edge deployment. Handle the unique challenges of Indian conversational speech.
Key Challenges:
● Fine-tune Whisper/wav2vec2 for 15+ Indian languages with code-mixing
● Train speaker diarization models handling overlapping speech
● Build voice activity detection for noisy environments
● Develop audio quality assessment using CNNs
● Optimize models for real-time mobile inference (quantization, pruning)
Tech Stack: PyTorch, TorchAudio, Kaldi, ESPnet, model compression techniques
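To ground the voice-activity-detection challenge above: the classic baseline a trained VAD model has to beat in noisy environments is a simple per-frame energy gate. This sketch is that baseline only (the frame length and threshold are illustrative defaults), not a production detector.

```python
import math

def frame_energy_db(frame):
    """RMS energy of one audio frame in dBFS (samples in [-1, 1])."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return -120.0 if rms == 0 else 20.0 * math.log10(rms)

def simple_vad(samples, frame_len=160, threshold_db=-40.0):
    """Mark each frame as speech (True) or silence (False).

    A trained model replaces the fixed threshold with learned features;
    this energy gate is the baseline it must outperform in noise.
    """
    flags = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        flags.append(frame_energy_db(frame) > threshold_db)
    return flags
```

The weakness is obvious once you run it on real audio: keyboard clicks and traffic noise carry plenty of energy, which is why the role calls for a learned detector rather than a threshold.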
4. Intelligence & Reasoning Layer
Create the query understanding and reasoning system. Build hybrid retrieval combining dense embeddings with graph traversal, train ranking models for result quality, develop proactive insight detection, and fine-tune LLMs for conversational queries.
Key Challenges:
● Train re-ranking models for temporal query results
● Fine-tune LLMs for Hindi-English conversational queries
● Build classification models for query intent and temporal scope
● Develop anomaly detection for proactive insights
● Handle distribution shift as user behavior evolves
Tech Stack: PyTorch, sentence-transformers, LLM fine-tuning (LoRA, QLoRA), scikit-learn
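A minimal sketch of the query intent / temporal-scope classification described above, as a keyword-rule baseline. The intent labels and patterns are invented for illustration; the trained classifiers this role builds would replace the lookup tables.

```python
import re

# Rule-based baseline; a fine-tuned classifier replaces these tables.
INTENT_PATTERNS = {
    "recall":  r"\b(what did|who said|when did|remind me what)\b",
    "search":  r"\b(find|show|look up)\b",
    "summary": r"\b(summari[sz]e|recap|overview)\b",
}
TEMPORAL_PATTERNS = {
    "recent": r"\b(today|yesterday|this week)\b",
    "past":   r"\b(last (week|month|year)|months? ago|ago)\b",
}

def classify_query(text):
    """Return (intent, temporal_scope) for a conversational query."""
    lower = text.lower()
    intent = next(
        (k for k, p in INTENT_PATTERNS.items() if re.search(p, lower)),
        "other",
    )
    scope = next(
        (k for k, p in TEMPORAL_PATTERNS.items() if re.search(p, lower)),
        "any",
    )
    return intent, scope
```

Even this crude baseline shows why the two labels matter: intent routes the query to the right retrieval strategy, while temporal scope constrains which slice of the knowledge graph gets searched.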
5. Multi-Agent Systems & Orchestration
Design agent orchestration where specialized AI agents collaborate. Train classifier models for routing queries, build reward models for agent evaluation, develop action prediction models, and create meta-learning systems that improve over time.
Key Challenges:
● Train intent classification for agent routing
● Build RL-based systems for multi-step action planning
● Develop evaluation models for agent output quality
● Create meta-learning pipelines for continuous improvement
● Handle conflicting agent recommendations with trained arbitration models
Tech Stack: PyTorch, Ray for distributed training, custom RL implementations
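To illustrate the arbitration problem above in its simplest form: when agents disagree, one baseline is confidence-weighted voting over their proposed actions. The agent names and actions here are hypothetical, and in the real system a trained model would produce (and calibrate) the confidences.

```python
from collections import defaultdict

def arbitrate(recommendations):
    """Resolve conflicting agent recommendations by weighted voting.

    `recommendations` is a list of (agent, action, confidence) tuples.
    A trained arbitration model would supply calibrated confidences;
    here they are assumed given.
    """
    scores = defaultdict(float)
    for agent, action, confidence in recommendations:
        scores[action] += confidence
    # Deterministic tie-break on action name keeps behavior reproducible.
    return max(sorted(scores), key=lambda a: scores[a])
```

Weighted voting fails when one agent is systematically overconfident, which is precisely the gap a learned arbitration model closes.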
6. NeoCore SDK & ML Infrastructure
Build enterprise ML APIs with custom model serving. Design multi-tenant architecture with model versioning, build A/B testing infrastructure, implement model monitoring and drift detection, and create auto-scaling inference pipelines.
Key Challenges:
● Sub-100ms inference at scale with model optimization
● Multi-tenant model serving with resource isolation
● A/B testing infrastructure for model experiments
● Automated retraining pipelines on concept drift
● Custom domain fine-tuning for enterprise clients
Tech Stack: FastAPI, model serving (TorchServe, TensorFlow Serving), MLOps tools, Docker/K8s
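The "automated retraining on concept drift" challenge above usually starts with a drift metric. Here is a minimal sketch using the Population Stability Index (PSI) over one feature; the 0.2 trigger is a common rule of thumb, not Neo's actual threshold, and a real pipeline would monitor many features plus model outputs.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a training-time feature
    sample (`expected`) and a live sample (`actual`)."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
        return [c / len(values) for c in counts]

    total = 0.0
    for e, a in zip(proportions(expected), proportions(actual)):
        e, a = max(e, eps), max(a, eps)   # avoid log(0) on empty bins
        total += (a - e) * math.log(a / e)
    return total

def should_retrain(expected, actual, threshold=0.2):
    """PSI > 0.2 is a common retraining trigger; tune per feature."""
    return psi(expected, actual) > threshold
```

Wired into the monitoring stack, a check like this turns drift from a pager alert into an automated retraining kickoff.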
Technical Stack You'll Master
ML/DL Frameworks: PyTorch (primary), TensorFlow/Keras, JAX
Model Training: Distributed training, mixed precision, gradient accumulation, hyperparameter tuning
Model Optimization: Quantization, pruning, distillation, ONNX, TensorRT
MLOps: Experiment tracking (Weights & Biases, MLflow), model versioning, CI/CD for ML
Speech/NLP: Transformers, wav2vec2, Whisper, BERT variants, custom architectures
Traditional ML: Scikit-learn, XGBoost, clustering, dimensionality reduction
Infrastructure: Python async, distributed systems, GPU optimization, streaming pipelines
Data: Graph databases, vector databases, real-time analytics
What Success Looks Like
3 Months:
● Custom speaker diarization model in production with >85% accuracy
● Fine-tuned embedding model powering memory search
● ML pipeline processing 10K+ conversations daily with <500ms latency
● First enterprise deployments live
6 Months:
● Edge-optimized models reducing cloud inference costs by 60%
● Proactive insight detection using unsupervised learning
● Multi-agent workflows with trained routing and arbitration
● A/B testing infrastructure validating model improvements
12 Months:
● Automated retraining pipelines maintaining model quality
● You've built an ML engineering team
● Core AI systems are defensible competitive moats
● Models outperform generic foundation models on domain tasks
Who You Are
Must-Have:
● 2-5 years building and deploying ML/DL models in production serving real users at scale
● Strong PyTorch or TensorFlow expertise: training, optimization, debugging, deployment
● End-to-end ML ownership: data pipeline → model training → production → monitoring → iteration
● Deep learning fundamentals: architectures (CNNs, RNNs, Transformers), optimization, regularization
● Production ML systems: model serving, A/B testing, monitoring, retraining pipelines
● Python expert: async programming, optimization, profiling, debugging
● System design: distributed systems, high throughput, low latency, GPU optimization
● Pragmatic builder: ship fast, validate with data, iterate based on metrics
Strong Plus:
● Speech processing (ASR, diarization, TTS) or NLP (NER, embeddings, generation)
● Knowledge graphs and graph neural networks
● Model compression and edge deployment (quantization, pruning, distillation)
● LLM fine-tuning (LoRA, RLHF, prompt engineering)
● Multi-agent systems and reinforcement learning
● Indian language experience (Hindi, Tamil, Telugu, etc.)
● Open-source ML contributions or research publications
● Experience with Hugging Face ecosystem
Why This Role is Special
Greenfield ML Problems: Train models for problems that don't have pre-trained solutions - Indian accent diarization, Hindi-English entity linking, temporal conversation understanding. Build from first principles.
Own the Full Stack: Not just calling APIs. Train models, build data pipelines, optimize for edge, deploy at scale, monitor quality, iterate based on metrics.
Founding Team Equity: Meaningful equity in a fast-growing startup defining a new category.
Exceptional Team: Work with technical founders (IIT Madras AI background) who understand ML deeply. Small team, high autonomy, first-principles thinking.
Real Impact: Your models power how families stay connected, professionals manage relationships, and enterprises build conversation intelligence.
Market Timing: Ambient computing is nascent. The models you build will set standards for conversational AI infrastructure.
What We Offer
Location: Bangalore (Onsite - we ship hardware, need to be hands-on)
Culture: High autonomy, ship-focused, weekly demos, direct feedback
Perks: Learning budget, conference passes, MacBook Pro + GPU workstation, full ML experimentation budget
Equity: Meaningful ownership in a fast-growing startup
How We Work
Ship weekly: Models reach production every week, not quarters
First principles: Question assumptions, validate with ablation studies
Deep work: Protected focus blocks for training runs, batched meetings
Direct communication: No corporate BS, honest technical feedback
AI-assisted development: Leverage Claude/Copilot for 3-4x productivity
Experiment rigorously: Track everything, A/B test model changes, data-driven decisions
Interview Stages
1. Initial Screening (30 min): Chat about your ML background and approach to a real Neo problem
2. Technical Deep Dive (2 hours):
○ ML fundamentals discussion (architectures, optimization, debugging)
○ System design for ML at scale
○ Coding: implement a model component in PyTorch
○ Live model debugging/optimization exercise
3. Founder Chat (1 hour): Team meet, vision alignment, compensation discussion
Real Problems You'll Solve (Examples)
1. Train a speaker diarization model that handles 4+ speakers in Hindi-English code-mixed conversations with background noise
2. Fine-tune an embedding model for semantic search where "What did Sarah say about the budget?" retrieves conversations from 3 months ago
3. Build a temporal NER system that links "my manager" mentioned today to "Priya" from last week's conversation
4. Optimize a Transformer model from 200ms to <50ms latency for mobile deployment without accuracy loss
5. Design an RL system where agents learn to proactively remind users of forgotten commitments
These aren't interview questions. These are Tuesday problems.

