Soraban
AI/ML Software Engineer Intern
Soraban | United States | 19 hours ago
Contract | Remote Friendly | Engineering, Information Technology
Soraban (YC21) builds AI-powered Copilots and workflow automation for modern accounting firms. Our platform helps tax firms automate client intake, document processing, preparation workflows, and service delivery.

We are looking for an AI/ML Engineer Intern to help senior engineers deliver impactful, customer-facing ML and Agentic AI products.

Requirements

As an AI Engineer Intern, you will work on every aspect of building AI Agents: data collection and labeling, training ML and LLM models, trying out-of-the-box pre-trained LLMs, building business-relevant evaluations for these Agents, and providing monitoring and production support for these models.

The domain Soraban operates in has many unique problems that have never been solved before, so this position offers exposure to building and shipping AI Agents in production and seeing their impact on an entire industry.

What You’ll Own And Drive

  • Defining data topography for each use case and creating training and eval datasets
  • Training and building ML models or GenAI-based solutions for Soraban-specific business problems
  • Integrating these solutions seamlessly into Soraban software applications
  • Implementing production-grade, real-time deployment, evaluation, and monitoring for the AI Agents
  • Handling multiple such development efforts at the same time

Required

What We’re Looking For

  • BS/MS in Computer Science/Data Science/AI Engineering
  • Well versed in computer science theory and concepts, data structures, and algorithms
  • Well versed in both traditional machine learning algorithms and GenAI-based algorithms (should clearly understand how everything from RNNs, LSTMs, BERT, and Transformers to recent LLMs actually works)
  • Must have demonstrable examples of solving at least a few problems using ML or AI Agents in production (open-ended problems solved by fresh graduates and showcased in a GitHub repo are also welcome)
  • Must show the rigor to understand the process of implementing an AI Agent solution with complete business relevance (i.e., bonus points for someone who can describe the end-to-end thought process behind an AI Agent project where they moved the needle and had definitive business impact)
  • Must have experience solving problems that involve real-world multimodal data (not just an NLP expert or a CV expert; we seek someone who has experience solving problems with multimodal data)
  • Tech stack experience: PyTorch/TensorFlow, HF Transformers/LlamaFactory, Unsloth/Ludwig, Phoenix/Langfuse, Python, Milvus/Chroma/Faiss/Qdrant

Preferred / Plus

  • Someone with a significant thesis, paper, or experience involving the application of Agentic AI to multimodal data problems
  • Someone with clear, demonstrable open-source contributions to any Agent framework, Kaggle challenges (especially multimodal document-reading challenges), LLM training infrastructure, eval infrastructure, embedding techniques, or generation infrastructure, OR an open-source repository of from-scratch ML algorithms or LLM implementations/optimizations
  • Someone who can work with ownership and minimal supervision
  • Experience with document automation tools
  • Familiarity with tech-enabled workflow systems or SaaS tax tools will be a bonus

Internship Details

  • Internship duration: 3 months to start, with potential for extension. The intern will work 20 hours per week.
  • SF Bay Area candidates only. This role is not fully remote; you will be expected to come into the office on a periodic basis for in-person collaboration.
  • No visa sponsorship is available for this role. Candidates must have work authorization in the United States. F-1 students on OPT are eligible.
