What You'll Do
- Implement, train, and evaluate state-of-the-art TTS models to generate high-quality, expressive speech targeted for our key products.
- Collaborate with language specialists and data labelers to organize the collection and maintenance of essential speech data.
- Contribute to the development of our core speech synthesis inference engine.
- Optimize models for production runtime.
- Work with the systems and infrastructure teams to assist in the integration and deployment of TTS models into our production environment.
- Analyze model performance and work with product stakeholders to identify areas of improvement. Contribute to the iterative enhancement of our TTS technology.
- Stay current with the latest research and advancements in the TTS field and apply new techniques to our systems.
What We're Looking For
- 3+ years of professional experience in machine learning, with a strong focus or interest in speech-related topics like TTS or ASR.
- Excellent programming skills in Python and strong experience with PyTorch. Proficiency in C++ is a big plus.
- Strong knowledge of, and experience implementing, key machine learning concepts such as transformers, speech tokenizers, diffusion, flow matching, LoRA, and GANs.
- Familiarity with cloud technologies such as Docker and Kubernetes.
- Experience with TorchScript or ONNX is a plus.
- A track record of working with an entire machine learning pipeline, from data preprocessing to model training and evaluation, in particular for TTS and ASR models.
- A collaborative spirit and the ability to work effectively with cross-functional teams.
- A drive to tackle complex technical challenges and an eagerness to learn and grow in the field of speech synthesis.
We recognize that not every candidate will meet every listed requirement. If you believe your skills and experiences position you to contribute meaningfully in this role, we encourage you to apply. You may offer strengths and perspectives we have not yet considered.
Compensation includes salary, equity, comprehensive healthcare, paid time off, and other benefits. Our recruiting team will provide a specific salary range based on location and years of experience.
By working at SoundHound AI, you will join hundreds of employees across the globe who strive every day to create exceptional AI-powered experiences for customers, employees, and patients.
We are a values-driven company that is supportive of one another, open and honest, undaunted by challenges, nimble and focused, and determined to excel and win. Our mission is to build voice AI for the world and use our global, diverse perspectives to achieve real generational breakthroughs.
SoundHound ensures that individuals with disabilities are provided reasonable accommodations to participate in the interview process, perform essential job functions, and receive other employment benefits.
Learn more about our philosophy, benefits, and culture at https://www.soundhound.com/careers.
To view our job applicant privacy policy, please visit https://static.soundhound.com/corpus/ta/applicantprivacynotice.html.