Salary
💰 $150,000 - $220,000 per year
Tech Stack
Docker, Kubernetes
About the role
- Deepgram is the leading voice AI platform for developers building speech-to-text (STT), text-to-speech (TTS), and full speech-to-speech (STS) offerings.
- 200,000+ developers build with Deepgram’s voice-native foundational models – accessed through APIs or as self-managed software – due to our unmatched accuracy, latency, and pricing.
- The Opportunity: Voice is the most natural modality for human interaction with machines. However, current sequence modeling paradigms based on jointly scaling model and data cannot deliver voice AI capable of universal human interaction. The challenges are rooted in fundamental data problems posed by audio: real-world audio data is scarce and enormously diverse, spanning a vast space of voices, speaking styles, and acoustic conditions. Even if billions of hours of audio were accessible, its inherent high dimensionality creates computational and storage costs that make training and deployment prohibitively expensive at world scale. We believe that entirely new paradigms for audio AI are needed to overcome these challenges and make voice interaction accessible to everyone.
- The Role: Deepgram is seeking a highly skilled and versatile Machine Learning Engineer to join our Research Staff team. This role focuses on scaling training systems for speech-related technologies, building internal tools, and driving innovation in data strategies. You'll work at the intersection of machine learning, data infrastructure, and internal tooling to support our mission of building world-class speech recognition and synthesis systems.
- Key Responsibilities:
  - Scalable Model Training: Architect and manage horizontally scalable training systems for Speech-to-Text (STT) and Text-to-Speech (TTS) models across diverse domains, including but not limited to non-English languages, specialized use cases, and customer-specific applications. These systems include data preparation and management, training pipelines, and automated evaluation tooling.
  - Tooling & Accessibility: Design and implement internal UIs and tools that make ML systems and workflows accessible to non-technical stakeholders across the company. These UIs should provide transparency and flexibility for internally built tooling.
  - Infrastructure & Tools: Oversee and manage training tooling, job orchestration, experiment tracking, and data storage.
- The Challenge: We are seeking Members of the Research Staff who:
  - See "unsolved" problems as opportunities to pioneer entirely new approaches
  - Can identify the one critical experiment that will validate or kill an idea in days, not months
  - Have the vision to scale successful proofs-of-concept 100x
  - Are obsessed with using AI to automate and amplify their own impact
- If you find yourself energized... you might be the researcher we need.
Requirements
- Strong experience in training large-scale machine learning systems, particularly in STT or related speech domains.
- Proficiency with orchestration and infrastructure tools like Kubernetes, Docker, and Prefect.
- Familiarity with ML lifecycle tools such as MLflow.
- Experience building internal tools or dashboards for non-technical users.
- Hands-on experience with data engineering practices for unstructured audio and text data.
- Comfortable working in cross-functional teams that include researchers, engineers, and product stakeholders.