Tech Stack
Kafka, Python, PyTorch, TensorFlow
About the role
- Lead the design and implementation of end-to-end Machine Learning infrastructure for Voice AI products
- Shape technical vision and own strategic architecture decisions for ML training and inference systems
- Mentor and grow a team of Machine Learning engineers and collaborate with research scientists
- Architect robust, modular ML pipelines for model experimentation, feature extraction, and production inference
- Collaborate with data engineering to improve audio dataset quality, labeling pipelines, and feature engineering
- Optimize models for latency, memory, and real-time performance on CPU/GPU/edge hardware
- Introduce frameworks for continual learning, model versioning, and A/B testing in production
- Work cross-functionally with AI research, Infrastructure, and Product teams to accelerate experimentation and deployment velocity
- Stay current with advancements in Voice AI, deep learning, and multimodal model architectures
Requirements
- 10+ years of experience in Machine Learning Systems and ML workflows, including at least 3 years in a technical leadership capacity
- Advanced proficiency in Python and ML frameworks like PyTorch, TensorFlow, or JAX
- Strong understanding of deep learning architectures and training objectives (RNNs, LSTMs, CNNs, Transformers, CTC) and their application in accent translation, noise cancellation, acoustic modeling, language modeling, and language translation
- Experience deploying ML models to production (e.g., via ONNX, TensorRT, TorchScript, or custom inference stacks)
- Familiarity with audio data challenges (large file sizes, time-series features, metadata handling)
- Experience with Voice AI models such as ASR, TTS, and speaker verification
- Familiarity with real-time data processing frameworks (Kafka, Flink, Druid, Pinot)
- Familiarity with MLOps, feature engineering, model training and inference workflows
- Experience with labeling tools, audio annotation platforms, or human-in-the-loop annotation pipelines
- Experience at a high-growth startup or tech company operating at scale
- Deep experience with ML tooling for training and serving models (e.g., PyTorch, ONNX, Hugging Face Transformers, torchaudio)
- Experience deploying real-time ASR, TTS, or voice synthesis models in production
- Background in DSP, audio augmentation, or working with noisy or multilingual datasets