
Senior Speech Applied Scientist
Omilia - Conversational Intelligence
full-time
Location Type: Remote
Location: Remote • 🇬🇷 Greece
Job Level
Senior
Tech Stack
Node.js, PyTorch, TensorFlow
About the role
- Pioneer Research: Research and implement state-of-the-art approaches for multi-modal LLMs within an end-to-end, speech-to-speech dialog system architecture.
- Train & Optimize: Drive the training, fine-tuning, and optimization of our multi-modal LLMs. Your focus will be on enabling full-duplex conversational capabilities, advanced tool-calling, robust barge-in detection, stronger reasoning, in-context learning, and context-aware natural speech generation.
- Build Data Flywheels: Design and implement robust data pipelines for the entire multi-modal LLM lifecycle, including data curation, preparation, model training, and rigorous evaluation.
- Scale Our Infrastructure: Develop and optimize our training infrastructure to enable fast, large-scale experimentation (multi-GPU and multi-node training), dramatically accelerating our speech-to-speech (S2S) model development cycle.
- Collaborate & Deploy: Work closely with product and engineering teams to transform your research models into robust, scalable, and deployable services that our customers will love.
- Publish Your Work: Publish pioneering research at top-tier academic conferences while successfully deploying systems into production environments.
Requirements
- A PhD or MSc in Computer Science, Electrical Engineering, Computational Linguistics, or a related field with a focus on speech processing or deep learning.
- Proven experience in one or more of the following areas: Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), or Spoken Language Understanding (SLU).
- Deep hands-on experience with deep learning frameworks such as PyTorch, TensorFlow, DeepSpeed, or Lightning.
- Strong background in training, fine-tuning, and evaluating Large Language Models (LLMs), especially in multi-modal or speech-related contexts.
- Experience with large-scale model training on distributed, multi-GPU/multi-node infrastructure.
- A strong publication record in top-tier conferences (e.g., ICASSP, Interspeech, NeurIPS, ACL) is a plus.
- A proactive, collaborative, and innovative mindset with a passion for solving challenging problems.
Benefits
- Fixed compensation;
- Long-term employment with vacation counted in working days;
- Professional development (courses, training, etc.);
- Being part of successful cutting-edge technology products that are making a global impact in the service industry;
- Proficient and fun-to-work-with colleagues;
- Apple gear.
Applicant Tracking System Keywords
Hard skills
multi-modal LLMs, Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), Spoken Language Understanding (SLU), deep learning frameworks, PyTorch, TensorFlow, DeepSpeed, Lightning
Soft skills
collaborative, innovative, proactive, problem-solving
Certifications
PhD, MSc