Senior Speech Applied Scientist

Omilia - Conversational Intelligence

Full-time

Posted on: 10/2/2025

Location Type: Remote

Location: Remote • 🇬🇷 Greece

Job Level

Senior

Tech Stack

Node.js, PyTorch, TensorFlow

About the role

Pioneer Research: Research and implement state-of-the-art approaches for multi-modal LLMs within an end-to-end, speech-to-speech dialog system architecture.
Train & Optimize: Drive the training, fine-tuning, and optimization of our multi-modal LLMs. Your focus will be on enabling full-duplex conversational capabilities, advanced tool-calling, robust barge-in detection, stronger reasoning, in-context learning, and context-aware natural speech generation.
Build Data Flywheels: Design and implement robust data pipelines for the entire multi-modal LLM lifecycle, including data curation, preparation, model training, and rigorous evaluation.
Scale Our Infrastructure: Develop and optimize our training infrastructure to enable fast, large-scale experimentation (multi-GPU and multi-node training), dramatically accelerating our S2S model development cycle.
Collaborate & Deploy: Work closely with product and engineering teams to transform your research models into robust, scalable, and deployable services that our customers will love.
Publish Your Work: Publish pioneering research at top-tier academic conferences while successfully deploying systems into production environments.

Requirements

A PhD or MSc in Computer Science, Electrical Engineering, Computational Linguistics, or a related field with a focus on speech processing or deep learning.
Proven experience in one or more of the following areas: Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), or Spoken Language Understanding (SLU).
Extensive hands-on experience with deep learning frameworks and training libraries such as PyTorch, TensorFlow, DeepSpeed, or Lightning.
Strong background in training, fine-tuning, and evaluating Large Language Models (LLMs), especially in multi-modal or speech-related contexts.
Experience with large-scale model training on distributed, multi-GPU/multi-node infrastructure.
A strong publication record in top-tier conferences (e.g., ICASSP, Interspeech, NeurIPS, ACL) is a plus.
A proactive, collaborative, and innovative mindset with a passion for solving challenging problems.

Benefits

Fixed compensation;
Long-term employment with vacation counted in working days;
Professional development opportunities (courses, training, etc.);
Being part of successful cutting-edge technology products that are making a global impact in the service industry;
Proficient and fun-to-work-with colleagues;
Apple gear.

Applicant Tracking System Keywords

Hard skills

multi-modal LLMs, Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), Spoken Language Understanding (SLU), deep learning frameworks, PyTorch, TensorFlow, DeepSpeed, Lightning

Soft skills

collaborative, innovative, proactive, problem-solving

Certifications

PhD, MSc