Omilia - Conversational Intelligence

Senior Speech Applied Scientist

full-time

Location Type: Remote

Location: Remote • 🇬🇷 Greece

Job Level

Senior

Tech Stack

Node.js, PyTorch, TensorFlow

About the role

  • Pioneer Research: Research and implement state-of-the-art approaches for multi-modal LLMs within an end-to-end, speech-to-speech dialog system architecture.
  • Train & Optimize: Drive the training, fine-tuning, and optimization of our multi-modal LLMs. Your focus will be on enabling full-duplex conversational capabilities, advanced tool-calling, robust barge-in detection, stronger reasoning, in-context learning, and context-aware natural speech generation.
  • Build Data Flywheels: Design and implement robust data pipelines for the entire multi-modal LLM lifecycle, including data curation, preparation, model training, and rigorous evaluation.
  • Scale Our Infrastructure: Develop and optimize our training infrastructure to enable fast, large-scale experimentation (multi-GPU and multi-node training), dramatically accelerating our S2S model development cycle.
  • Collaborate & Deploy: Work closely with product and engineering teams to transform your research models into robust, scalable, and deployable services that our customers will love.
  • Publish Your Work: Publish pioneering research at top-tier academic conferences while successfully deploying systems into production environments.

Requirements

  • A PhD or MSc in Computer Science, Electrical Engineering, Computational Linguistics, or a related field with a focus on speech processing or deep learning.
  • Proven experience in one or more of the following areas: Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), or Spoken Language Understanding (SLU).
  • Deep hands-on experience with deep learning frameworks such as PyTorch, TensorFlow, DeepSpeed, or Lightning.
  • Strong background in training, fine-tuning, and evaluating Large Language Models (LLMs), especially in multi-modal or speech-related contexts.
  • Experience with large-scale model training on distributed, multi-GPU/multi-node infrastructure.
  • A strong publication record in top-tier conferences (e.g., ICASSP, Interspeech, NeurIPS, ACL) is a plus.
  • A proactive, collaborative, and innovative mindset with a passion for solving challenging problems.

Benefits

  • Fixed compensation;
  • Long-term employment with vacation counted in working days;
  • Professional development (courses, training, etc.);
  • Being part of successful cutting-edge technology products that are making a global impact in the service industry;
  • Proficient and fun-to-work-with colleagues;
  • Apple gear.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
multi-modal LLMs, Automatic Speech Recognition (ASR), Text-to-Speech (TTS), Natural Language Processing (NLP), Spoken Language Understanding (SLU), deep learning frameworks, PyTorch, TensorFlow, DeepSpeed, Lightning
Soft skills
collaborative, innovative, proactive, problem-solving
Certifications
PhD, MSc