Tether.to

Senior Research Engineer – Multimodal, Video Foundation Model

full-time

Location Type: Remote

Location: Anywhere in North America

About the role

  • Pioneer multimodal and video-centric research that moves fast and breaks ground, contributing directly to usable prototypes and scalable systems.
  • Design and implement novel AI architectures for multimodal language models, integrating text, visual, and audio modalities.
  • Engineer scalable training and inference pipelines for large-scale multimodal datasets, distributed across thousands of GPUs.
  • Optimize systems and algorithms for efficient data processing, model execution, and pipeline throughput.
  • Build modular tools for preprocessing, analyzing, and managing multimodal data assets (e.g., images, video, text).
  • Collaborate cross-functionally with research and engineering teams to translate cutting-edge model innovations into production-grade solutions.
  • Prototype generative AI applications showcasing new capabilities of multimodal foundation models in real-world products.
  • Develop benchmarking tools to rigorously evaluate model performance across diverse multimodal tasks.

Requirements

  • Bachelor’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience.
  • Expertise in Python and PyTorch, including practical experience with the full development pipeline, from data processing and data loading to training, inference, and optimization.
  • Experience working with large-scale text data, or (bonus) interleaved data spanning audio, video, image, and/or text.
  • Direct hands-on experience developing or benchmarking at least one of the following: LLMs, vision-language models, audio-language models, or generative video models.
  • First-author publications at leading AI conferences such as CVPR, ICCV, ECCV, ICML, ICLR, or NeurIPS.
  • PhD in Computer Vision, Machine Learning, NLP, Computer Science, Applied Statistics, or a closely related field (nice to have).

Benefits

  • Health insurance
  • Flexible work arrangements
  • Professional development opportunities