AI Research Engineer, Model Compression, Quantization

Tether.to

AI Research Engineer at Tether driving innovation in model compression for multimodal AI systems. Focused on efficient deployment of advanced AI models while maintaining high performance.

Posted 5/19/2026 • Full-time • Remote • 🇮🇹 Italy • Mid-Level, Senior

Tech Stack

Tools & technologies

PyTorch

About the role

Key responsibilities & impact

Drive innovation in model compression and efficient deployment for advanced multimodal AI systems, including large language models (LLMs) and vision-language models (VLMs).
Reduce model footprint and computational cost while preserving accuracy, enabling high-performance AI to run efficiently across resource-constrained edge devices.
Apply and advance compression techniques such as quantization, knowledge distillation, and pruning.
Build robust compression pipelines, establish performance and fidelity metrics, and address bottlenecks in production inference.

Requirements

What you’ll need

A degree in Computer Science or related field.
Ideally, a PhD in NLP, Machine Learning, or a related field, complemented by a solid track record in AI R&D (with good publications at A* conferences).
Experience with PyTorch or an equivalent deep learning framework.
Hands-on experience with model quantization, including both Quantization-Aware Training (QAT) and Post-Training Quantization (PTQ).
Research and hands-on experience with knowledge distillation for compressing large models into smaller, efficient ones.
Research and hands-on experience with model pruning for compressing large models into smaller, efficient ones.
Solid understanding of neural network architectures and training processes, including transformers (e.g., LLMs, VLMs), backpropagation, optimization, and fine-tuning techniques.
Familiarity with C++ is a plus (especially for implementing low-level quantization kernels or inference optimizations).

Benefits

Comp & perks

Competitive salary
Flexible work hours
Professional development budget
Home office setup allowance
Global team events

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

model compression, quantization, knowledge distillation, pruning, PyTorch, Quantization-Aware Training, Post-Training Quantization, neural network architectures, transformers, backpropagation

Certifications

PhD in NLP, PhD in Machine Learning