AI Research Engineer – Image/Video Foundation Models

GenPeach AI

Full-time

Posted on: 3/31/2026

Location Type: Remote

About the role

Implement and iterate on image/video generative model ideas (architecture, losses, conditioning, sampling, pre-training, distillation, post-training)
Own training performance end-to-end (distributed training, throughput, memory, stability, debugging scaling failure modes)
Build the experimentation loop (evals, ablations, reproducibility tooling, reporting, decision hygiene)
Build and improve VLMs for image/video captioning (data recipes, training strategies, model variants, evaluation)
Run high-iteration research: read papers when useful, implement ideas, validate empirically
Create captioning pipelines that improve generation training and product quality
Partner with inference/product to ship under real constraints (latency, cost, reliability, rollout safety)
Build demos and prototypes to showcase capabilities and accelerate iteration

Requirements

Strong Python and PyTorch skills (4+ years of experience)
Experience implementing and training deep learning models (generative models, VLMs, LLMs, vision/video, or adjacent)
Solid understanding of training dynamics, optimization, and practical debugging
Ability to drive projects end-to-end with minimal supervision

Benefits

Visa sponsorship (where applicable); we’ll make a strong effort to relocate you to Switzerland or Poland if desired
Remote-friendly: work fully remote, hybrid, or on-site from our hubs
Regular offsites and in-person events to collaborate and connect
Flexible PTO

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Python, PyTorch, deep learning models, generative models, VLMs, LLMs, training dynamics, optimization, debugging, image/video captioning

Soft Skills

project management, independence, empirical validation, communication