Research Engineer, Multimodal
Character.AI
Research Engineer on the Multimodal team at Character.AI, building advanced AI video and image models. Lead fine-tuning, design architectures, and collaborate to improve multimedia generation.
Posted 4/15/2026 • Full-time • Redwood City • California • 🇺🇸 United States • Mid-Level, Senior • 💰 $225,000 - $400,000 per year
Tech Stack
Tools & technologies: Cloud, Docker, Kubernetes, PyTorch
About the role
Key responsibilities & impact
- Lead fine-tuning and continued training of video generation models, including image-to-video and joint audio-visual generation.
- Design and experiment with novel model architectures for multimodal generation, including multimodal conditioning (voice, structured text, reference images).
- Leverage techniques such as LoRA, RLHF, and full-parameter fine-tuning to improve model quality across diverse visual scenarios.
- Design and build large-scale data pipelines and automated annotation workflows to support continuous model improvement.
- Explore model compression, inference acceleration, and serving optimizations to enable efficient real-time video processing at scale.
Requirements
What you’ll need
- Strong passion for pushing the boundaries of visual AI, with a self-driven, hands-on approach to solving complex technical problems
- Proficient in PyTorch with end-to-end experience across data processing, model training, and deployment
- Solid understanding of video and image generation architectures, including diffusion models, DiT, ControlNet, and SOTA video generation models
- Experience with multimodal model training, including working with audio, vision, and language modalities together
- Experience with distributed training tools (FSDP, DeepSpeed, etc.)
- Experience with large-scale data processing, dataset construction, and automated data cleaning
- Experience with joint audio-visual or speech-conditioned generation models (Nice to Have)
- Experience with AIGC, video effects, character animation, or asset generation products (Nice to Have)
- Familiarity with ML deployment and orchestration (Kubernetes, Slurm, Docker, cloud platforms) (Nice to Have)
- Publications in relevant venues (NeurIPS, ICLR, CVPR, ECCV, ICCV, or similar) (Nice to Have)
Benefits
Comp & perks
- 🩺 Top-notch health coverage for you & your family, with the majority of the premium covered
- 💰 We invest in your future with a generous 401(K) contribution
- 🍼 New parents, we've got you covered with incredible paid leave, up to 20 weeks
- 🌴 4 weeks of PTO to explore, unwind & come back recharged
- 🍽️ Daily in-office catering plus a monthly DoorDash stipend to help keep you fueled no matter where you are
- ✨ Monthly wellness stipend to support you in your health journey
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
PyTorch, video generation models, image-to-video generation, multimodal generation, LoRA, RLHF, model compression, inference acceleration, data processing, automated data cleaning
Soft Skills
self-driven, problem-solving, hands-on approach, passion for visual AI