Luma AI

Research Engineer - Evaluations

full-time

Location: California • 🇺🇸 United States

Salary

💰 $220,000 - $280,000 per year

Job Level

Mid-Level, Senior

Tech Stack

Distributed Systems, Python, PyTorch, TensorFlow

About the role

  • Luma is pushing the boundaries of generative AI, building tools that redefine how visual content is created
  • Design and implement scalable pipelines for automated evaluation of generative models, focusing on visual and multimodal outputs (image, video, text, audio)
  • Develop novel metrics and evaluation models capturing fidelity, coherence, temporal consistency, and alignment with human intent
  • Integrate evaluation signals into training loops (including reinforcement learning and reward modeling) to improve model performance
  • Build infrastructure for large-scale regression testing, benchmarking, and monitoring of multimodal generative models
  • Collaborate with researchers running human studies to translate human evaluation frameworks into automated or semi-automated systems
  • Partner with model researchers to identify failure cases and build targeted evaluation harnesses
  • Maintain dashboards, reporting tools, and alerting systems to surface evaluation results to stakeholders
  • Stay current with emerging evaluation techniques in generative AI, multimodal LLMs, and perceptual quality assessment

Requirements

  • Master's or PhD in Computer Science, Machine Learning, or a related technical field (or equivalent industry experience)
  • 3+ years of experience building ML evaluation systems, model pipelines, or large-scale infrastructure
  • Hands-on experience working with visual data (images and/or video), including evaluation, modeling, or data preparation
  • Proficiency in Python and ML frameworks (PyTorch, JAX, or TensorFlow)
  • Familiarity with human-in-the-loop evaluation workflows and how to scale them with automation
  • Strong background in machine learning, with experience in generative models (diffusion, LLMs, multimodal architectures)
  • Strong software engineering skills (CI/CD, testing, data pipelines, distributed systems)
  • Nice to have: Experience with reinforcement learning or reward modeling
  • Nice to have: Prior work on perceptual metrics, multimodal evaluation benchmarks, or retrieval-based evaluation
  • Nice to have: Background in large-scale model training or evaluation infrastructure
  • Nice to have: Experience designing metrics for perceptual quality
  • Nice to have: Familiarity with creative media workflows (film, VFX, animation, digital art)
  • Nice to have: Contributions to open-source evaluation libraries or benchmarks

Spotify

Staff Research Engineer – Music

Lead · full-time · $215k–$307k / year · New York · 🇺🇸 United States
Posted: 1 day ago · Source: jobs.lever.co
AWS, Azure, Cloud, Google Cloud Platform, PyTorch

Rockwell Automation

Research Engineer

Mid · Senior · full-time · Ohio, Wisconsin · 🇺🇸 United States
Posted: 2 days ago · Source: rockwellautomation.wd1.myworkdayjobs.com
GraphQL, Python, SOAP, VMware

Dora Factory

Cryptographic Research Engineer

Mid · Senior · full-time · 🇺🇸 United States
Posted: 2 days ago · Source: jobs.ashbyhq.com
Rust, Solidity, TypeScript

Bristol Myers Squibb

Research Engineering Intern

Entry · internship · $27–$43 · New Jersey · 🇺🇸 United States
Posted: 9 days ago · Source: bristolmyerssquibb.wd5.myworkdayjobs.com
AWS, EC2, Linux, Python, Shell Scripting

Adobe

Research Engineer

Mid · Senior · full-time · $121k–$229k / year · California · 🇺🇸 United States
Posted: 10 days ago · Source: adobe.wd5.myworkdayjobs.com
Python