Webflow

Research Engineer, Scaling

full-time

Origin: 🇺🇸 United States • California

Salary

💰 $180,000 - $300,000 per year

Job Level

Mid-Level, Senior

Tech Stack

Linux, Node.js, Python, PyTorch

About the role

  • Build systems that let every team and every robot go faster: training more often, evaluating more reliably, and deploying better models to our growing fleet
  • Transform prototypes into production-scale infrastructure for learning and inference, enabling larger training runs and maximizing edge compute utilization
  • Take high-agency ownership of scaling capabilities in distributed training and/or inference
  • Ensure that compute is never the bottleneck and we always have more compute available than data
  • Enable large-scale (1000+ GPU) training on a billion-plus frames of robot data, including fault tolerance, distributed ops, and experiment management
  • Optimize high-throughput, datacenter-scale distributed inference for world models, including building the world's fastest diffusion inference engine
  • Improve low-latency on-device inference for robot policies with quantization, scheduling, distillation and more

Requirements

  • You must be scaling-pilled, and believe that scale will enable humanoid robots to exist
  • Python and/or C++ programming experience
  • An intuitive understanding of training or inference scaling and what makes models run fast or slow
  • Hands-on experience with distributed training (TorchTitan/Accelerate/DeepSpeed, FSDP/ZeRO, NCCL)
  • Multi-node debugging and experiment management experience
  • Depth in inference performance: TensorRT or similar graph compilers, batching/scheduling, and serving systems
  • Real familiarity with quantization (PTQ, QAT; calibration strategies; INT8/FP8; libraries such as TensorRT ModelOpt, bitsandbytes, or equivalent)
  • Experience writing or tuning CUDA/Triton kernels and leveraging vectorization, tensor cores, and memory hierarchy
  • Familiarity with Linux, PyTorch, Triton/CUDA (Tech stack: Linux; Python / C++; PyTorch / TorchTitan / TensorRT; Triton / CUDA)
  • Degree in Computer Science or a related field (listed under Ideal Experiences)