Lemurian Labs

Runtime Engineer

full-time

Location Type: Hybrid

Location: San Francisco, California, United States

About the role

  • Design, develop, maintain and improve our multi-target runtime
  • Use the latest parallelization and partitioning techniques to automate the generation and use of highly optimized kernels
  • Rapidly prototype new ideas and explore them in a data-driven way
  • Benchmark and analyze the outputs produced by our optimizing compiler on target hardware
  • Work closely with our product team to understand the evolving needs of ML engineers and drive improvements in architecture
  • Build tools to collect performance data and analyze bottlenecks

Requirements

  • A deep understanding of asynchronous, concurrent programming.
  • 4+ years of experience with C/C++ (C++14 or newer).
  • An understanding of HW architecture (vector vs scalar registers and instructions, memory hierarchies).
  • Knowledge of operating system kernel development or hypervisor development.
  • Experience developing or maintaining libraries like CUDA or ROCm.
  • Experience with GPU programming.
  • Experience with high performance computing (HPC).
  • Master's or PhD degree in computer science, or equivalent practical experience.
  • Knowledge of DL frameworks such as PyTorch, JAX or Triton.
  • Experience with programming large compute clusters.

Benefits

  • Equity
  • Company bonus opportunities
  • Medical, dental, and vision benefits
  • Retirement savings plan
  • Supplemental wellness benefits
Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
C, C++, C++14, asynchronous programming, concurrent programming, GPU programming, high performance computing, CUDA, ROCm, DL frameworks
Soft Skills
collaboration, communication, problem-solving, data-driven exploration, benchmarking, analysis
Certifications
Master's degree in computer science, PhD in computer science