Liquid AI

Technical Staff Member – Edge Inference Engineer

full-time

Location Type: Hybrid

Location: San Francisco, California, United States

About the role

  • Implement and optimize inference kernels for CPU, NPU, and GPU architectures across diverse edge hardware
  • Develop quantization strategies (INT4, INT8, FP8) that maximize compression while preserving model quality under strict memory budgets
  • Contribute to llama.cpp and other open-source inference frameworks, including new model architectures (audio, vision)
  • Profile and optimize end-to-end inference pipelines to achieve sub-100ms time-to-first-token on target devices
  • Collaborate with ML researchers to understand model architectures and identify optimization opportunities specific to Liquid Foundation Models

Requirements

  • 5+ years of experience in systems programming with strong C++ proficiency
  • Embedded software engineering experience or work on resource-constrained systems
  • Understanding of ML fundamentals at the linear algebra level (how matrix operations, attention, and quantization work)
  • Experience with hardware architecture concepts: cache hierarchies, memory bandwidth, SIMD/vectorization
  • Contributions to llama.cpp, ExecuTorch, or similar inference frameworks (nice-to-have)
  • Experience with Rust for systems programming (nice-to-have)
  • Background in custom accelerator development (TPU, NPU) or work at companies like SambaNova, Cerebras, Groq, or Google/Amazon accelerator teams (nice-to-have)
  • Quantitative degree (mathematics, physics, or similar) combined with engineering experience (nice-to-have)

Benefits

  • Competitive base salary with equity in a unicorn-stage company
  • We pay 100% of medical, dental, and vision premiums for employees and dependents
  • 401(k) matching up to 4% of base pay
  • Unlimited PTO plus company-wide Refill Days throughout the year