Abridge

Engineering Manager, Model Inference

Engineering Manager leading the AI inference team at Abridge, which is transforming healthcare delivery with generative AI. The role focuses on building fast, reliable inference systems for clinical conversations.

Posted 5/20/2026 • Full-time • San Francisco • California • 🇺🇸 United States • Mid-Level • Senior • 💰 $220,000 - $270,000 per year

Tech Stack

Tools & technologies
  • PyTorch
  • TensorFlow

About the role

Key responsibilities & impact
  • Lead and grow a high-performing team of AI inference engineers focused on building and scaling infrastructure for Abridge’s products and APIs
  • Own the technical direction of our inference systems—making key decisions around batching, throughput, latency, and GPU utilization
  • Architect and scale inference infrastructure for reliability, efficiency, and observability; lead incident response
  • Benchmark and eliminate bottlenecks throughout the inference stack
  • Partner with ML Research teams on model optimization, quantization, and deployment
  • Develop APIs for AI inference used by both internal teams and external customers
  • Recruit, mentor, and develop engineering talent; establish team processes, engineering standards, and operational excellence
  • Work closely with the GenAI Platform, Data, and Product teams to plan and execute projects that directly impact clinicians and patients

Requirements

What you’ll need
  • 5+ years of engineering experience, including 1+ years in a technical leadership or management role
  • Deep, hands-on experience with ML systems and inference frameworks (e.g., PyTorch, TensorRT, vLLM, TensorFlow)
  • Strong understanding of LLM architecture (e.g., Multi-Head Attention, Multi/Grouped-Query Attention, and common transformer components)
  • Experience with inference optimizations (e.g., batching, quantization, kernel fusion, FlashAttention)
  • Familiarity with GPU characteristics, roofline models, and performance analysis
  • Experience deploying reliable, distributed, real-time systems at scale
  • Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
  • Skilled at hiring and mentorship, with a demonstrated track record of helping engineers grow their skills and careers
  • Strong technical communication and cross-functional collaboration skills
  • Comfortable giving constructive feedback on technical designs and code reviews
  • Track record of thriving in a fast-growing startup and operating with urgency and focus

Benefits

Comp & perks
  • Generous Time Off: 14 paid holidays, flexible PTO for salaried employees, and accrued time off for hourly employees
  • Comprehensive Health Plans: Medical, Dental, and Vision coverage for all full-time employees and their families.
  • Generous HSA Contribution: If you choose a High Deductible Health Plan, Abridge makes monthly contributions to your HSA.
  • Paid Parental Leave: Generous paid parental leave for all full-time employees.
  • Family Forming Benefits: Resources and financial support to help you build your family.
  • 401(k) Matching: Contribution matching to help invest in your future.
  • Personal Device Allowance: Tax-free funds for personal device usage.
  • Pre-tax Benefits: Access to Flexible Spending Accounts (FSA) and Commuter Benefits.
  • Lifestyle Wallet: Monthly contributions for fitness, professional development, coworking, and more.
  • Mental Health Support: Dedicated access to therapy and coaching to help you reach your goals.
  • Sabbatical Leave: Paid Sabbatical Leave after 5 years of employment.
  • Compensation and Equity: Competitive compensation and equity grants for full-time employees.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
  • AI inference
  • inference frameworks
  • PyTorch
  • TensorRT
  • vLLM
  • TensorFlow
  • inference optimizations
  • batching
  • quantization
  • kernel fusion
Soft Skills
  • technical leadership
  • mentorship
  • technical communication
  • cross-functional collaboration
  • constructive feedback
  • team processes
  • operational excellence
  • hiring
  • coaching
  • urgency