Mirantis

Product Manager – AI Inference, Model Serving

Mirantis, a cloud infrastructure company, is hiring a Product Manager to drive AI inference and model serving and to lead product strategy and lifecycle for its AI-focused solutions.

Posted 5/21/2026 • Full-time • Remote • Texas • 🇺🇸 United States • Senior, Lead

About the role

Key responsibilities & impact
  • Own product strategy, roadmap, and lifecycle for inference and model serving, including serverless inference, dedicated endpoints, autoscaling, routing, KV cache management, and related observability
  • Lead deep technical discovery with NeoClouds, sovereign clouds, and enterprise platform teams, and translate findings into prioritized requirements and architecture direction
  • Partner with engineering on system design trade-offs across runtime integration, GPU scheduling, network, storage, and serving topology, including disaggregated serving and multi-model serving
  • Define positioning grounded in measurable outcomes: latency distributions, throughput per GPU, utilization, tail reliability, and cost per token
  • Drive go-to-market execution: pricing and packaging, reference architectures, sizing guides, PoC playbooks, and direct engagement with customers, analysts, and ecosystem partners

Requirements

What you’ll need
  • 7+ years in product management, technical product management, or a senior technical role owning AI/ML and inference product(s)
  • Strong understanding of production AI inference, including model serving, serverless execution, dedicated endpoints, autoscaling, routing, workload placement, observability, and reliability
  • Proven capability to reason about performance trade-offs across GPU, network, storage, orchestration, and runtime layers, and to translate low-level technical capability into business value such as TTFT, throughput per GPU, and TCO
  • Working knowledge of modern inference runtimes (vLLM, SGLang, TensorRT-LLM, Dynamo, Triton) and the optimization patterns that matter in production: continuous batching, KV cache management, cold starts, prefill versus decode, disaggregated serving, and multi-model serving
  • Credibility with engineering leaders and infrastructure operators, including comfort in production architecture reviews and technical commercial conversations with platform engineering buyers

Benefits

Comp & perks
  • Work with an established Silicon Valley leader in the cloud infrastructure industry.
  • Work with exceptionally passionate, talented and engaging colleagues, helping Fortune 500 and Global 2000 customers implement next-generation cloud technologies.
  • Be a part of cutting-edge, open-source innovation.
  • Thrive in the high-energy environment of a young company where openness, collaboration, risk-taking, and continuous growth are valued.
  • Professional development and training.
  • Attend conferences and working groups.
  • Customized workstation (macOS, Windows).
  • A competitive compensation package with a strong benefits plan and stock options.
