Product Manager – AI Inference, Model Serving

Mirantis

Product Manager driving AI inference and model serving for Mirantis, a cloud infrastructure company. Leading product strategy and lifecycle for cutting-edge AI-focused solutions.

Posted 5/21/2026 • Full-time • Remote • Texas • 🇺🇸 United States • Senior • Lead

About the role

Key responsibilities & impact

Own product strategy, roadmap, and lifecycle for inference and model serving, including serverless inference, dedicated endpoints, autoscaling, routing, KV cache management, and related observability
Lead deep technical discovery with NeoClouds, sovereign clouds, and enterprise platform teams, and translate findings into prioritized requirements and architecture direction
Partner with engineering on system design trade-offs across runtime integration, GPU scheduling, network, storage, and serving topology, including disaggregated serving and multi-model serving
Define positioning grounded in measurable outcomes: latency distributions, throughput per GPU, utilization, tail reliability, and cost per token
Drive go-to-market execution: pricing and packaging, reference architectures, sizing guides, PoC playbooks, and direct engagement with customers, analysts, and ecosystem partners

Requirements

What you’ll need

7+ years in product management, technical product management, or a senior technical role owning AI/ML and inference product(s)
Strong understanding of production AI inference, including model serving, serverless execution, dedicated endpoints, autoscaling, routing, workload placement, observability, and reliability
Proven ability to reason about performance trade-offs across GPU, network, storage, orchestration, and runtime layers, and to translate low-level technical capability into business value, expressed in metrics such as TTFT, throughput per GPU, and TCO
Working knowledge of modern inference runtimes (vLLM, SGLang, TensorRT-LLM, Dynamo, Triton) and the optimization patterns that matter in production: continuous batching, KV cache management, cold starts, prefill versus decode, disaggregated serving, and multi-model serving
Credibility with engineering leaders and infrastructure operators, including comfort in production architecture reviews and technical commercial conversations with platform engineering buyers

Benefits

Comp & perks

Work with an established Silicon Valley leader in the cloud infrastructure industry.
Work with exceptionally passionate, talented, and engaging colleagues, helping Fortune 500 and Global 2000 customers implement next-generation cloud technologies.
Be a part of cutting-edge, open-source innovation.
Thrive in the high-energy environment of a young company where openness, collaboration, risk-taking, and continuous growth are valued.
Professional development and training.
Attend conferences and working groups.
Customized workstation (macOS, Windows).
A competitive compensation package with a strong benefits plan and stock options.

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

AI inference, model serving, serverless execution, autoscaling, routing, observability, GPU scheduling, storage, inference runtimes, performance trade-offs

Soft Skills

product strategy, technical discovery, system design, go-to-market execution, customer engagement, analytical thinking, communication, collaboration, leadership, prioritization