Parasail.ai

Capacity and Infrastructure Operations Manager

full-time

Location Type: Hybrid

Location: San Mateo, California, United States

About the role

  • Own real-time fleet utilization: identify and resolve idle capacity, inefficiencies, and demand/supply mismatches.
  • Define utilization targets and operating policies that balance performance, reliability, and cost.
  • Develop policies and processes for lifecycle management of vendor-sourced instances (bring-up, steady state, rebalancing, decommissioning).
  • Partner with Engineering to define requirements and prioritize automations for capacity acquisition, scaling, rebalancing, failovers, and cost controls.
  • Model and monitor GPU unit economics: cost per GPU-hr, marginal cost, blended vendor rates, and cost leakage.
  • Partner with Finance & Product to align customer pricing with underlying vendor economics.
  • Deliver monthly/quarterly reporting on supply-side cost trends and margin performance, including key drivers and recommended actions.
  • Recommend improvements to pricing, contract mix, vendor allocation, and operational policies to expand gross margin.
  • Build and maintain forecasting models to predict demand, burst behavior, seasonality, and reserve requirements.
  • Determine the optimal mix of contract types (on-demand, committed use, short-term) to maximize flexibility and margin.
  • Maintain capacity buffers and contingency plans to protect against vendor outages, degraded performance, or sudden demand spikes.
  • Source, evaluate, and manage relationships with neocloud and GPU infrastructure providers.
  • Negotiate pricing, SLAs, commitments, contractual flexibility, and scaling terms.
  • Create and maintain vendor scorecards (pricing, reliability, latency, responsiveness, and fit).
  • Identify emerging vendors, negotiate trial capacity, and assess cost–performance tradeoffs.
  • Develop a multi-vendor redundancy strategy to minimize single-provider risk.
  • Stand up the core dashboards and operating cadence to monitor and manage key metrics, for example: utilization per GPU family, utilization by contract type, idle capacity, blended cost per GPU-hour, and vendor latency/performance.
  • Help define a practical tool stack for capacity planning and financial analysis.
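
As an illustration of the unit-economics metrics named above (blended cost per GPU-hour, utilization, idle capacity, cost leakage), here is a minimal Python sketch. The `Contract` fields, the vendor names, and every number are hypothetical placeholders rather than Parasail data, and the definitions used (for instance, cost leakage as paid-but-idle GPU-hours times the contract rate) are one reasonable reading of these terms, not the company's own.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """One block of vendor capacity (all fields are illustrative assumptions)."""
    vendor: str
    contract_type: str        # e.g. "committed" or "on-demand"
    gpu_hours_paid: float     # GPU-hours billed by the vendor
    gpu_hours_used: float     # GPU-hours actually serving workloads
    rate_per_gpu_hour: float  # vendor price in USD per GPU-hour


def total_cost(contracts):
    return sum(c.gpu_hours_paid * c.rate_per_gpu_hour for c in contracts)


def blended_rate(contracts):
    """Cost-weighted average vendor rate across all paid GPU-hours."""
    paid = sum(c.gpu_hours_paid for c in contracts)
    return total_cost(contracts) / paid if paid else 0.0


def utilization(contracts):
    """Share of paid GPU-hours that were actually used."""
    paid = sum(c.gpu_hours_paid for c in contracts)
    used = sum(c.gpu_hours_used for c in contracts)
    return used / paid if paid else 0.0


def cost_per_used_gpu_hour(contracts):
    """Total spend divided by used GPU-hours; rises as idle capacity grows."""
    used = sum(c.gpu_hours_used for c in contracts)
    return total_cost(contracts) / used if used else 0.0


def cost_leakage(contracts):
    """Spend on GPU-hours that were paid for but sat idle."""
    return sum((c.gpu_hours_paid - c.gpu_hours_used) * c.rate_per_gpu_hour
               for c in contracts)


# Hypothetical fleet: one committed block and one on-demand block.
fleet = [
    Contract("vendor-a", "committed", 1000.0, 800.0, 2.00),
    Contract("vendor-b", "on-demand", 200.0, 200.0, 3.50),
]

print(f"blended rate:        ${blended_rate(fleet):.2f}/GPU-hr")
print(f"utilization:         {utilization(fleet):.1%}")
print(f"cost per used hour:  ${cost_per_used_gpu_hour(fleet):.2f}/GPU-hr")
print(f"cost leakage:        ${cost_leakage(fleet):.2f}")
```

In this toy fleet, the gap between the blended rate and the cost per used GPU-hour comes entirely from idle committed capacity, which is the kind of signal the dashboards described above would surface.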

Requirements

  • 5+ years in capacity operations, infrastructure operations, technical operations, cloud supply/vendor ops, or a closely related role.
  • Demonstrated experience managing external infrastructure vendors (performance management, SLAs, commercial terms, escalation paths).
  • Strong analytical skills and comfort with unit economics (cost drivers, margin, pricing inputs, forecasting).
  • Experience building operating cadences and dashboards that drive action (not just reporting).
  • Clear communicator who can align Engineering, Finance, Product, and GTM around priorities and tradeoffs.

Benefits

  • Health insurance
  • 401(k) matching
  • Flexible work hours
  • Paid time off
  • Professional development opportunities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
capacity operations, infrastructure operations, cloud supply operations, vendor management, unit economics, forecasting, dashboard creation, operating policies, cost analysis, vendor scorecards

Soft Skills
analytical skills, communication, collaboration, negotiation, problem-solving, relationship management, strategic thinking, reporting, action-oriented, flexibility