Senior Technical Product Manager, Observability
Vultr
Senior Technical Product Manager owning the Observability Platform, which provides telemetry and monitoring solutions at Vultr. The role collaborates with teams to ensure observability for GPU clusters and cloud environments.
Tech Stack
Tools & technologies: Cloud, Distributed Systems, Kubernetes
About the role
Key responsibilities & impact
- Own the end-to-end Observability Platform roadmap across telemetry ingestion, querying, visualization, alerting, and retention for large-scale GPU clusters and multi-tenant cloud environments
- Define Vultr's observability strategy across bare metal, VMs, Kubernetes, and managed services, aligned to infrastructure roadmap, reliability goals, and customer experience
- Drive the customer-facing observability surface across dashboards, APIs, telemetry pipelines, and topology-aware insights
- Translate low-level signals across GPU, CPU, memory, storage, and network into actionable health views, alerts, and debugging workflows for customers
- Work closely with engineering on technical tradeoffs across metrics agents, collectors, data models, telemetry pipelines, APIs, and retention architecture
- Build products for distributed AI environments by understanding how training and inference workloads behave across nodes, clusters, schedulers, and network fabrics
- Define health models that help customers quickly identify degraded nodes, performance anomalies, and cluster bottlenecks at fleet scale
- Ensure new infrastructure and platform launches are observable by design through strong partnership with compute, network, and platform teams
- Stay current on modern observability stacks and AI infrastructure trends, including how GPU workloads change performance analysis, cost attribution, and operational workflows
Requirements
What you’ll need
- 7+ years of product management experience in cloud infrastructure, observability, monitoring, or developer platforms
- Deep understanding of observability and monitoring systems, including metrics, logging, tracing, alerting, and telemetry pipeline architecture
- Experience defining product strategy and roadmaps for platform or infrastructure products at scale
- Strong technical background — ability to engage with engineering on telemetry agents, data models, query engines, retention, and distributed systems
- Experience with GPU, AI/ML, or HPC infrastructure monitoring and the unique observability challenges of training and inference workloads
- Track record of shipping developer- and operator-facing products with measurable impact on reliability, time-to-detect, or operational efficiency
- Experience working across cross-functional teams (engineering, design, marketing, sales) in a fast-paced environment
- Excellent written and verbal communication skills, with the ability to translate complex technical concepts for diverse audiences
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience)
Benefits
Comp & perks
- 100% company-paid insurance premiums for employee medical, dental, and vision plans
- 401(k) plan that matches 100% up to 4%, with immediate vesting
- Professional Development Reimbursement of $2,500 each year
- 11 Holidays + Paid Time Off Accrual + Rollover Plan
- Commitment matters to Vultr! Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
- $500 stipend for remote office setup in first year + $400 each following year
- Internet reimbursement up to $75 per month
- Gym membership reimbursement up to $50 per month
- Company-paid Wellable subscription
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
observability, monitoring systems, telemetry pipeline architecture, metrics, logging, tracing, alerting, GPU monitoring, AI/ML infrastructure, HPC infrastructure
Soft Skills
product management, cross-functional collaboration, communication skills, technical engagement, strategic thinking, problem-solving, customer experience focus, team leadership, adaptability, analytical skills