VP of Engineering

Hyperbolic

The VP of Engineering leads the design and evolution of AI cloud infrastructure at Hyperbolic Labs, building GPU-native cloud systems and managing world-class engineering teams.

Posted 6/12/2026 • Full-time • San Francisco • California • 🇺🇸 United States • Lead

Tech Stack

Tools & technologies
Cloud, Distributed Systems, Kubernetes, Linux, Ray

About the role

Key responsibilities & impact
  • Lead the design and evolution of our AI cloud platform
  • Define the architecture for GPU orchestration, compute scheduling, networking, storage, and distributed systems
  • Make critical decisions regarding cloud infrastructure, bare-metal deployments, and platform scalability
  • Personally participate in architecture reviews and key technical initiatives
  • Build and scale large GPU clusters supporting customer workloads
  • Design systems for GPU provisioning, scheduling, utilization optimization, and capacity management
  • Drive platform reliability and performance for AI training and inference workloads
  • Partner closely with engineering teams on infrastructure requirements for next-generation AI systems
  • Remain deeply involved in engineering decisions and technical direction
  • Contribute directly to infrastructure design and implementation efforts
  • Review architecture proposals, system designs, and major infrastructure changes
  • Act as the technical escalation point for complex infrastructure challenges
  • Establish best practices for Kubernetes, observability, CI/CD, security, and operational excellence
  • Build SRE and Platform Engineering functions from the ground up
  • Define reliability standards including SLOs, SLIs, incident response processes, and capacity planning
  • Drive automation across infrastructure operations
  • Recruit and develop world-class Infrastructure, Platform, and SRE teams
  • Build a high-performance engineering culture focused on ownership and execution
  • Partner with executive leadership on company strategy and infrastructure investments
  • Manage infrastructure budgets, vendor relationships, and capacity planning

Requirements

What you’ll need
  • 12+ years building and operating large-scale infrastructure systems
  • Experience leading infrastructure organizations while remaining hands-on technically
  • Previous experience building or operating a cloud platform at scale
  • Experience building GPU infrastructure or AI/ML compute platforms
  • Proven track record scaling infrastructure in high-growth startup environments
  • Expert-level Kubernetes knowledge
  • Experience designing and operating multi-region cloud infrastructure
  • Strong understanding of Linux, networking, distributed systems, and storage architecture
  • Experience with Infrastructure-as-Code and automation frameworks
  • Deep expertise in observability, monitoring, and reliability engineering
  • Experience building highly available production systems
  • Strongly preferred: experience with GPU scheduling, Slurm, Kubernetes GPU operators, Ray, or distributed training systems
  • Experience managing thousands of GPUs in production environments
  • Background supporting AI training and inference platforms

Benefits

Comp & perks
  • Health insurance
  • Professional development
  • Flexible work arrangements
  • Paid time off

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
GPU orchestration, compute scheduling, distributed systems, Kubernetes, Infrastructure-as-Code, automation frameworks, observability, monitoring, reliability engineering, AI/ML compute platforms
Soft Skills
leadership, technical decision-making, collaboration, problem-solving, team development, strategic planning, execution, communication, ownership, operational excellence