
AI Model Serving Specialist
Rackspace Technology
Full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Salary
💰 $82,300 - $140,580 per year
Job Level
Mid-Level, Senior
Tech Stack
Cloud, Docker, Grafana, Kubernetes, Prometheus, Python, VMware
About the role
- Enable enterprise customers to operationalize AI workloads by deploying and optimizing model-serving platforms (e.g., NVIDIA Triton, vLLM, KServe) within Rackspace’s Private Cloud and Hybrid environments.
- Package and deploy ML/LLM models on Triton, vLLM, or KServe within Kubernetes clusters.
- Tune performance for latency and throughput SLAs.
- Work with VMware VCF9, NSX-T, and vSAN ESA to ensure GPU resource allocation and multi-tenancy.
- Implement RBAC, encryption, and compliance controls for sovereign/private cloud customers.
- Integrate models with Rackspace’s Unified Inference API and API Gateway for multi-tenant routing.
- Support RAG and agentic workflows by connecting to vector databases and context stores.
- Configure telemetry for GPU utilization, request tracing, and error monitoring.
Requirements
- Hands-on experience with **NVIDIA Triton**, **vLLM**, or similar serving stacks.
- Strong knowledge of **Kubernetes**, **GPU scheduling**, and **CUDA/MIG**.
- Familiarity with **VMware VCF9**, NSX-T networking, and vSAN storage classes.
- Proficiency in **Python** and containerization (Docker).
- Understanding of **observability stacks** (Prometheus, Grafana) and **FinOps principles**.
- Exposure to **RAG architectures**, vector DBs, and secure multi-tenant environments.
- Excellent problem-solving and customer-facing communication skills.
Benefits
- Our compensation reflects the cost of labor across several US geographic markets.
- The compensation package may also include incentive compensation in the form of an annual bonus or other incentives, equity awards, and an Employee Stock Purchase Plan (ESPP).
- Learn more about benefits at Rackspace.
Applicant Tracking System Keywords
Hard skills
NVIDIA Triton, vLLM, KServe, Kubernetes, GPU scheduling, CUDA, Python, Docker, Prometheus, Grafana
Soft skills
problem-solving, customer-facing communication