
Platform Engineer
Vectara
Employment Type: Full-time
Location Type: Hybrid
Location: Palo Alto • California • United States
About the role
- Build and maintain infrastructure-as-code (Terraform, Helm) for our AWS EKS and GCP GKE clusters, plus on-premises deployments (including Tanzu and air-gapped environments).
- Own CI/CD pipelines (GitHub Actions, Bazel, ArgoCD) and drive GitOps adoption.
- Deploy, scale, and optimize ML/NLP inference workloads (vLLM, PyTorch, GPU scheduling with various Kubernetes scalers).
- Build and improve observability: Prometheus, Grafana, Datadog, and OpenTelemetry.
- Collaborate with Field Engineering to support PoCs and platform deployments in customer cloud VPCs and on-prem environments.
- Contribute to backend services (Java 21, Python, gRPC) and platform features.
- Improve system reliability, scalability, and developer experience across the engineering org.
Requirements
- 2+ years in platform engineering, DevOps, SRE, or backend infrastructure roles.
- Strong Kubernetes experience (deployment, debugging, scaling — not just `kubectl apply`).
- Hands-on with infrastructure-as-code: Terraform, Helm, or Pulumi.
- Experience with at least one major cloud provider (AWS preferred; GCP or Azure also valued).
- Proficiency in one or more of: Go, Python, Java. Comfortable reading and contributing to backend codebases.
- Working knowledge of CI/CD systems (GitHub Actions, Bazel, ArgoCD, or similar).
- Solid fundamentals in Linux, networking, and distributed systems.
Benefits
- 100% paid medical, dental, and vision coverage for employees.
- Option of a Health Savings Account (HSA) or Flexible Spending Account (FSA).
- Generous paid time off (PTO) plus paid sick time and holidays.
- Professional development and training opportunities.
- Company virtual happy hours, team-building activities, and more.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
infrastructure-as-code, Terraform, Helm, AWS EKS, GCP GKE, CI/CD, GitOps, ML/NLP, Java, Python
Soft Skills
collaboration, problem-solving, communication, reliability improvement, scalability enhancement, developer experience