Tech Stack
AWS, Azure, Cloud, Docker, Google Cloud Platform, GraphQL, Kubernetes, Linux, Spring, Terraform
About the role
- Act as the first and second line of defense for technical issues (VMs, networking, API errors, orchestration tools, GPU utilization, etc.)
- Manage tickets, live chat, and calls across Premium/Platinum support tiers, communicating clearly and empathetically with both technical and non-technical users
- Collaborate cross-functionally: escalate critical bugs, provide logs and context to engineering, and contribute to product improvement feedback loops
- Write clear, technically accurate documentation and playbooks to improve support efficiency and self-service
- Improve diagnostics tooling, chatbots, and macros to reduce MTTR
- Support QBRs and onboarding sessions for top-tier customers alongside Customer Success Managers
- Help enterprise ML engineers, DevOps teams, and CTOs solve complex challenges involving GPU workloads, infrastructure orchestration, and deployment at scale
Requirements
- 2–4 years of experience in technical support, cloud operations, or SRE environments
- Hands-on experience with Linux environments (logs, bash, process/thread management)
- Hands-on experience with cloud compute platforms (AWS/GCP/Azure or similar)
- Hands-on experience with containerization, orchestration, or infrastructure tooling (Kubernetes, Docker, Terraform, SLURM)
- Hands-on experience with AI/ML workloads (inference, training, fine-tuning)
- Experience troubleshooting APIs, including REST and GraphQL calls
- Strong communicator with the ability to simplify complex technical issues
- Comfortable working in a fast-paced startup environment where priorities shift rapidly
- Familiarity with GPU-accelerated cloud environments