Tech Stack
Cloud, Distributed Systems, Kubernetes, NFS
About the role
- Own critical customer case escalations end-to-end, including deep root cause analysis and mitigation strategies.
- Act as the technical expert for Infinia.
- Build tooling to improve time to resolution (TTR).
- Write code fixes and enhancements to improve Infinia.
- Utilize AI-powered debugging, log analysis, and system pattern recognition tools to accelerate resolution.
- Be the subject-matter expert on Infinia internals: metadata handling, storage fabric interfaces, performance tuning, AI integration, etc.
- Reproduce complex customer issues and propose product improvements and workarounds.
- Author and maintain detailed runbooks, performance tuning guides, and RCA documentation.
- Partner with Field CTOs, Solutions Architects, and Sales Engineers to ensure customer success.
Requirements
- 10+ years in enterprise storage, distributed systems, or cloud infrastructure support/engineering
- Deep understanding of file systems and storage protocols (POSIX, NFS, S3)
- Proven debugging skills at system/protocol/app levels (e.g., strace, tcpdump, perf)
- Hands-on experience with AI/ML data pipelines, container orchestration (Kubernetes), and GPU-based architectures
- Exceptional communication and executive reporting skills
- Experience using AI tools (e.g., log pattern analysis, LLM-based summarization, automated RCA tooling) to accelerate diagnostics and reduce MTTR
Benefits
- Competitive salary
- Full-time remote work option
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
root cause analysis, debugging, performance tuning, file systems, AI integration, container orchestration, AI/ML data pipelines, code fixes, system pattern recognition, metadata handling
Soft skills
communication, executive reporting, customer success