Specialty Solutions Engineer – AI-Modern Data Center
Thinkahead Consultant Psychologist Pty Ltd
Full-time
Location Type: Remote
Location: United States
Salary
💰 $250,000 - $300,000 per year
About the role
- Lead technical discovery and solution positioning for enterprise AI infrastructure opportunities, translating business outcomes into reference architectures and value propositions.
- Own pre-sales deliverables, including architectures, diagrams, sizing, BOMs, and proposals.
- Deliver executive and technical presentations focused on NVIDIA AI Enterprise (NVAIE), LLM training/inference, and accelerated analytics.
- Guide clients through technology selection, roadmap development, and business case creation for large-scale AI initiatives.
- Architect end-to-end AI platforms using NVIDIA DGX/HGX, Blackwell (B100/B200), Hopper (H100/H200), Grace/Grace-Hopper (GH200), L40S, NVLink/NVSwitch, InfiniBand (NVIDIA Quantum), RoCE, and DPU offload patterns.
- Design solutions leveraging AMD Instinct (MI300/MI300X) as appropriate, articulating trade-offs in CPU/GPU/DPU, interconnect topology, and cluster scale-out.
- Integrate NVIDIA AI Enterprise components (CUDA, cuDNN, TensorRT, Triton Inference Server, RAPIDS) and common ML frameworks (PyTorch, TensorFlow) with orchestration platforms.
- Integrate on-prem GPU clusters with cloud AI services (AWS SageMaker, Azure ML, GCP Vertex AI) for hybrid bursting and workload mobility.
- Advise on MLOps platforms (MLflow, Kubeflow, Weights & Biases), CI/CD, and governance for multi-tenant AI environments.
- Build and maintain relationships with NVIDIA, AMD, Run:AI, OEMs, and networking vendors, aligning campaigns with partner programs and incentives.
- Contribute feedback to vendor engineering and product teams, coordinating joint enablement and reference designs.
- Create repeatable assets such as validated designs, sizing calculators, POV guides, deployment runbooks, and competitive playbooks.
- Mentor SEs and delivery consultants, leading internal training on AI scheduling, performance tuning, and operational best practices.
- Lead proof-of-value (POV) and proof-of-concept (POC) engagements, including success criteria, benchmarking, and recommendations.
Requirements
- Proven experience architecting and deploying NVIDIA GPU-based AI platforms (NVAIE, DGX/HGX, Blackwell B100/B200, Hopper H100/H200, Grace/GH200, L40S) and/or AMD Instinct MI300/MI300X.
- Experience with Run:AI, NVIDIA Base Command, Kubernetes (GPU Operator), Slurm, and/or vSphere with Tanzu for AI/ML workloads.
- Advanced knowledge of AI/ML frameworks and libraries (PyTorch, TensorFlow, RAPIDS, Triton, CUDA, cuDNN, TensorRT).
- Strong understanding of high-speed networking for AI (InfiniBand, RoCE, DPU integration, NVLink, NVSwitch).
- Experience integrating on-prem AI infrastructure with public cloud AI services (AWS SageMaker, Azure ML, GCP Vertex AI) and hybrid architectures.
- Experience leading pre-sales campaigns, POV/POC management, and executive presentations.
- Ability to identify and leverage emerging datacenter and AI technologies to drive innovative solutions.
- Strong analytical skills for troubleshooting complex environments, including storage, compute, networking, and AI workloads.
- Skilled at guiding clients through decision-making with clear, strategic recommendations.
- Proven track record of working effectively across sales, engineering, and vendor teams.
- Knowledge of datacenter security best practices and regulatory compliance.
Benefits
- Medical, Dental, and Vision Insurance
- 401(k)
- Paid company holidays
- Paid time off
- Paid parental and caregiver leave
- Plus more! See https://www.aheadbenefits.com/ for additional details.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
NVIDIA AI Enterprise, NVIDIA DGX, NVIDIA HGX, AMD Instinct, PyTorch, TensorFlow, CUDA, cuDNN, TensorRT, MLOps
Soft Skills
analytical skills, strategic recommendations, mentoring, relationship building, communication, leadership, troubleshooting, guiding clients, collaboration, presentation skills