Artificial Intelligence

• Design and execute performance benchmarks across AI, HPC, and storage platforms
• Run and tune AI inference workloads using frameworks and serving tools such as PyTorch, TensorFlow, Triton, and NVIDIA NIMs, together with vector databases
• Benchmark large-scale retrieval-augmented generation (RAG) pipelines, covering data ingestion, retrieval, and inference performance
• Profile and optimize MPI and multi-node distributed applications
• Compile and debug C/C++, Python, and CUDA-based codes across heterogeneous systems
• Generate automated test scripts and benchmarking workflows (e.g., with Bash, Python, or Slurm job scripts)
• Analyze and visualize results using Excel, Jupyter, or reporting tools; create comparison graphs and KPIs
• Write clear, concise performance reports for both technical and non-technical stakeholders
• Present findings internally and externally, translating results into architectural guidance for field engineers and sales teams
• Collaborate with system engineers, product managers, and partners to tune and improve software/hardware stack performance
• Validate and tune performance on storage systems including parallel file systems (e.g., Lustre, GPFS), object storage, and NVMe over Fabrics
• Contribute to internal tooling to automate test cycles and performance regression tracking

Senior Benchmark and Performance Engineer – AI and Storage Systems

Job Level

Tech Stack

About the role

Requirements