Senior Performance Engineer – AI Platforms

Red Hat

Full-time

Location Type: Hybrid

Location: Boston, Massachusetts; North Carolina; United States

Salary

$136,320 - $225,090 per year

About the role

  • Define and track key performance indicators (KPIs) and service level objectives (SLOs) for large-scale, distributed LLM inference services on Kubernetes/OpenShift.
  • Contribute to the performance roadmap for distributed inference, including multi-node and multi-GPU scaling studies, interconnect performance analysis, and competitive benchmarking.
  • Formulate performance test plans and execute benchmarks to characterize performance, drive improvements, and detect regressions through data analysis and visualization.
  • Develop and maintain tools, scripts, and automated solutions that streamline performance benchmarking tasks.
  • Collaborate with cross-functional engineering teams to identify and address performance issues.
  • Partner with DevOps to integrate performance gates into GitHub Actions and OpenShift Pipelines.
  • Explore and experiment with emerging AI technologies relevant to software development, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling.
  • Triage field and customer escalations related to performance; distill findings into upstream issues and product backlog items.
  • Publish results, recommendations, and best practices through internal reports, presentations, external blogs, and official documentation.
  • Represent the team at internal and external conferences, presenting key findings and strategies.

Requirements

  • 5+ years of overall software engineering experience, including at least 3 years focused on performance engineering or systems-level development.
  • Strong understanding of operating systems and distributed systems.
  • Foundational knowledge of AI and LLM inference workflows.
  • Proficiency in Python for data and machine learning workflows, along with strong Linux and Bash skills.
  • Excellent communication skills, with the ability to translate performance data into clear business and customer value.
  • Passion for and commitment to open source principles.
  • Master’s or PhD in Computer Science, AI, or a related field is considered a plus.
  • Experience contributing to open source projects or leading community initiatives.
  • Hands-on experience with Kubernetes or OpenShift.
  • Familiarity with performance and observability tools such as perf, eBPF tools, Nsight Systems, and PyTorch Profiler.
  • Experience with modern LLM inference stacks such as vLLM, TensorRT-LLM, Hugging Face TGI, and Triton Inference Server.

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Flexible Spending Account - healthcare and dependent care
  • Health Savings Account - high deductible medical plan
  • Retirement 401(k) with employer match
  • Paid time off and holidays
  • Paid parental leave plans for all new parents
  • Leave benefits including disability, paid family medical leave, and paid military leave
  • Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
performance engineering, systems-level development, Python, Linux, Bash, Kubernetes, OpenShift, AI, LLM inference workflows, performance benchmarking
Soft Skills
communication, collaboration, data analysis, problem-solving, presentation, reporting, community leadership, customer focus, open source commitment, proactive exploration
Certifications
Master’s in Computer Science, PhD in Computer Science, AI certification