
Senior Performance Engineer – AI Platforms
Red Hat
Full-time
Location Type: Hybrid
Location: Boston • Massachusetts • North Carolina • United States
Salary
💰 $136,320 - $225,090 per year
About the role
- Define and track key performance indicators (KPIs) and service level objectives (SLOs) for large-scale, distributed LLM inference services in Kubernetes/OpenShift
- Participate in the performance roadmap for distributed inference, including multi-node and multi-GPU scaling studies, interconnect performance analysis, and competitive benchmarking
- Formulate performance test plans and execute performance benchmarks to characterize performance, drive improvements, and detect performance issues through data analysis and visualization
- Develop and maintain tools, scripts, and automated solutions that streamline performance benchmarking tasks.
- Collaborate with cross-functional engineering teams to identify and address performance issues.
- Partner with DevOps to bake performance gates into GitHub Actions/OpenShift Pipelines.
- Explore and experiment with emerging AI technologies relevant to software development, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling.
- Triage field and customer escalations related to performance; distill findings into upstream issues and product backlog items.
- Publish results, recommendations, and best practices through internal reports, presentations, external blogs, and official documentation.
- Represent the team at internal and external conferences, presenting key findings and strategies.
Requirements
- 5+ years of overall software engineering experience, including at least 3 years focused on performance engineering or systems-level development.
- Strong understanding of operating systems and distributed systems
- Foundational knowledge of AI and LLM inference workflows
- Proficiency in Python for data and machine learning workflows, along with strong Linux and Bash skills
- Excellent communication skills, with the ability to translate performance data into clear business and customer value
- Passion for and commitment to open source principles
- Master’s or PhD in Computer Science, AI, or a related field is considered a plus
- Experience contributing to open source projects or leading community initiatives
- Hands-on experience with Kubernetes or OpenShift
- Familiarity with performance and observability tools such as perf, eBPF tools, Nsight Systems, and PyTorch Profiler
- Experience with modern LLM inference stacks such as vLLM, TensorRT-LLM, Hugging Face TGI, and Triton Inference Server
Benefits
- Comprehensive medical, dental, and vision coverage
- Flexible Spending Account - healthcare and dependent care
- Health Savings Account - high deductible medical plan
- Retirement 401(k) with employer match
- Paid time off and holidays
- Paid parental leave plans for all new parents
- Leave benefits including disability, paid family medical leave, and paid military leave
- Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
performance engineering • systems-level development • Python • Linux • Bash • Kubernetes • OpenShift • AI • LLM inference workflows • performance benchmarking
Soft Skills
communication • collaboration • data analysis • problem-solving • presentation • reporting • community leadership • customer focus • open source commitment • proactive exploration
Certifications
Master’s in Computer Science • PhD in Computer Science • AI certification