Red Hat

Principal Software Engineer, AI Model Serving

Full-time

Location: 🇺🇸 United States • North Carolina


Salary

💰 $148,540 - $245,050 per year

Job Level

Lead

Tech Stack

AWS • Azure • Cloud • Go • Kubernetes • Linux • OpenShift • Open Source • Python • PyTorch • TensorFlow

About the role

  • Lead the team strategy and implementation for Kubernetes-native components in Model Serving, including Custom Resources, Controllers, and Operators.
  • Be an influencer and leader in MLOps-related open source communities to help build an active MLOps open source ecosystem for Open Data Hub and OpenShift AI.
  • Act as an MLOps SME within Red Hat by supporting customer-facing discussions, presenting at technical conferences, and evangelizing OpenShift AI within the internal community of practices
  • Architect and design new features for open-source MLOps communities such as Kubeflow and KServe.
  • Provide technical vision and leadership on critical and high-impact projects
  • Mentor, influence, and coach a team of distributed engineers
  • Ensure non-functional requirements including security, resiliency, and maintainability are met
  • Write unit and integration tests and work with quality engineers to ensure product quality
  • Use CI/CD best practices to deliver solutions as productization efforts into RHOAI
  • Contribute to a culture of continuous improvement by sharing recommendations and technical knowledge with team members
  • Collaborate with product management, other engineering, and cross-functional teams to analyze and clarify business requirements
  • Communicate effectively to stakeholders and team members to ensure proper visibility of development efforts
  • Give thoughtful and prompt code reviews
  • Represent RHOAI in external engagements including industry events, customer meetings, and open-source communities
  • Proactively utilize AI-assisted development tools (e.g., GitHub Copilot, Cursor, Claude Code) for code generation, auto-completion, and intelligent suggestions to accelerate development cycles and enhance code quality.
  • Explore and experiment with emerging AI technologies relevant to software development, proactively identifying opportunities to incorporate new AI capabilities into existing workflows and tooling.

Requirements

  • Proven expertise with Kubernetes API development and testing (CRs, Operators, Controllers), including reconciliation logic.
  • Strong background in model serving frameworks (e.g., KServe, vLLM) and distributed inference strategies for LLMs (tensor, pipeline, and data parallelism).
  • Deep understanding of GPU optimization, autoscaling (KEDA/Knative), and low-latency networking (e.g., NVLink, P2P GPU).
  • Experience architecting resilient, secure, and observable systems for model serving, including metrics and tracing.
  • Advanced skills in Go and Python; ability to design APIs for high-performance inference and streaming.
  • Excellent system troubleshooting skills in cloud environments and the ability to innovate in fast-paced environments.
  • Strong communication and leadership skills to mentor teams and represent projects in open-source communities.
  • Autonomous work ethic and passion for staying at the forefront of AI and open source.
  • The following will be considered a plus: existing contributions to one or more MLOps open source projects such as Kubeflow, KServe, RayServe, or vLLM.
  • Familiarity with optimization techniques for LLMs (quantization, TensorRT, Hugging Face Accelerate).
  • Knowledge of end-to-end MLOps workflows, including model registry, explainability, and drift detection.
  • Bachelor's degree in statistics, mathematics, computer science, operations research, or a related quantitative field, or equivalent expertise; Master's or PhD is a big plus.
  • Understanding of how Open Source and Free Software communities work
  • Experience with development for public cloud services (AWS, GCE, Azure)
  • Experience in engineering, consulting or another field related to model serving and monitoring, model registry, explainable AI, deep neural networks, in a customer environment or supporting a data science team
  • Highly experienced in OpenShift
  • Familiarity with popular Python machine learning libraries such as PyTorch, TensorFlow, and Hugging Face.