Cloudera

Staff Software Engineer – AI Systems, Runtimes

Staff Software Engineer leading the architecture and delivery of a cloud-native AI platform at Cloudera, with a focus on Kubernetes and AI capabilities across hybrid clouds and data centers.

Posted 6/2/2026 • Full-time • San Jose • California • 🇺🇸 United States • Lead

Tech Stack

Tools & technologies
Docker, Go, JavaScript, Kubernetes, Node.js, Python, Rust

About the role

Key responsibilities & impact
  • Design and implement elegant, scalable application services (Go/Node.js) that wrap AI capabilities for enterprise use.
  • Lead the deployment of inference servers (vLLM, Triton) using KServe, KubeRay, or Knative to ensure serverless-style scaling for AI workloads.
  • Build internal tooling, SDKs, and "AI Gateways" that enhance team agility and simplify the integration of Foundation Models (Llama, GPT) into product features.
  • Architect robust Retrieval-Augmented Generation (RAG) pipelines and prompt management services that integrate seamlessly with vector databases and enterprise data sources.
  • Partner with UI engineers, UX designers, and Product Management to ensure the AI platform is not just powerful, but highly usable for internal developers.
  • Ensure AI workloads are secure, multi-tenant, and optimized for GPU resource scheduling (MIG, fractional GPUs) within Kubernetes.

Requirements

What you’ll need
  • Bachelor’s degree with 6+ years of software engineering experience (or equivalent Master’s/PhD tenure), including 2+ years focused on AI/ML systems.
  • Expert proficiency in Python (for the AI ecosystem) and strong competence in a systems language such as Go, Rust, or C++ (for high-performance serving layers).
  • Deep understanding of LLM deployment challenges and runtimes (e.g., vLLM, ONNX, TorchServe, Triton).
  • Familiarity with quantization techniques (AWQ, GPTQ) to optimize model size/speed.
  • Experience building complex workflows using tools like LangChain or LlamaIndex, and deploying them on containerized infrastructure (Docker/Kubernetes).
  • Ability to navigate the rapidly changing AI landscape, filtering hype from practical engineering solutions, and driving technical alignment across teams.

Benefits

Comp & perks
  • Generous PTO Policy
  • Support for work-life balance with Unplugged Days
  • Flexible WFH Policy
  • Mental & Physical Wellness programs
  • Phone and Internet Reimbursement program
  • Access to Continued Career Development
  • Comprehensive Benefits and Competitive Packages
  • Paid Volunteer Time
  • Employee Resource Groups

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Go, Node.js, Python, Rust, C++, vLLM, Triton, Kubernetes, Docker, LangChain
Soft Skills
leadership, team agility, technical alignment, communication
Certifications
Bachelor’s degree, Master’s degree, PhD