Principal Software Engineer – DGX Cloud
NVIDIA
Principal Software Engineer developing scalable automation solutions for NVIDIA's DGX Cloud team. Leading technical efforts and mentoring engineers in building foundational systems for AI and cloud computing.
Posted 4/29/2026
full-time
Santa Clara • California, Washington • 🇺🇸 United States
Lead
💰 $272,000 - $431,250 per year
Tech Stack
Tools & technologies: AWS, Azure, Cloud, Distributed Systems, Docker, Go, Google Cloud Platform, Grafana, Java, Kubernetes, Prometheus, Python
About the role
Key responsibilities & impact
- Lead the build and development of next-generation APIs, state management, and workflow orchestration systems that automate fleet lifecycle operations at massive scale
- Drive technical alignment across dependent systems and partner teams to ensure cohesive integration, clear interfaces, and reliable end-to-end workflows, with a strong focus on delivery
- Act as a force-multiplier by coaching, mentoring, and encouraging senior engineers, elevating the technical standards and guidelines across the organization
- Maintain an incredible focus on the customer experience and product requirements, translating deep technical insight into high-impact business solutions
- Partner with executive and engineering leadership to codify critical business processes into self-measuring, scalable, and operationally consistent platforms, drastically reducing manual toil
- Direct the integration strategy for key technologies, including common AI schedulers (e.g., Kubernetes, Slurm) and innovative observability systems (e.g., Prometheus, OpenTelemetry, Grafana)
Requirements
What you’ll need
- 16+ years of progressive industry experience
- Master's or Bachelor's degree, or equivalent experience defining and shipping complex distributed systems
- Deep, hands-on expertise in establishing, operating, and scaling services in a fast-paced, high-reliability environment
- Outstanding proficiency in modern systems programming languages such as Go, Java, or Python
- Proven track record of defining, owning, and evolving the architecture of high-scale distributed systems, including advanced patterns for APIs, control planes, and data pipelines
- Deep understanding of global cloud infrastructure (AWS, GCP, Azure) and container ecosystems (Docker, Kubernetes)
- Demonstrated ability to drive technical strategy and influence outcomes across organizational boundaries
- Outstanding ability to communicate complex technical concepts, drive organizational consensus, and mentor high-performing engineers
Benefits
Comp & perks
- equity
- comprehensive benefits package
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
API development, state management, workflow orchestration, systems programming, Go, Java, Python, distributed systems architecture, cloud infrastructure, container ecosystems
Soft Skills
coaching, mentoring, technical alignment, customer experience focus, communication, organizational consensus, influence, leadership, problem-solving, collaboration
Certifications
Master's degree, Bachelor's degree