Architect and optimize CI/CD pipelines for large-scale, high-availability simulation services, ensuring fast, reliable, and secure deployments.
Drive automation across infrastructure provisioning, configuration management, and monitoring to support rapid development cycles and minimize manual intervention.
Collaborate with software engineering and product teams to design and implement scalable, cloud-native solutions that meet evolving business needs.
Promote standard processes in infrastructure as code, containerization, and cloud security, ensuring compliance and resilience across environments.
Monitor, troubleshoot, and resolve infrastructure and deployment issues, maximizing uptime and ensuring efficient performance for internal and external customers.
Evaluate and integrate new tools and technologies to continually enhance the reliability, observability, and efficiency of our DevOps ecosystem.
Participate in incident response and post-mortem processes, driving root cause analysis and systemic improvements.
Requirements
BSc or above in Computer Science, Computer Engineering, or a related field, or equivalent experience.
5+ overall years of hands-on experience in DevOps or Site Reliability Engineering roles.
Proven expertise in designing, building, and maintaining CI/CD pipelines (e.g., Jenkins, GitLab CI, GitHub Actions, or similar).
Deep knowledge of cloud platforms (AWS, preferably), On-Prem deployment, container orchestration (Kubernetes, Docker), and infrastructure as code.
Strong scripting and automation skills (Python, Bash, or similar).
Experience with monitoring, logging, and observability tools (Prometheus, Grafana, ELK, etc.).
Proven understanding of security standard methodologies in cloud & on-prem DevOps environments.
Excellent communication and interpersonal skills, with a track record of multi-functional collaboration.
Experience supporting large-scale, high-availability production systems.
Benefits
NVIDIA is committed to fostering a diverse work environment.
Equal-opportunity employer.
Reasonable accommodation for individuals with disabilities during the job application or interview process.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
CI/CD pipelinescloud-native solutionsinfrastructure as codecontainerizationcloud securityscriptingautomationmonitoringloggingobservability
Soft skills
communicationinterpersonal skillscollaborationproblem-solvingroot cause analysis
Certifications
BSc in Computer ScienceBSc in Computer Engineering