BMW Group

Senior AI Infrastructure Engineer

full-time

Location Type: Office

Location: Munich, Germany

About the role

  • Design, build and operate GPU-centric AI infrastructure (primarily NVIDIA) in on-premise and cloud environments, with a strong focus on performance, scalability and efficiency.
  • Take responsibility for the architecture and operation of high-performance compute environments for distributed training and optimized model execution.
  • Optimize compute, memory, and high-performance networking (e.g., InfiniBand, NCCL) to enable large-scale AI workloads in industrial contexts.
  • Develop and operate infrastructure components such as scheduling and resource management systems (e.g., SLURM, Ray, Run:ai) to ensure efficient utilization of shared GPU resources.
  • Create and maintain automated, reproducible infrastructure using modern tools (e.g., Docker, Kubernetes, Terraform, Ansible, CI/CD).
  • Contribute to BMW-specific AI use cases by providing reliable and scalable infrastructure.
  • Take technical ownership of the AI infrastructure stack, define best practices, and mentor less experienced engineers.

Requirements

  • University degree in Computer Science, Computer/Electrical Engineering, or a related field
  • 8–10 years of professional industry experience building and operating AI and HPC infrastructure
  • Solid hands-on experience with GPU systems (particularly NVIDIA), including drivers, CUDA, and performance optimization
  • Experience with distributed systems and high-performance networking (e.g., InfiniBand, NCCL), as well as cloud environments (AWS, Azure) in addition to on-premise infrastructure
  • Practical experience with resource scheduling and workload orchestration (e.g., SLURM, Ray, NVIDIA Run:ai)
  • Extensive experience in infrastructure automation (e.g., Docker, Kubernetes, Terraform, Ansible, CI/CD) and proficiency in Python for infrastructure and system tooling
  • Experience with training, fine-tuning, or deploying ML models in production, as well as exposure to industrial AI use cases (e.g., simulation, robotics, engineering), is a plus

Benefits

  • Challenging projects
  • Wide range of personal and professional development opportunities
  • Attractive, fair, and performance-based compensation
  • High job security
  • Annual special payments such as holiday pay, Christmas bonus, and profit-sharing
  • Flexible working hours, six weeks of annual leave, and overtime compensation
  • Discounted BMW & MINI benefits

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
GPU systems, NVIDIA, CUDA, performance optimization, distributed systems, high-performance networking, resource scheduling, workload orchestration, infrastructure automation, Python
Soft Skills
technical ownership, mentoring, best practices