Salesforce

Software Engineer – Machine Learning Infrastructure

full-time

Location Type: Hybrid

Location: Seattle, Texas, Washington, United States

Salary

💰 $164,000 - $313,700 per year

About the role

  • Design, build, and operate systems to train, serve, and deploy machine learning models at scale, with a focus on reliability, performance, and operational simplicity
  • Evolve GPU-backed inference infrastructure to support high-throughput, latency-sensitive workloads, including large-scale model serving
  • Architect and optimize distributed training and data processing systems using platforms such as Ray, Airflow, Spark, or similar technologies
  • Build and maintain Kubernetes-based platforms and orchestration layers using tools such as KubeRay, vLLM, and internally developed services
  • Architect solutions that bridge legacy systems with modern technologies while maintaining monolithic application stability
  • Develop robust monitoring, observability, and alerting for production ML workloads to ensure operational excellence
  • Partner closely with AI Platform, ML modeling, security, and product engineering teams to design infrastructure that supports evolving AI use cases
  • Provide technical leadership through design reviews and mentorship, and by setting engineering standards and long-term architectural direction for ML infrastructure
  • Author technical design and architecture documentation, and contribute thought leadership through engineering blog posts

Requirements

  • Significant professional experience in software engineering with a strong focus on infrastructure, backend systems, platform engineering, or MLOps
  • Deep experience building and operating distributed systems, including expert-level knowledge of Kubernetes and container-based platforms
  • Hands-on experience with modern ML infrastructure and serving stacks such as Ray or KubeRay, vLLM, or similar training and inference orchestration frameworks
  • Experience working with GPU infrastructure, including performance optimization and operational management at scale
  • Strong experience with data infrastructure and orchestration technologies such as Airflow, Spark, or similar systems
  • Experience building and operating cloud-native systems on public cloud platforms such as AWS, GCP, or Azure, including infrastructure as code
  • Excellent written communication
  • A related technical degree is required

Benefits

  • time-off programs
  • medical, dental, vision
  • mental health support
  • paid parental leave
  • life and disability insurance
  • 401(k)
  • employee stock purchasing program