Inworld AI

Staff MLOps Engineer

full-time

Location Type: Hybrid

Location: Mountain View • California • 🇺🇸 United States

Salary

💰 $180,000 - $280,000 per year

Job Level

Lead

Tech Stack

Ansible, Azure, C++, Cloud, Go, Google Cloud Platform, Kubernetes, Open Source, Oracle, Python, Terraform

About the role

  • Build and scale MLOps systems to streamline the end-to-end ML model lifecycle on the Inworld AI platform, from training to deployment.
  • Design and implement robust model training, evaluation, and release pipelines.
  • Collaborate cross-functionally with ML and backend teams to design, deploy, and maintain scalable, secure infrastructure for Inworld’s AI Engine and Studio.
  • Facilitate a "you build it, you run it" culture by providing the necessary tools and processes for monitoring the reliability, availability, and performance of services.
  • Manage CI/CD pipelines to ensure smooth and efficient code integration and deployment.
  • Identify and implement opportunities to enhance engineering speed and efficiency.
  • Provide technical leadership in ML engineering best practices, raise the technical bar, and mentor junior engineers in MLOps principles.

Requirements

  • 7+ years of software engineering experience, with 5+ years of infrastructure-as-code experience.
  • Proficiency in managing Kubernetes clusters and applications, including creating Helm charts/Kustomize manifests for new applications.
  • Experience in creating and maintaining CI/CD pipelines for both applications and infrastructure deployments (using tools like Terraform/Terragrunt, ArgoCD, GitHub Actions, Ansible, etc.).
  • Deep knowledge of at least one major cloud provider (Google Cloud Platform, Microsoft Azure, Oracle Cloud).
  • Proficiency in at least one backend programming or scripting language, such as Golang, Python, or Bash.
  • Knowledge of SLURM or similar job schedulers for distributed training.
  • Experience with data pipeline and workflow management tools.
  • Familiarity with open-source LLMs and open-source serving solutions (e.g., vLLM, llama.cpp, KServe) is a plus.
  • Experience with bare metal GPUs (optional).
  • Desire to work at a fast-growing Series A startup, including being comfortable with uncertainty, owning and scaling new products, and embracing an experimental and iterative development process.

Benefits

  • equity
  • benefits