Menlo

MLOps Engineer

full-time

Location Type: Hybrid

Location: Ho Chi Minh City, Vietnam

About the role

  • Own and evolve the infrastructure behind PyTorch-based training and inference workloads
  • Build and maintain training and inference pipelines using PyTorch
  • Own and evolve inference serving infrastructure
  • Write and maintain robust tooling in Python and C++
  • Optimize compute workloads for bare-metal environments
  • Troubleshoot low-level networking issues
  • Set up and manage ML environments
  • Establish CI/CD patterns for AI workloads
  • Integrate monitoring, alerting, and incident response

Requirements

  • Deep expertise in PyTorch internals
  • Strong programming skills in Python and C++
  • Solid computer science fundamentals
  • Hands-on experience with vLLM and SGLang
  • Experience with RLHF and PPO training pipelines
  • Strong understanding of distributed training setups
  • Experience debugging and tuning bare-metal Linux servers
  • Familiarity with job schedulers such as Airflow
  • Strong grasp of containerized and cloud-native environments

Benefits

  • Flexibility in work arrangements
  • Professional development opportunities

Applicant Tracking System Keywords

Hard Skills & Tools
PyTorch, Python, C++, vLLM, SGLang, RLHF, PPO, distributed training, Linux, CI/CD