
MLOps Engineer
Menlo
full-time
Location Type: Hybrid
Location: Ho Chi Minh City • Vietnam
About the role
- Own and evolve the infrastructure behind PyTorch-based training and inference workloads
- Build and maintain training and inference pipelines using PyTorch
- Own and evolve inference serving infrastructure
- Write and maintain robust tooling in Python and C++
- Optimize compute workloads for bare-metal environments
- Troubleshoot low-level networking issues
- Set up and manage ML environments
- Establish CI/CD patterns for AI workloads
- Integrate monitoring, alerting, and incident response
Requirements
- Deep expertise in PyTorch internals
- Strong programming skills in Python and C++
- Solid computer science fundamentals
- Hands-on experience with vLLM and SGLang
- Experience with RLHF and PPO training pipelines
- Strong understanding of distributed training setups
- Experience debugging and tuning bare-metal Linux servers
- Familiarity with job schedulers such as Airflow
- Strong grasp of containerized and cloud-native environments
Benefits
- Flexibility in work arrangements
- Professional development opportunities