
Senior ML Engineer – ML/Inference
Mara
Full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Job Level
Senior
Tech Stack
Airflow, AWS, Azure, Cloud, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, PyTorch, Ray
About the role
- Own the end-to-end lifecycle of ML model deployment, from training artifacts to production inference services.
- Design, build, and maintain scalable inference pipelines using modern orchestration frameworks (e.g., Kubeflow, Airflow, Ray, MLflow).
- Implement and optimize model serving infrastructure for latency, throughput, and cost efficiency across GPU and CPU clusters.
- Develop and tune Retrieval-Augmented Generation (RAG) systems, including vector database configuration, embedding optimization, and retriever–generator orchestration.
- Collaborate with product and platform teams to integrate model APIs and agentic workflows into customer-facing systems.
- Evaluate, benchmark, and optimize large language and multimodal models using quantization, pruning, and distillation techniques.
- Design CI/CD workflows for ML systems, ensuring reproducibility, observability, and continuous delivery of model updates.
- Contribute to the development of internal tools for dataset management, feature stores, and evaluation pipelines.
- Monitor production model performance, detect drift, and drive improvements to reliability and explainability.
- Explore and integrate emerging agentic and orchestration frameworks (LangChain, LangGraph, CrewAI, etc.) to accelerate development of intelligent systems.
Requirements
- 5+ years of experience in applied ML or ML infrastructure engineering.
- Proven expertise in model serving and inference optimization (TensorRT, ONNX, vLLM, Triton, DeepSpeed, or similar).
- Strong proficiency in Python, with experience building APIs and pipelines using FastAPI, PyTorch, and Hugging Face tooling.
- Experience configuring and tuning RAG systems (vector databases such as Milvus, Weaviate, LanceDB, or pgvector).
- Solid foundation in MLOps practices: versioning (MLflow, DVC), orchestration (Airflow, Kubeflow), and monitoring (Prometheus, Grafana, Sentry).
- Familiarity with distributed compute systems (Kubernetes, Ray, Slurm) and cloud ML stacks (AWS SageMaker, GCP Vertex AI, Azure ML).
- Understanding of prompt engineering, agentic frameworks, and LLM evaluation.
- Strong collaboration and documentation skills, with the ability to bridge ML research, DevOps, and product development.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to improve ATS matches.
Hard skills
ML model deployment, inference pipelines, model serving optimization, Retrieval-Augmented Generation (RAG), quantization, pruning, distillation, CI/CD workflows, dataset management, feature stores
Soft skills
collaboration, documentation, communication