Veeam Software

ML/AI Ops Engineer

full-time

Location Type: Remote

Location: Costa Rica

About the role

  • Own the end-to-end operationalization of ML and AI solutions—from development to scalable, reliable production systems that integrate seamlessly with other digital tools.
  • Design, automate, and maintain CI/CD pipelines for model training, testing, deployment, and retraining (Azure DevOps, Databricks).
  • Build, optimize, and version model lifecycle workflows, ensuring reproducibility, lineage, and governance across the ML/AI platform.
  • Monitor production models for performance, drift, reliability, and resource usage; implement automated retraining workflows.
  • Optimize compute, storage, and orchestration across the Databricks platform to ensure efficient, cost-effective operations.
  • Collaborate closely with ML/AI Scientists, Data Engineers, and the Data Warehouse (DWH) team to transform research-grade models into production-ready services.
  • Contribute to advancing our ML/AI platform, tooling, automation standards, and best practices.

Requirements

  • Solid experience in operationalizing ML/AI models, including deployment, automation, monitoring, and lifecycle management.
  • Strong programming skills in Python, PySpark, and SQL, and the ability to write clean, efficient, production-ready code.
  • Experience in feature engineering with a practical understanding of data engineering fundamentals: designing, validating, and optimizing feature pipelines, and ensuring feature consistency.
  • Experience building vector embeddings and RAG systems.
  • Familiarity with ML and LLM model development and the libraries used for it.
  • Experience with MLflow (or similar tools) for model tracking, registry management, and lifecycle operations.
  • Familiarity with CI/CD pipelines (Azure DevOps preferred).
  • Strong grasp of data versioning, model versioning, reproducibility, and data lineage within governed ML/AI environments.
  • Experience designing, consuming, or integrating REST APIs to expose ML/AI models as services and support real-time or near-real-time inference.
  • Experience monitoring production models, identifying drift or performance issues, and implementing corrective workflows.
  • A collaborative, systems-thinking mindset and the ability to work closely with ML/AI Scientists, Data Engineers, and the Data Warehouse team.

Benefits

  • Two weeks of paid vacation, 12 statutory holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
  • Paid parental leave: 8 days for fathers, 122 days for birthing parents, 92 days for adoptive parents
  • Medical, dental, and vision coverage fully funded through INS Premium for employees and dependents
  • Mental health support, therapy sessions, and virtual care via our Employee Assistance Program
  • Retirement and social security contributions through Costa Rica’s statutory programs
  • Life insurance equal to 24x monthly salary, plus disability and funeral coverage
  • Daily cafeteria subsidy
  • Fertility, adoption, and surrogacy support
  • Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning