Newsela

AI Operations Engineer

Employment Type: Contract

Location Type: Remote

Location: Anywhere in Latin America

About the role

  • Design and maintain CI/CD pipelines for ML model training, packaging, and deployment across our microservices.
  • Manage containerized services on AWS ECS, optimizing for cost, latency, and availability.
  • Automate infrastructure provisioning and service configuration with Terraform.
  • Maintain and scale services that rely on third-party LLM providers.
  • Build and improve data pipelines that feed models from BigQuery, S3, and DynamoDB into training and inference workflows.
  • Instrument services with observability tooling (Datadog, OpenTelemetry, Langfuse) and establish SLOs for model-serving endpoints.
  • Collaborate with ML engineers to productionize new models using BentoML, FastAPI, and container-based serving.

Requirements

  • 2-3 years in MLOps supporting ML/AI features, systems, and workflows, with 3-4 years of prior experience in DevOps, CloudOps, or SRE.
  • Strong proficiency in Python.
  • Hands-on experience with Docker containerization and container orchestration.
  • Solid understanding of CI/CD for ML workflows in an enterprise production environment.
  • Experience with Infrastructure as Code, preferably Terraform.
  • Familiarity with cloud platforms, specifically AWS (ECS, ECR, S3, DynamoDB, CloudWatch) and GCP (BigQuery, Vertex AI).
  • Experience with LLM integration and observability (OpenAI API, Google GenAI, Langfuse tracing).
  • Experience building and maintaining data pipelines for ML training and feature engineering.
  • Familiarity with ML modeling workflows: training, evaluation, experiment tracking (e.g. MLflow, Weights & Biases), and model versioning.
  • Experience monitoring and flagging model drift over time.
  • Exposure to NLP/NLU models and frameworks such as Hugging Face Transformers, spaCy, or sentence-transformers.
  • Knowledge of vector databases (LanceDB, FAISS) and embedding-based retrieval systems.
  • Experience scaling and maintaining deep learning frameworks (TensorFlow, PyTorch) in production settings.
  • Familiarity with classical ML libraries (scikit-learn, XGBoost, LightGBM) and model explainability tools (SHAP).
  • Working knowledge of ML serving frameworks such as BentoML.
  • Comfort working with FastAPI or similar async Python web frameworks.

Benefits

  • Competitive salary
  • Professional development opportunities