
ML Platform Engineer – MLOps
Docuvera
Full-time
Location Type: Hybrid
Location: Wellington, New Zealand
Job Level
Mid-Level / Senior
Tech Stack
AWS, Docker, DynamoDB, Kubernetes, Neo4j, Postgres, Python, SQL, Terraform
About the role
- You’ll design, build, and run the AI infrastructure that powers Docuvera’s enterprise AI and its AI-driven content management platform.
- Day to day, you’ll turn machine-learning models into reliable products, build scalable data pipelines, and provide the foundation for AI-powered workflows.
- You’ll lead MLOps practices, stand up vector databases and knowledge graphs, and work closely with data scientists and engineers to deploy and monitor models in production.
- Your work enables key programs like company-wide knowledge assistants, AI-assisted bug triage, and intelligent content curation.
- You’ll automate ML workflows, keep AI services highly available, and ensure everything meets life-sciences regulations (FDA 21 CFR Part 11, GxP).
Requirements
Some or all of the following technical skills, experience, and knowledge:
- AWS AI/ML & data: Bedrock, SageMaker, Lambda, S3, Aurora PostgreSQL (Aurora/RDS), DynamoDB, EventBridge, SQS, EKS, ECS, Neptune, OpenSearch; hands-on architecture and cost tuning.
- Relational data & PostgreSQL: Strong SQL, schema design, indexing/partitioning, query tuning, connection management (RDS Proxy), HA/DR (Multi-AZ, read replicas, PITR), CDC/outbox patterns.
- MLOps platforms: MLflow, Kubeflow, and SageMaker Pipelines for lifecycle management, experiment tracking, and automated deployments (a minimal MLflow example follows this list).
- Event-driven systems: EventBridge (rules, schedules, schema registry) and SQS (FIFO/Standard, DLQs, ordering/deduplication) for loosely coupled services at scale (see the SQS worker sketch below).
- Vector search & RAG: Implementing and tuning Pinecone/Milvus/Weaviate/OpenSearch and embedding workflows in production RAG systems (see the retrieval sketch below).
- Data pipelines: Real-time ingestion with Glue, Kinesis, Lambda; integrating enterprise APIs/webhooks; EventBridge buses and SQS workers for reliable, idempotent processing.
- Containers & Kubernetes: Docker, EKS, and serverless model serving; autoscaling for AI workloads.
- Graph databases: Neptune or Neo4j, queried with Gremlin/Cypher/SPARQL (see the Cypher sketch below).
- Programming & automation: Python/Bash and IaC (CDK, Terraform, CloudFormation).
- Model operations: Deploying/monitoring LLMs, embeddings, and custom ML models with performance optimization.
- Enterprise integration: Model Context Protocol (MCP), API gateways, and connectors for systems like Confluence, Jira, and SharePoint.
- Observability & resilience: CloudWatch/New Relic dashboards, SLOs/SLIs, synthetic checks; queue latency/depth alerts; EventBridge failure handling; DB health and slow-query monitoring.
- AI governance: Model risk management, validation frameworks, and compliance logging for regulated AI apps.
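
To make the MLOps-platforms bullet concrete, here is a minimal sketch of MLflow experiment tracking. The tracking URI, experiment name, parameters, and artifact file are hypothetical placeholders, not details from the posting:

```python
import mlflow

# Hypothetical tracking server; in practice this would point at the
# platform's own MLflow deployment.
mlflow.set_tracking_uri("http://mlflow.internal:5000")
mlflow.set_experiment("content-classifier")

with mlflow.start_run(run_name="baseline"):
    # Log hyperparameters and evaluation metrics for this run.
    mlflow.log_param("model_type", "logistic_regression")
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("f1", 0.87)
    # Artifacts (plots, model files) attach to the same run for auditability.
    mlflow.log_artifact("confusion_matrix.png")
```

Runs logged this way give the experiment history and reproducibility trail that lifecycle management and automated deployment pipelines build on.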
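The event-driven and data-pipeline bullets both call for reliable, idempotent SQS processing. A sketch of one common pattern, using a DynamoDB conditional write as a deduplication claim; the queue URL, table name, and `handle` function are hypothetical:

```python
import boto3
from botocore.exceptions import ClientError

sqs = boto3.client("sqs")
ddb = boto3.client("dynamodb")

# Both names are hypothetical placeholders.
QUEUE_URL = "https://sqs.ap-southeast-2.amazonaws.com/123456789012/ingest-events"
CLAIMS_TABLE = "processed-messages"

def handle(body: str) -> None:
    ...  # hypothetical business logic

def poll_once() -> None:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    for msg in resp.get("Messages", []):
        try:
            # Claim the MessageId; the conditional write fails if another
            # delivery of this message was already claimed, which makes the
            # handler safe under SQS's at-least-once redelivery.
            ddb.put_item(
                TableName=CLAIMS_TABLE,
                Item={"message_id": {"S": msg["MessageId"]}},
                ConditionExpression="attribute_not_exists(message_id)",
            )
            handle(msg["Body"])
        except ClientError as err:
            if err.response["Error"]["Code"] != "ConditionalCheckFailedException":
                raise
            # Duplicate delivery: skip the work but still delete below.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```

A production version would also expire stale claims (e.g., with a DynamoDB TTL) so work interrupted by a crash can be retried, and would rely on the queue's redrive policy to route repeated failures to a DLQ.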
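For the vector search & RAG bullet, a retrieval sketch combining Bedrock embeddings with an OpenSearch k-NN query. The endpoint, index name, vector field, and the choice of the Titan embedding model are all assumptions for illustration:

```python
import json
import boto3
from opensearchpy import OpenSearch

bedrock = boto3.client("bedrock-runtime")
# Hypothetical OpenSearch endpoint.
search = OpenSearch(hosts=[{"host": "search.internal", "port": 9200}])

def embed(text: str) -> list[float]:
    # Titan text embeddings via Bedrock; the model choice is illustrative.
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(resp["body"].read())["embedding"]

def retrieve(query: str, k: int = 5):
    # k-NN search over a vector field named "embedding" (hypothetical schema).
    return search.search(
        index="docs",
        body={"size": k, "query": {"knn": {"embedding": {"vector": embed(query), "k": k}}}},
    )
```

The hits returned here would be passed to an LLM as grounding context, which is the retrieval half of a production RAG system.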
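And for the graph-databases bullet, a small Cypher query against Neo4j via the official Python driver. The connection details and the `Document`/`Topic` schema are hypothetical:

```python
from neo4j import GraphDatabase

# Hypothetical connection details.
driver = GraphDatabase.driver("bolt://neo4j.internal:7687", auth=("neo4j", "secret"))

def related_documents(topic: str) -> list[str]:
    # Cypher traversal: documents linked to a topic node in a knowledge graph.
    query = (
        "MATCH (d:Document)-[:MENTIONS]->(t:Topic {name: $topic}) "
        "RETURN d.title AS title LIMIT 10"
    )
    with driver.session() as session:
        return [record["title"] for record in session.run(query, topic=topic)]
```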
Benefits
- A digital-first, fully flexible working style; we've embraced asynchronous, hybrid working as the norm.
- Modern tools and systems, with a big focus on our use of AI.
- A focus on personal growth, with career, learning, and development tools available, plus dedicated ‘tools down’ personal development time in NZ.
- An additional week of paid leave.
- Staff appreciation leave at Christmas, plus a day off for your birthday.
- All within a tight-knit, supportive, and inclusive global community.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
machine learning, data pipelines, MLOps, SQL, Python, automation, model operations, event-driven systems, graph databases, containers
Soft skills
leadership, collaboration, communication, problem-solving, organizational skills