Docuvera

ML Platform Engineer – MLOps

Full-time

Location Type: Hybrid

Location: Wellington • 🇳🇿 New Zealand


Job Level

Mid-Level, Senior

Tech Stack

AWS, Docker, DynamoDB, Kubernetes, Neo4j, Postgres, Python, SQL, Terraform

About the role

  • You’ll design, build, and run the AI infrastructure that powers Docuvera’s enterprise AI and our AI-driven content management platform.
  • Day to day, you’ll turn machine-learning models into reliable products, build scalable data pipelines, and provide the foundation for AI-powered workflows.
  • You’ll lead MLOps practices, stand up vector databases and knowledge graphs, and work closely with data scientists and engineers to deploy and monitor models in production.
  • Your work enables key programs like company-wide knowledge assistants, AI-assisted bug triage, and intelligent content curation.
  • You’ll automate ML workflows, keep AI services highly available, and ensure everything meets life-sciences regulations (FDA 21 CFR Part 11, GxP).
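The vector-database and knowledge-assistant work described above boils down to retrieval over embedded content. A minimal sketch of that retrieval step, using toy hand-written vectors and cosine similarity (a real system would generate embeddings with a model, e.g. via Amazon Bedrock, and query a vector database; the document IDs and vectors here are illustrative assumptions, not Docuvera's actual data):

```python
import math

# Hypothetical toy embeddings -- stand-ins for model-generated vectors
# stored in a vector database.
DOCS = {
    "doc-1": ([0.9, 0.1, 0.0], "Release checklist for model deployments"),
    "doc-2": ([0.1, 0.8, 0.1], "GxP compliance logging requirements"),
    "doc-3": ([0.0, 0.2, 0.9], "Bug triage workflow for the platform team"),
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, k=2):
    """Return the top-k (doc_id, text) pairs ranked by similarity to the query."""
    ranked = sorted(
        DOCS.items(),
        key=lambda item: cosine(query_vec, item[1][0]),
        reverse=True,
    )
    return [(doc_id, text) for doc_id, (vec, text) in ranked[:k]]
```

In a production RAG pipeline the retrieved passages would then be fed to an LLM as grounding context; this sketch covers only the ranking step.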

Requirements

Some or all of the following technical skills, experience, and knowledge:

  • AWS AI/ML & data: Bedrock, SageMaker, Lambda, S3, Aurora PostgreSQL (Aurora/RDS), DynamoDB, EventBridge, SQS, EKS, ECS, Neptune, OpenSearch; hands-on architecture and cost tuning.
  • Relational data & PostgreSQL: Strong SQL, schema design, indexing/partitioning, query tuning, connection management (RDS Proxy), HA/DR (Multi-AZ, read replicas, PITR), CDC/outbox patterns.
  • MLOps platforms: MLflow, Kubeflow, SageMaker Pipelines for lifecycle management, experiment tracking, and automated deployments.
  • Event-driven systems: EventBridge (rules, schedules, schema registry) and SQS (FIFO/Standard, DLQs, ordering/deduplication) for loosely coupled services at scale.
  • Vector search & RAG: Implementing and tuning Pinecone/Milvus/Weaviate/OpenSearch and embedding workflows in production RAG systems.
  • Data pipelines: Real-time ingestion with Glue, Kinesis, Lambda; integrating enterprise APIs/webhooks; EventBridge buses and SQS workers for reliable, idempotent processing.
  • Containers & Kubernetes: Docker, EKS, and serverless model serving; autoscaling for AI workloads.
  • Graph databases: Neptune or Neo4j with Gremlin/Cypher/SPARQL.
  • Programming & automation: Python/Bash and IaC (CDK, Terraform, CloudFormation).
  • Model operations: Deploying/monitoring LLMs, embeddings, and custom ML models with performance optimization.
  • Enterprise integration: Model Context Protocol (MCP), API gateways, and connectors for systems like Confluence, Jira, and SharePoint.
  • Observability & resilience: CloudWatch/New Relic dashboards, SLOs/SLIs, synthetic checks; queue latency/depth alerts; EventBridge failure handling; DB health and slow-query monitoring.
  • AI governance: Model risk management, validation frameworks, and compliance logging for regulated AI apps.
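The event-driven requirements above call out SQS deduplication and idempotent processing: at-least-once queues can redeliver a message, so consumers must detect and skip duplicates. A minimal in-memory sketch of that pattern (in production the dedup record would live in a durable store such as DynamoDB conditional writes; the class name and message shape are illustrative assumptions):

```python
class IdempotentWorker:
    """Minimal sketch of an idempotent queue consumer.

    At-least-once delivery (e.g. SQS Standard) means a message may be
    handled more than once; tracking processed message IDs makes the
    side effect happen exactly once.
    """

    def __init__(self):
        self.seen = set()    # stand-in for a durable deduplication table
        self.results = []    # stand-in for the downstream side effect

    def handle(self, message):
        """Process a message dict of the form {"id": ..., "body": ...}."""
        msg_id = message["id"]
        if msg_id in self.seen:
            return "skipped"    # duplicate delivery: drop it safely
        self.seen.add(msg_id)
        self.results.append(message["body"])
        return "processed"
```

Redelivering the same message is then harmless: the first `handle` call processes it and subsequent calls return "skipped" without repeating the side effect.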
Benefits

  • A digital-first, fully flexible working style; we've embraced asynchronous, hybrid working as the norm.
  • Modern tools and systems, with a big focus on our use of AI.
  • A focus on personal growth, with career, learning, and development tools available, plus dedicated ‘tools down’ personal development time in NZ.
  • An additional week of paid leave.
  • Staff appreciation leave at Christmas, plus a day off for your birthday.
  • All within a tight-knit, supportive, and inclusive global community.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
machine learning, data pipelines, MLOps, SQL, Python, automation, model operations, event-driven systems, graph databases, containers
Soft skills
leadership, collaboration, communication, problem-solving, organizational skills