
ML Ops Engineer
Nift
full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Job Level
Mid-Level, Senior
Tech Stack
AWS, Cloud, Docker, Kafka, Kubernetes, PySpark, Python, Terraform
About the role
- ML platform: Productionize training and inference (batch/real-time), establish CI/CD for models, data/versioning practices, and model governance
- Feature & model lifecycle: Centralize feature generation (e.g., feature store patterns), manage model registry/metadata, and streamline deployment workflows
- Observability & quality: Implement monitoring for data quality, drift, model performance/latency, and pipeline health with clear alerting and dashboards
- Engineering excellence: Refactor research code into reusable components, enforce repo structure, testing, logging, and reproducibility
- Cross-functional collaboration: Work with data scientists, analysts, and engineers to turn prototypes into production systems, and provide mentorship and technical guidance
- Roadmap & standards: Drive the technical vision for ML platform capabilities and establish architectural patterns that become team standards
Requirements
- Experience: 5+ years in ML Ops, including ownership of ML infrastructure for large-scale systems
- Software engineering strength: Strong coding, debugging, performance analysis, testing, and CI/CD discipline, including reproducible builds. Extensive commercial experience using Python to develop automated pipelines that bring ML models to production
- Cloud & containers: Production experience with AWS, Databricks, and Docker + Kubernetes (EKS/ECS or equivalent)
- IaC: Terraform or CloudFormation for managed, reviewable environments
- ML tooling: MLflow/SageMaker (or similar) with a track record of production ML pipelines
- Monitoring/observability: ML monitoring (quality, drift, performance) and pipeline alerting
- Collaboration: Excellent communication skills; comfortable working with data scientists, analysts, and engineers in a fast-paced startup
- PySpark/Glue/Dask/Kafka: Experience with large-scale batch/stream processing
- Analytics platforms: Experience integrating third-party data
- Model serving patterns: Familiarity with real-time endpoints, batch scoring, and feature stores
- Governance & security: Exposure to model governance/compliance and secure ML operations
- Be mission-oriented: Proactive and self-driven with a strong sense of initiative; takes ownership, goes beyond expectations, and does what's needed to get the job done
Benefits
- Competitive compensation, flexible remote work
- Unlimited Responsible PTO
- Great opportunity to join a growing, cash-flow-positive company while having a direct impact on Nift's revenue, growth, scale, and future success