
Member of Technical Staff – Data, World Models
Moonvalley
Full-time
Location Type: Remote
Location: Canada
About the role
- Design, automate, maintain, and optimize Python ETL pipelines (Spark/Ray) for large-scale multimodal data.
- Build and maintain data cataloging, lineage, quality tooling, integrity verification, access controls, and lifecycle management systems.
- Provide guidance, internal tools, and documentation to colleagues on data best practices.
- Serve as a custodian of the company’s datasets, ensuring overall data health, quality, and discoverability.
Requirements
- Knowledge of Python ETL pipelines and supporting infrastructure, data formats, and storage systems at scale.
- Experience managing datasets, annotations, and data versioning for model training.
- Solid grasp of ML fundamentals, essential for collaborating effectively with researchers.
- Skilled at writing high-quality specifications for AI agents.
Benefits
- Competitive salary and equity
- Private health coverage
- Pension contribution (UK, Canada, US)
- Unlimited paid vacation
- Fully distributed, async-first culture
- Hardware setup of your choice
- Stipends for phone, internet, and meals
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, ETL, Spark, Ray, data cataloging, data lineage, data quality, integrity verification, data versioning, machine learning
Soft Skills
guidance, documentation, data best practices, collaboration, specification writing