Tech Stack
Tools & technologies: Airflow, Open Source, Python, PyTorch, Ray, Spark
About the role
Key responsibilities & impact
- Work with model researchers to define what “good data” means for our models, including quality metrics, validation checks, and acceptance thresholds
- Explore open source datasets and create internal ones best suited to building fundamental World Models
- Build algorithms for automated data quality assessment, data domain mixtures, and domain adaptation from synthetic to real data
- Track datasets, metadata, provenance, and versions so experiments are reproducible and it’s clear what data went into which training and evaluation runs
- Own CI/CD and development tooling for the data stack (GitHub, Python, PyTorch), and automate repetitive workflows to reduce friction
- Track and optimize throughput, storage, and compute utilization across pipelines and related assets
Requirements
What you’ll need
- Strong ML and deep learning fundamentals with experience building and operating large-scale data and/or compute systems
- Comfortable moving between research questions and production engineering: you can dig into data, run analyses, and also ship reliable systems
- Demonstrated research experience with data compositions, quality, and dataset releases
- Ability to design and execute experiments with convincing, unbiased outcomes
- Practical experience with distributed processing and orchestration (Spark, Ray, Airflow, or equivalents)
- Solid Python skills, and familiarity with the tooling around modern model training workflows (datasets, checkpoints, experiment tracking)
- Strong instincts around data quality: how to measure it, how to monitor it, and how to prevent regressions as things scale
- Able to work in a fast-moving environment, prioritize what matters, and communicate clearly with both researchers and engineers
- Bonus: experience with large video datasets, dataset curation for training, or building internal tooling for evaluation/analysis in ML environments
Benefits
Comp & perks
- Flexible work arrangements
