Collect, clean, and preprocess user-generated text and image data for fine-tuning large models
Design and manage scalable data labeling pipelines, leveraging both crowdsourcing and in-house labeling teams
Build and maintain automated datasets for content moderation (e.g., safe vs. unsafe content)
Collaborate with researchers and engineers to ensure datasets are high-quality, diverse, and aligned with model training needs
Fine-tune state-of-the-art models, design evaluation frameworks, and bring AI features into production
Ensure models are safe, trustworthy, and impactful at scale
Work closely with regional teams across product, engineering, operations, infrastructure, and data
Hybrid work combining flexible remote work and in-office collaboration at HQ
Requirements
Proven experience preparing datasets for machine learning or fine-tuning large models
Strong skills in data cleaning, preprocessing, and transformation for both text and image data
Hands-on experience with data labeling workflows and quality assurance for labeled data
Familiarity with building and maintaining moderation datasets (safety, compliance, and filtering)
Proficiency in scripting (Python, SQL) and working with large-scale data pipelines
Benefits
Flat structure & real ownership
Full involvement in product direction and consensus decision-making
Flexibility in work arrangement
High-impact role with visibility across product, data, and engineering
Top-of-market compensation and performance-based bonuses
Global exposure to product development
Lots of perks: housing rental subsidies, a quality company cafeteria, and overtime meals
Health, dental & vision insurance
Global travel insurance (for you & your dependents)
Unlimited, flexible time off