
Senior Applied Research Scientist
Grupo Protege
full-time
Location Type: Remote
Location: Brazil
About the role
- Design and apply statistical and machine learning methods to curate, filter, and enrich large-scale unstructured datasets
- Develop frameworks to assess data diversity, duplication, and informativeness. Design statistical approaches to de-risk training datasets
- Collaborate with model training teams to identify data bottlenecks and optimize dataset performance. The role emphasizes the ability to work with both large foundation model developers and smaller startups
- Provide leadership on data quality strategy and shape internal best practices
- Evaluate external datasets for integration, focusing on scalability, quality, and relevance to model performance. Help build data scorecards
- Contribute to research and development of tools that automate data preprocessing and validation
Requirements
- PhD, or a Master's degree plus 4+ years of industry experience, in machine learning, economics, mathematics, engineering, computer science, statistics, or a related quantitative field
- Strong understanding of AI model training pipelines, including preprocessing and evaluation
- Experience working with large, unstructured datasets, especially text
- Background in statistical analysis, bias detection, and data validation
- Able to identify high-impact problems and drive solutions independently
Benefits
- Health insurance
- Flexible work arrangements
- Professional development opportunities
Applicant Tracking System Keywords
Hard Skills & Tools
statistical methods, machine learning, data preprocessing, data validation, statistical analysis, bias detection, data diversity assessment, data duplication assessment, AI model training pipelines, large-scale unstructured datasets
Soft Skills
leadership, collaboration, problem identification, independent solutions
Certifications
PhD, Master's Degree