
Staff AI Scientist
HackerRank
Full-time
Location Type: Hybrid
Location: Bangalore • India
About the role
- Design, prepare, and curate high-quality evaluation datasets with defensible methodology.
- Define criteria for dataset construction, ensuring statistical rigor, reproducibility, and fairness.
- Develop new metrics and evaluation frameworks to measure model performance in nuanced ways.
- Evaluate LLMs and other pre-trained models using carefully chosen datasets and metrics.
- Build scalable pipelines for training, fine-tuning, and benchmarking models.
- Contribute to projects involving fine-tuning, retrieval-augmented generation (RAG), and other adaptation methods.
- Partner with product and engineering to align scientific rigor with business outcomes.
- Define evaluation standards and ML lifecycle practices that raise the bar across the company.
- Mentor scientists and engineers, guiding best practices in experimentation, statistics, and ML development.
Requirements
- Master’s degree (PhD preferred) in Computer Science, Statistics, Machine Learning, or a related quantitative field.
- Strong background in the mathematical and statistical foundations of machine learning (probability, linear algebra, optimization, experimental design).
- Demonstrated experience across the end-to-end ML lifecycle: dataset preparation, model training, evaluation, deployment, and monitoring.
- Proven expertise in evaluation dataset design and metric creation, not just using existing benchmarks but knowing when and how to improve them.
- Experience with LLM evaluation, fine-tuning, and RAG, with the engineering skills to build production-ready pipelines.
- Track record of strategic impact at a staff or principal level, setting evaluation and research standards across teams.
Benefits
- Equal opportunity employer
- Affirmative action employer
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
machine learning, statistical rigor, dataset preparation, model training, model evaluation, model deployment, fine-tuning, retrieval-augmented generation, evaluation metrics, experimental design
Soft skills
mentoring, collaboration, strategic impact, guiding best practices, communication
Certifications
Master’s degree, PhD