Eli Lilly and Company

Machine Learning Scientist – Federated Benchmarking, Validation Engineering

full-time

Location Type: Hybrid

Location: Indianapolis; California; Massachusetts; United States

Salary

💰 $151,500 - $244,200 per year

About the role

  • Architect and implement privacy-preserving protocols for constructing representative test sets across distributed partner datasets, ensuring statistical validity while maintaining data isolation.
  • Create comprehensive benchmark suites covering small molecules (ADMET, solubility, permeability), antibodies (affinity, stability, immunogenicity), and RNA therapeutics (stability, delivery, off-target effects).
  • Develop validation strategies that assess model generalization across different experimental protocols, cell lines, species, and therapeutic indications while respecting partner data boundaries.
  • Systematically benchmark federated models against public datasets (ChEMBL, PubChem, PDB, Therapeutic Antibody Database) to establish performance baselines and identify gaps.
  • Implement time-split or scaffold-split validation protocols that assess model performance on prospective data, simulating real-world deployment scenarios and detecting concept drift.
  • Build robust MLOps pipelines ensuring complete reproducibility of federated experiments, including versioning of data snapshots, model checkpoints, and hyperparameter configurations.
  • Design statistically powered validation studies accounting for multiple testing, hierarchical data structures, and non-independent observations common in drug discovery datasets.
  • Develop comprehensive performance profiling across diverse molecular scaffolds, target classes, and property ranges, identifying systematic biases and failure modes.
  • Collaborate with engineering teams to integrate validation frameworks with the TuneLab federated learning platform built on NVIDIA FLARE, ensuring scalable and automated testing across partner networks.

Requirements

  • PhD in Computational Biology, Bioinformatics, Cheminformatics, Computer Science, Statistics, or related field from an accredited college or university
  • 2+ years of experience in the biopharmaceutical industry or related fields, with demonstrated expertise in drug discovery and early development
  • Strong foundation in experimental design, statistical validation, and hypothesis testing
  • Experience with ML model validation, cross-validation strategies, and performance metrics
  • Proficiency in data engineering, pipeline development, and automation
  • Experience with federated learning platforms and distributed computing
  • Knowledge of regulatory requirements for AI/ML in pharmaceutical development
  • Expertise in ADMET assay development and validation
  • Understanding of antibody engineering and characterization methods
  • Familiarity with RNA therapeutic design and delivery systems
  • Experience with clinical biomarker validation and translational research
  • Proficiency in workflow orchestration tools (Airflow, Kubeflow, Prefect)
  • Strong knowledge of containerization and cloud computing (Docker, Kubernetes)
  • Publications on model validation, benchmarking, or reproducibility
  • Experience with GxP compliance and quality management systems
  • Exceptional attention to detail and commitment to scientific rigor
  • Strong technical writing skills for regulatory documentation
  • Portfolio mindset balancing rigorous validation with rapid deployment for partner value

Benefits

  • Health insurance
  • 401(k) matching
  • Pension
  • Vacation benefits
  • Medical benefits
  • Dental benefits
  • Vision benefits
  • Prescription drug benefits
  • Flexible benefits
  • Life insurance
  • Death benefits
  • Time off benefits
  • Leave of absence benefits
  • Well-being benefits
  • Employee assistance program
  • Fitness benefits
  • Employee clubs and activities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
privacy-preserving protocols, benchmark suites, validation strategies, model generalization, MLOps pipelines, performance profiling, experimental design, statistical validation, hypothesis testing, ADMET assay development
Soft Skills
attention to detail, scientific rigor, technical writing
Certifications
PhD in Computational Biology, PhD in Bioinformatics, PhD in Cheminformatics, PhD in Computer Science, PhD in Statistics