
Principal Data Scientist – R&D, Therapeutics Discovery
Johnson & Johnson
full-time
Location Type: Hybrid
Location: Spring House • California • Massachusetts • United States
Salary
💰 $117,000 - $201,250 per year
About the role
- Develop ML/AI models that support discovery workflows, including target prioritization, multi‑omics integration, and mechanistic inference.
- Apply modern ML approaches (e.g., deep learning, graph learning, foundation models, generative models) to chemical, biological, imaging, and assay datasets.
- Build and optimize models for real‑world R&D use cases, ensuring scalability, interpretability, and scientific rigor.
- Design, build, and maintain robust data pipelines that curate, standardize, and integrate diverse R&D datasets (chemical, biological, multi‑omics, imaging, biophysical, automation logs, etc.).
- Partner with platform teams to implement best‑practice MLOps/DevOps workflows and deploy ML models into production R&D environments.
- Develop tooling that accelerates dataset preparation, feature engineering, and model lifecycle management across TD.
- Work hand‑in‑hand with TD scientists to understand key biological and chemical questions and shape computational strategy accordingly.
- Translate sparse, heterogeneous experimental datasets into insights that guide decision‑making in hit discovery, mechanism studies, perturbation experiments, and compound optimization.
- Participate in design, interpretation, and iterative refinement of discovery experiments.
- Partner with cross-functional teams in R&D Data Science, IT, platform engineering, and therapeutic area groups to drive AI/ML adoption.
- Contribute to evaluating new analytical methods, automation technologies, and data platforms supporting next‑generation discovery science.
- Champion high standards for data quality, documentation, governance, and reproducibility.
Requirements
- Master’s or Ph.D. in Computational Biology, Bioinformatics, Data Science, Chemistry, Chemical Biology, Biomedical Engineering, Computer Science, or related field.
- Experience applying ML/AI in scientific domains (drug discovery, biology, chemistry, systems biology, imaging, or related areas).
- Strong programming skills in Python (preferred) and experience with scientific/ML libraries (PyTorch, TensorFlow, scikit‑learn, RDKit, etc.).
- Practical experience with data engineering, including data modeling, workflow orchestration, ETL/ELT pipelines, and cloud computing environments (AWS, GCP, or Azure).
- Ability to work directly with experimental scientists to solve real R&D challenges.
Benefits
- Vacation – 120 hours per calendar year
- Sick time – 40 hours per calendar year; for employees who reside in the State of Colorado – 48 hours per calendar year; for employees who reside in the State of Washington – 56 hours per calendar year
- Holiday pay, including Floating Holidays – 13 days per calendar year
- Work, Personal and Family Time – up to 40 hours per calendar year
- Parental Leave – 480 hours within one year of the birth/adoption/foster care of a child
- Bereavement Leave – 240 hours for an immediate family member; 40 hours for an extended family member per calendar year
- Caregiver Leave – 80 hours (10 days) in a 52-week rolling period
- Volunteer Leave – 32 hours per calendar year
- Military Spouse Time-Off – 80 hours per calendar year
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
ML models, AI models, deep learning, graph learning, foundation models, generative models, data pipelines, feature engineering, data modeling, workflow orchestration
Soft Skills
collaboration, problem-solving, communication, interdisciplinary teamwork, decision-making, adaptability, critical thinking, scientific rigor, data quality advocacy, documentation
Certifications
Master’s degree, Ph.D.