Cell and Gene Therapy Catapult

Data Scientist

full-time

Location Type: Hybrid

Location: London • 🇬🇧 United Kingdom

Job Level

Mid-Level, Senior

Tech Stack

Python, PyTorch, TensorFlow

About the role

  • As a member of our Scale Enabling Technologies (SET) Team, at the forefront of intelligent manufacturing for gene and cell therapies, the Data Scientist specialising in Machine Learning, Modelling, and AI will collaborate closely with multidisciplinary data and analytical scientists and bioprocess engineers to design, build, and deploy predictive models and machine learning solutions.
  • Contributions will directly accelerate the digital transformation of cell and gene therapy manufacturing, advancing our mission to deliver life-changing treatments to patients worldwide.
  • The Data Scientist specialising in Machine Learning, Modelling, and AI will own the development of robust data pipelines, execute advanced feature engineering, drive rigorous model validation and deployment, and assist with implementing decisions based on model output.
  • The Data Scientist specialising in Machine Learning, Modelling, and AI will be empowered to extract actionable insights from complex datasets and build scalable, automated decision-making systems, ensuring all analytical solutions adhere to best practices in statistical modelling and machine learning.

Requirements

  • Relevant experience, including either a PhD or MSc in Computer Science, Engineering, Physics, Bioinformatics, or a related STEM field
  • Strong programming experience in Python and R for data science and machine learning applications
  • Strong experience in developing algorithms for supervised and unsupervised learning using machine learning techniques and tools
  • Hands-on experience with large language models (LLMs), AI agents, and code assistants to accelerate data analysis, automate workflows, and produce insights
  • Hands-on expertise in building and validating data models and simulations, including predictive analytics for complex datasets
  • Experience applying data ontology and taxonomy principles to model, organize, and standardize biological and clinical datasets, ensuring semantic consistency and interoperability with industry standards (e.g., FAIR, ISA 95/88, OBO Foundry ontologies)
  • Ambitious and highly motivated self-starter who is passionate about pushing the boundaries of Industry 4.0 and making a tangible impact in the biotech sector
  • Experience using code version control (e.g., git)

Desirable

  • Exposure to bioprocess data such as iPSC, AAV, CAR-T
  • Experience in closed-loop control strategies and their implementation
  • Working experience in time-series and omics dataset analysis, and in knowledge graph architecture
  • Experience applying model validation strategies to ensure accuracy, reliability, and generalizability of predictive models
  • Working experience in ML-specific frameworks (e.g., TensorFlow, PyTorch, SciPy)
  • Experience working with MATLAB

Benefits

  • Comprehensive training and ongoing development opportunities will be provided to help you excel and grow with us.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
machine learning, data pipelines, feature engineering, model validation, predictive analytics, algorithm development, supervised learning, unsupervised learning, data modeling, data analysis

Soft skills
self-starter, ambitious, motivated, collaboration, problem-solving, communication, adaptability, critical thinking, innovation, passion for biotech

Certifications
PhD, MSc