
Staff Data Scientist
Switzerland Global Enterprise
full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Salary
💰 $103,900 - $173,100 per year
Job Level
Lead
Tech Stack
NumPy, pandas, Python, scikit-learn
About the role
- Lead teams developing statistical, machine learning, and AI solutions for Gas Power stakeholders
- Perform comprehensive exploratory data analysis (EDA) on diverse and complex business datasets, using statistical analysis, Natural Language Processing (NLP), and unsupervised clustering techniques to uncover patterns, identify quality issues, and extract meaningful insights
- Collaborate closely with business Subject Matter Experts (SMEs) to translate their deep domain knowledge into structured, AI-ready datasets for use in prompt engineering, Retrieval-Augmented Generation (RAG), and model fine-tuning
- Develop and prepare "golden datasets" that serve as pristine examples of our business processes, significantly reducing the iteration time for prompt engineering and AI development teams
- Design, create, and maintain a suite of data benchmarks that represent our core business use cases
- Establish and enforce rigorous data quality standards and validation protocols, ensuring the accuracy, relevance, and integrity of all data used in our GenAI applications
- Proactively identify and document potential data biases, working with stakeholders to develop mitigation strategies that promote responsible and fair AI outcomes
- Serve as the primary steward for the Business Unit's (BU's) curated AI datasets, defining and implementing a clear data management strategy that includes versioning, access controls, and a lifecycle management plan
- Create and maintain comprehensive documentation for all curated datasets (e.g., "datasheets for datasets"), detailing their origin, schema, limitations, and intended use to ensure transparency and reusability
- Continuously survey the BU's data landscape to identify new high-value data sources and champion their integration into our Generative AI ecosystem
- Act as the primary data liaison between the Business Unit, prompt engineers, and the central AI Foundry
- Rigorously test and validate the effectiveness of generalized tools and methods provided by the AI Foundry against your BU-specific data benchmarks
- Provide precise, data-driven feedback and recommendations to the Foundry, collaborating to refine and enhance central AI capabilities to ensure they meet our specific business needs
- Communicate methods, findings, and hypotheses to stakeholders
Requirements
- Bachelor’s or Master’s degree in a quantitative field such as Data Science, Computer Science, Statistics, Economics, or a related discipline
- 3-5+ years of professional experience as a Data Scientist, Data Analyst, or in a similar role with a heavy emphasis on data exploration, manipulation, and preparation
- Strong proficiency in Python and core data science libraries (e.g., pandas, NumPy, scikit-learn, spaCy, NLTK)
- Demonstrated experience with a wide range of exploratory data analysis and unsupervised machine learning techniques (e.g., clustering, topic modeling, dimensionality reduction)
- Proven ability to work with messy, unstructured, and semi-structured data, especially text
- Exceptional communication and interpersonal skills, with a talent for translating complex technical concepts to non-technical audiences and building strong relationships with business stakeholders
- Hands-on experience preparing data specifically for Generative AI systems (e.g., creating datasets for RAG, few-shot prompting, or supervised fine-tuning)
- Familiarity with the architecture of modern LLMs and the role of vector databases (e.g., Pinecone, Milvus, Weaviate)
- Experience in establishing data quality frameworks, data governance policies, or data management best practices
- Prior experience working in a federated analytics or data science model where collaboration between central and business-embedded teams was required
- Domain expertise relevant to our Business Unit
Benefits
- medical, dental, vision, and prescription drug coverage
- access to Health Coach from GE Vernova, a 24/7 nurse-based resource
- access to the Employee Assistance Program, providing 24/7 confidential assessment, counseling and referral services
- GE Vernova Retirement Savings Plan, a tax-advantaged 401(k) savings opportunity with company matching contributions and company retirement contributions
- access to Fidelity resources and financial planning consultants
- tuition assistance
- adoption assistance
- paid parental leave
- disability benefits
- life insurance
- 12 paid holidays
- permissive time off
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
statistical analysis, machine learning, Natural Language Processing, unsupervised clustering, exploratory data analysis, Python, pandas, NumPy, scikit-learn, spaCy
Soft skills
communication, interpersonal skills, collaboration, relationship building, data-driven feedback, translating technical concepts, problem-solving, documentation, stakeholder engagement, data stewardship