Innodata Inc.

Senior Language Data Scientist

full-time

Location Type: Remote

Location: New Jersey, United States

About the role

  • Lead long-term, highly complex, and ambiguous projects from the first client discussion through completion
  • Design and improve workflows that create data for AI/ML training and evaluation, including human annotation and data-collection workflows as well as synthetic ones
  • Dive deep into existing workflows and processes to gather data and insights, make recommendations, and drive improvement through innovation and cross-functional collaboration with customers
  • Critically assess annotation tooling and workflows
  • Quantitatively analyze large datasets, perform statistical analysis, calculate metrics, and make recommendations to improve accuracy and performance
  • Work closely with client stakeholders to understand goals, gather requirements, propose solutions, and execute them
  • Set an ambitious research agenda for improving our products and services
  • Contribute to establishing best practices and standards for generative AI development with customers and within the organization

Requirements

  • MA in (computational) linguistics, data science, computer science (AI / ML / NLU), quantitative social sciences, or a related scientific / quantitative field; PhD strongly preferred
  • Ability to collaborate directly with technical stakeholders, including senior project managers, data engineers, and research scientists
  • Experience collaborating with cross-functional teams to define AI project requirements and objectives, ensuring alignment with overall business goals
  • Ability to design efficient data strategies for complex long-term projects, potentially involving multiple teams and workflows
  • Knowledge of how the components of GenAI products and services work together
  • Experience developing clear and concise documentation, including technical specifications, user guides, and presentations, to communicate complex AI concepts to both technical and nontechnical stakeholders
  • Enough familiarity with GenAI technologies to adapt existing processes to future challenges
  • Extensive experience working with human language data and designing human evaluation tasks, including multi-phase and complex workflows
  • Deep understanding of language and its relationship with culture
  • Ability to identify ambiguity and subjectivity in language
  • Ability to work on multilingual and multimodal projects
  • Advanced knowledge of statistics, metrics (e.g. F1 score, inter-rater reliability metrics), and data analysis methods such as sampling
  • Experience with Natural Language Processing (NLP) techniques and tools, such as spaCy, NLTK, or Hugging Face
  • Proficiency in Python to handle and transform large datasets (e.g. pre- and post-processing data with pandas), perform quantitative analyses, and visualize data (e.g. with matplotlib or seaborn)
  • Deep understanding of data pipelines to support ML and NLP workflows
  • Knowledge of efficient data collection, transformation, and storage
  • Knowledge of data structures, algorithms, and data engineering principles
  • Excellent interpersonal skills for effective cross-functional stakeholder engagement
  • Excellent problem-solving skills, with the ability to think critically and creatively to develop innovative AI solutions
  • Ability to work independently and collaborate as part of a team
  • Adaptable to changing technologies and methodologies
  • Ability to apply experience and research and development information to understand client products and services

Benefits

  • Opportunity to provide technical mentorship and guidance to junior team members
  • Professional development opportunities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
data analysis, statistical analysis, Natural Language Processing, Python, data pipelines, data collection, data transformation, data storage, metrics, human evaluation tasks
Soft Skills
interpersonal skills, problem-solving skills, collaboration, critical thinking, adaptability, communication, innovation, independence, creativity, stakeholder engagement