Evaluation: Rating or assessing the performance of AI models or algorithms, based on their output or behavior, through a set of evaluative questions.
Annotation Labeling: Labeling elements of a piece of content rather than the content as a whole.
Classification: Assigning predefined categories or labels to items.
Content Quality: Evaluating the perceived quality and/or appropriateness of content.
Content Understanding: Generating labels to advance understanding of a concept, trend, etc.
Data Augmentation: Creating additional training data for machine learning models by applying transformations to the original data, such as modifying images (rotation, flipping, cropping), generating new text (paraphrasing, summarization), or altering audio/video signals (speed modification, pitch shifting), to reduce overfitting and increase dataset diversity.
Grading: Reviewing data to determine whether a product feature works as intended, based on the project's guidelines.
Identification Labeling: Labeling model outputs to identify whether a piece of content is or isn't something (e.g., identifying clickbait, gaming videos, or branded content).
Preference Ranking: Ordering or ranking items based on a set of preferences or criteria.
Prompt Generation: Creating prompts or questions that will be used to generate responses from a language model or other AI system.
Relevance Evaluation: Evaluating the relevance of content on a relevance scale (1-3, 1-5, etc.).
Response Generation: Generating responses to prompts or questions using a language model or other AI system.
Response Rewrite: Rewriting existing text while preserving the original meaning, often to improve clarity, style, or adherence to guidelines.
Response Summarization: Producing concise summaries of longer pieces of text or data.
Similarity Evaluation: Comparing pieces of content in order to reach a determination about how similar they are.
Transcription: Converting spoken language or audio content into written text.
Translation: Converting text or spoken language from one language to another.
Data Collection: Gathering and compiling various forms of data to be used for training, evaluating, or fine-tuning AI models.
Requirements
A Bachelor’s degree or higher in a humanities specialization is required.
An advanced degree (Master’s or PhD) is strongly preferred.
Professional- or expert-level proficiency (C1/C2) in English and Polish is required.
Applicants must be legally authorized to work in the United States at the time of hire.
Innodata is unable to provide visa sponsorship now or in the future for this position.