BJAK

Data Engineer

full-time

Location Type: Remote

Location: Remote • 🇮🇪 Ireland

Job Level

Mid-Level, Senior

Tech Stack

Python, SQL

About the role

  • Collect, clean, and preprocess user-generated text and image data for fine-tuning large models
  • Design and manage scalable data labeling pipelines, leveraging both crowdsourcing and in-house labeling teams
  • Build and maintain automated datasets for content moderation (e.g., safe vs unsafe content)
  • Collaborate with researchers and engineers to ensure datasets are high-quality, diverse, and aligned with model training needs

Requirements

  • Proven experience preparing datasets for machine learning or fine-tuning large models
  • Strong skills in data cleaning, preprocessing, and transformation for both text and image data
  • Hands-on experience with data labeling workflows and quality assurance for labeled data
  • Familiarity with building and maintaining moderation datasets (safety, compliance, and filtering)
  • Proficiency in scripting (Python, SQL) and working with large-scale data pipelines

Benefits

  • Flat structure & real ownership
  • Full involvement in setting direction and in consensus-based decision-making
  • Flexibility in work arrangement
  • High-impact role with visibility across product, data, and engineering
  • Top-of-market compensation and performance-based bonuses
  • Global exposure to product development
  • Lots of perks - housing rental subsidies, a quality company cafeteria, and overtime meals
  • Health, dental & vision insurance
  • Global travel insurance (for you & your dependents)
  • Unlimited, flexible time off

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
data cleaning, data preprocessing, data transformation, data labeling workflows, quality assurance, Python, SQL, large-scale data pipelines, content moderation, fine-tuning large models