Reddit, Inc.

Senior Data Engineer, ML Platform

full-time

Location: 🇺🇸 United States

Salary

💰 $190,800 - $267,100 per year

Job Level

Senior

Tech Stack

Airflow, Kafka, Spark, SQL

About the role

  • Lead development of data pipelines and workflows for large-scale ML models at Reddit.
  • Design and implement scalable and secure data processing pipelines and storage environments that prepare our source-of-truth datasets for our models.
  • Ensure data is cleansed, mapped, transformed, and otherwise optimized for storage and use according to business and technical requirements.
  • Build effective data pipelines and workflows to streamline data ingestion, processing, and distribution tasks.
  • Set up and operate data workflow management tools for SQL code versioning, dependency tracing, etc.
  • Load transformed data into storage and reporting structures in destinations including data warehouses, reporting systems, and analytics applications.
  • Monitor and troubleshoot issues with the data environment to maintain high availability and performance.
  • Support monitoring and observability across training datasets and model metrics, and implement diagnostic tools for metric movements.
  • Maintain clear documentation of data procedures, systems, and architectures to enable easy collaboration.

Requirements

  • 5+ years of experience in Data Engineering or ML Infrastructure
  • Experience with large-scale data transforms to prepare graph data
  • Experience with graph databases, Spark, and Kafka pipelines
  • Experience working with Airflow and MLflow
  • Experience with storage frameworks and formats like BigQuery (BQ), Parquet, and Iceberg
  • Awareness of ML models and architectures is a huge plus
  • Strong focus on scalability, reliability, performance, and ease of use
  • Strong organizational and communication skills