Eli Lilly and Company

Senior Data Engineer – Lakehouse Architecture

Full-time

Location Type: Office

Location: Boston • Massachusetts • 🇺🇸 United States

Salary

💰 $70,500 - $200,200 per year

Job Level

Senior

Tech Stack

Airflow, Apache, AWS, Azure, Cloud, Distributed Systems, Docker, Kafka, Kubernetes, Python, Spark, SQL, Terraform

About the role

  • Design and implement comprehensive Lakehouse architecture solutions using technologies like Databricks, Snowflake, or equivalent platforms
  • Build and maintain real-time and batch data processing systems using Apache Spark, Kafka, and similar technologies
  • Architect scalable data pipelines that handle structured, semi-structured, and unstructured data to deliver AI-ready data
  • Develop data transformation workflows using tools like dbt, Airflow, or Databricks
  • Lead the technical strategy for data lake and data warehouse integration, ensuring optimal performance and cost efficiency
  • Implement data governance frameworks, including data quality monitoring, lineage tracking, data time travel, and security protocols
  • Implement a centralized data catalog system and enhance data discovery using technologies like Elasticsearch / OpenSearch
  • Establish monitoring and alerting systems for data pipeline health using technologies like Apache Superset
  • Drive adoption of modern data engineering best practices including Infrastructure as Code, CI/CD, and automated testing
  • Collaborate with data scientists, analysts, and business stakeholders to translate requirements into robust technical solutions
  • Mentor a team of 3–5 data engineers
  • Foster a collaborative team culture focused on continuous learning and innovation

Requirements

  • Master’s degree in Computer Science, Engineering, or a related technical field
  • 3+ years of hands-on experience with Lakehouse architectures (Databricks, Snowflake, or similar)
  • 7+ years of overall data engineering experience with large-scale distributed systems
  • Experience with streaming data technologies (Kafka)
  • Familiarity with data cataloging tools (Apache Atlas or DataHub)
  • Familiarity with high-performance data service frameworks (Apache Arrow Flight)
  • Industry certifications in cloud platforms or big data technologies
  • Expert-level proficiency in Python and SQL for data transformation and pipeline development
  • Strong experience with Apache Spark for big data processing and analytics
  • Hands-on experience with cloud platforms (AWS or Azure) and their data services
  • Proficiency with Infrastructure as Code tools (Terraform, CloudFormation)
  • Experience with containerization (Docker, Kubernetes) and orchestration platforms
  • Knowledge of data modeling techniques for both analytical and operational workloads
  • Understanding of data governance, security, and compliance requirements
  • Knowledge of the pharmaceutical or life sciences domain

Benefits

  • Health insurance
  • 401(k)
  • Pension
  • Vacation benefits
  • Eligibility for medical, dental, vision and prescription drug benefits
  • Flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts)
  • Life insurance and death benefits
  • Certain time off and leave of absence benefits
  • Well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities)

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Lakehouse architecture, Databricks, Snowflake, Apache Spark, Kafka, dbt, Airflow, Python, SQL, Terraform
Soft skills
leadership, collaboration, mentoring, communication, team culture, continuous learning, innovation
Certifications
cloud platform certifications, big data technology certifications