Salary
💰 $70,500 - $200,200 per year
Tech Stack
Airflow, Apache, AWS, Azure, Cloud, Distributed Systems, Docker, Kafka, Kubernetes, Python, Spark, SQL, Terraform
About the role
- Design and implement comprehensive Lakehouse architecture solutions using technologies like Databricks, Snowflake, or equivalent platforms
- Build and maintain real-time and batch data processing systems using Apache Spark, Kafka, and similar technologies (see the streaming sketch after this list)
- Architect scalable data pipelines that handle structured, semi-structured, and unstructured data to deliver AI-ready data
- Develop data transformation workflows using tools like DBT, Airflow, or Databricks (see the DAG sketch after this list)
- Lead the technical strategy for data lake and data warehouse integration, ensuring optimal performance and cost efficiency
- Implement data governance frameworks, including data quality monitoring, lineage tracking, data time travel, and security protocols
- Implement a centralized data catalog system and enhance data discovery using technologies like Elasticsearch or OpenSearch
- Establish monitoring and alerting systems for data pipeline health using technologies like Apache Superset
- Drive adoption of modern data engineering best practices including Infrastructure as Code, CI/CD, and automated testing
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into robust technical solutions
- Mentor a team of 3-5 data engineers
- Foster a collaborative team culture focused on continuous learning and innovation
Requirements
- Master’s degree in Computer Science, Engineering, or a related technical field
- 3+ years of hands-on experience with Lakehouse architectures (Databricks, Snowflake, or similar)
- 7+ years of overall data engineering experience with large-scale distributed systems
- Experience with streaming data technologies (Kafka)
- Familiarity with data cataloging tools (Apache Atlas or DataHub)
- Familiarity with high-performance data service frameworks (e.g., Apache Arrow Flight)
- Industry certifications in cloud platforms or big data technologies
- Expert-level proficiency in Python and SQL for data transformation and pipeline development (a short transformation sketch follows this list)
- Strong experience with Apache Spark for big data processing and analytics
- Hands-on experience with cloud platforms (AWS or Azure) and their data services
- Proficiency with Infrastructure as Code tools (Terraform, CloudFormation)
- Experience with containerization (Docker, Kubernetes) and orchestration platforms
- Knowledge of data modeling techniques for both analytical and operational workloads
- Understanding of data governance, security, and compliance requirements
- Knowledge in the pharmaceutical or life sciences domain
Benefits
- Health insurance
- 401(k)
- Pension
- Vacation benefits
- Eligibility for medical, dental, vision and prescription drug benefits
- Flexible benefits (e.g., healthcare and/or dependent day care flexible spending accounts)
- Life insurance and death benefits
- Certain time off and leave of absence benefits
- Well-being benefits (e.g., employee assistance program, fitness benefits, and employee clubs and activities)
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Lakehouse architecture, Databricks, Snowflake, Apache Spark, Kafka, DBT, Airflow, Python, SQL, Terraform
Soft skills
leadership, collaboration, mentoring, communication, team culture, continuous learning, innovation
Certifications
cloud platform certifications, big data technology certifications