
Senior Data Engineer, Databricks
Hypersonix Inc.
Employment Type: Full-time
Location Type: Remote
Location: United States
About the role
- Design and implement enterprise-scale data pipelines using Databricks on AWS, leveraging both cluster-based and serverless compute paradigms
- Architect and maintain medallion architecture (Bronze/Silver/Gold) data lakes and lakehouses
- Develop and optimize Delta Lake tables for ACID transactions and efficient data management (a Bronze-to-Silver sketch follows this list)
- Build and maintain real-time and batch data processing workflows
- Create reusable, modular data transformation logic using DBT to ensure data quality and consistency across the organization
- Develop complex Python applications for data ingestion, transformation, and orchestration
- Write optimized SQL queries and implement performance tuning strategies for large-scale datasets
- Implement comprehensive data quality checks, testing frameworks, and monitoring solutions
- Design and implement CI/CD pipelines for automated testing, deployment, and rollback of data artifacts
- Configure and optimize Databricks clusters, job scheduling, and workspace management
- Implement version control best practices using Git and collaborative development workflows
- Partner with data analysts, data scientists, and business stakeholders to understand requirements and deliver solutions
- Mentor junior engineers and promote best practices in data engineering
- Document technical designs, data lineage, and operational procedures
- Participate in code reviews and contribute to team knowledge sharing
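The Delta Lake, medallion-architecture, and data-quality bullets above lend themselves to a short illustration. Below is a minimal PySpark sketch of a Bronze-to-Silver step; the S3 path, table names, schema, and quality rules are illustrative assumptions, not Hypersonix specifics.

```python
# Minimal Bronze -> Silver sketch for a Databricks medallion pipeline.
# All paths, table names, and columns are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # on Databricks, a session already exists

# Bronze: land raw JSON as-is, preserving source fidelity.
raw = spark.read.json("s3://example-bucket/raw/orders/")  # hypothetical path
(
    raw.withColumn("_ingested_at", F.current_timestamp())
    .write.format("delta")
    .mode("append")
    .saveAsTable("bronze.orders")
)

# Silver: enforce basic data-quality rules before promoting records.
bronze = spark.read.table("bronze.orders")
silver = (
    bronze
    .dropDuplicates(["order_id"])           # assumed primary key
    .filter(F.col("order_id").isNotNull())  # simple quality gate
    .filter(F.col("amount") >= 0)
    .withColumn("order_date", F.to_date("order_ts"))
)
(
    silver.write.format("delta")
    .mode("overwrite")
    .option("overwriteSchema", "true")
    .saveAsTable("silver.orders")
)
```

A Gold layer would typically aggregate the Silver table into business-level marts; because Delta tables are ACID, the overwrite above stays safe for concurrent readers.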
Requirements
- 5+ years of experience in data engineering roles
- Expert-level proficiency in Databricks (Unity Catalog, Delta Live Tables, Workflows, SQL Warehouses)
- Strong understanding of cluster configuration, optimization, and serverless SQL compute
- Advanced SQL skills including query optimization, indexing strategies, and performance tuning (see the tuning sketch after this list)
- Production experience with DBT (models, tests, documentation, macros, packages)
- Proficient in Python for data engineering (PySpark, pandas, data validation libraries)
- Hands-on experience with Git workflows (branching strategies, pull requests, code reviews)
- Proven track record implementing CI/CD pipelines (Jenkins, GitLab CI)
- Working knowledge of Snowflake architecture and migration patterns
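As one concrete flavor of the performance tuning listed above, the sketch below compacts a Delta table and Z-orders it by a frequently filtered column, then inspects a query plan. The table and column names are assumptions for illustration.

```python
# Hypothetical performance-tuning pass on a large Delta table.
# Table and column names are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and co-locate rows by a common filter column,
# so selective queries can skip irrelevant data files.
spark.sql("OPTIMIZE silver.orders ZORDER BY (customer_id)")

# Inspect the physical plan of a hot query to confirm file pruning.
spark.sql(
    "SELECT count(*) FROM silver.orders WHERE customer_id = 42"
).explain(mode="formatted")
```

Z-ordering pays off when queries filter on the chosen column, since Databricks can skip data files whose statistics exclude the predicate.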
Cost Optimization Responsibilities
- Monitoring and analyzing Databricks DBU (Databricks Unit) consumption and cloud infrastructure costs
- Implementing cost optimization strategies including cluster right-sizing, autoscaling configurations, and spot instance usage (see the cluster-spec sketch after this list)
- Optimizing job scheduling to leverage off-peak pricing and minimize idle cluster time
- Establishing cost allocation tags and chargeback models for different teams and projects
- Conducting regular cost reviews and providing recommendations for efficiency improvements
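To make these levers concrete, here is a hedged sketch that creates a cost-conscious cluster through the Databricks Clusters REST API, combining autoscaling, spot-with-fallback instances, auto-termination, and cost-allocation tags. The workspace host, token handling, and every value in the spec are illustrative assumptions.

```python
# Sketch: create a cost-conscious Databricks cluster via the REST API.
# Host, token, and all spec values are illustrative assumptions.
import os
import requests

host = os.environ["DATABRICKS_HOST"]   # e.g. https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]

cluster_spec = {
    "cluster_name": "etl-nightly",                       # hypothetical name
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "autoscale": {"min_workers": 1, "max_workers": 8},   # right-size under load
    "autotermination_minutes": 20,                       # don't pay for idle clusters
    "aws_attributes": {
        "availability": "SPOT_WITH_FALLBACK",            # spot pricing, on-demand fallback
        "first_on_demand": 1,                            # keep the driver on-demand
        "spot_bid_price_percent": 100,
    },
    "custom_tags": {                                     # enable cost allocation / chargeback
        "team": "data-platform",
        "project": "orders-etl",
    },
}

resp = requests.post(
    f"{host}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {token}"},
    json=cluster_spec,
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["cluster_id"])
```

Keeping the driver on-demand protects a job from spot reclamation of its coordinator, while custom_tags flow through to cloud billing for chargeback reporting.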
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data engineering, Databricks, Delta Lake, Python, SQL, DBT, CI/CD, Git, data quality, performance tuning
Soft skills
mentoring, collaboration, communication, problem-solving, documentation