
Databricks Developer – AWS
Elfonze Technologies
Full-time
Location Type: Remote
Location: India
About the role
- Design and implement scalable ETL/ELT pipelines in Databricks using PySpark, Spark SQL, or Scala (see the batch sketch after this list).
- Build and manage Delta Lake tables with ACID transactions, schema evolution, time travel, Z-Ordering, and OPTIMIZE/VACUUM routines.
- Develop batch and near-real-time pipelines (e.g., Structured Streaming, Kafka/MSK, Kinesis); see the streaming sketch after this list.
- Implement robust data quality checks (e.g., expectations/constraints, anomaly detection) and unit/integration tests; see the quality-check sketch after this list.
- Integrate Databricks with Amazon S3 (bronze/silver/gold zones), AWS Glue Data Catalog or Unity Catalog, and Lake Formation where applicable.
- Configure IAM roles and instance profiles for secure cluster access to S3 and other AWS services.
- Orchestrate jobs and workflows using Databricks Workflows, AWS Step Functions, or Airflow.
- Implement CI/CD using GitHub; manage code in Git repositories.
- Build observability with CloudWatch, Databricks audit logs, metrics, cost monitors, and alerting.
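
A minimal PySpark sketch of the batch pattern above. The bucket path and table names (s3://example-bucket/..., analytics.silver_orders) are hypothetical; OPTIMIZE ... ZORDER BY and VACUUM are standard Databricks SQL maintenance commands.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Hypothetical bucket and table names, for illustration only.
raw_path = "s3://example-bucket/bronze/orders/"
silver_table = "analytics.silver_orders"

# Batch ETL: read raw JSON, light cleansing, append to a Delta table.
raw = (
    spark.read.format("json").load(raw_path)
    .dropDuplicates(["order_id"])
    .withColumn("ingested_at", F.current_timestamp())
)
raw.write.format("delta").mode("append").saveAsTable(silver_table)

# Routine maintenance: compact small files and co-locate rows on a common
# filter column, then drop data files past the 7-day retention window.
spark.sql(f"OPTIMIZE {silver_table} ZORDER BY (order_id)")
spark.sql(f"VACUUM {silver_table} RETAIN 168 HOURS")
```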
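A sketch of the near-real-time pattern, assuming a hypothetical Kafka/MSK broker (broker1.example:9092), topic (orders), payload schema, and checkpoint path; it decodes JSON events and appends them to a Delta table.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.getOrCreate()

# Hypothetical event schema for the topic payload.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read from Kafka/MSK and decode the binary value column into typed fields.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1.example:9092")  # assumption
    .option("subscribe", "orders")                              # assumption
    .option("startingOffsets", "latest")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Append to Delta; the checkpoint lets the stream restart without data loss.
(
    events.writeStream.format("delta")
    .option("checkpointLocation", "s3://example-bucket/_checkpoints/orders/")
    .outputMode("append")
    .trigger(processingTime="1 minute")
    .toTable("analytics.bronze_orders")
)
```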
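A constraint-style quality check in plain PySpark (not tied to any particular expectations framework), reusing the hypothetical tables above: rows failing the rules are quarantined, and the job fails if violations exceed a threshold.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.table("analytics.silver_orders")  # hypothetical table from above

# Constraint-style expectations: non-null key, non-negative amount.
rules = F.col("order_id").isNotNull() & (F.col("amount") >= 0)
valid = df.filter(rules)
quarantine = df.exceptAll(valid)  # everything that failed any rule

# Fail loudly above a 1% violation rate; otherwise route bad rows aside.
total, bad = df.count(), quarantine.count()
if total > 0 and bad / total > 0.01:
    raise ValueError(f"Data quality breach: {bad}/{total} rows failed checks")
quarantine.write.format("delta").mode("append").saveAsTable(
    "analytics.quarantine_orders"
)
```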
Requirements
- Bachelor’s in Computer Science, Engineering, or related field (or equivalent experience).
- 3–7 years in data engineering, including 2+ years hands-on with Databricks on AWS.
- Strong skills in Apache Spark (optimizations, joins, partitioning, caching); see the join-tuning sketch after this list.
- Solid experience with Delta Lake, S3, Glue Catalog / Unity Catalog, and Lakehouse design.
- Proficiency with SQL, performance tuning, and cost optimization on Databricks.
- Familiarity with AWS services: S3, IAM, Glue, Lambda, CloudWatch, KMS, Step Functions, MSK/Kinesis, VPC.
- Version control (Git) and CI/CD for Databricks (e.g., Repos, Databricks CLI, Terraform, GitHub Actions/CodePipeline).
- Experience in Agile/Scrum delivery.
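
To illustrate the join and partitioning skills above, a short PySpark sketch with hypothetical fact and dimension tables: broadcasting the small dimension avoids a shuffle, and partitioning the output on a common filter column keeps downstream reads cheap.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical fact and dimension tables sharing a customer_id key.
facts = spark.table("analytics.silver_orders")
dims = spark.table("analytics.dim_customers")

# Broadcasting the small dimension turns a shuffle (sort-merge) join into a
# map-side hash join, which is usually far cheaper on a large fact table.
joined = facts.join(F.broadcast(dims), "customer_id", "left")

# Partition the output on a common filter column so queries can prune files;
# cache a DataFrame only when it is reused several times within one job.
(
    joined.write.format("delta")
    .mode("overwrite")
    .partitionBy("order_date")
    .saveAsTable("analytics.gold_orders")
)
```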
Benefits
- Health insurance
- Flexible work arrangements
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
ETL, ELT, PySpark, Spark SQL, Scala, Delta Lake, SQL, Apache Spark, CI/CD, Agile