Databricks Developer – AWS

Elfonze Technologies

full-time

Posted on: 3/7/2026

Location Type: Remote

Location: India

About the role

Design and implement scalable ETL/ELT pipelines in Databricks using PySpark/Spark SQL/Scala.
Build and manage Delta Lake tables with ACID transactions, schema evolution, time travel, Z-Ordering, and Optimize/Vacuum routines.
Develop batch and near-real-time pipelines (e.g., Structured Streaming, Kafka/MSK, Kinesis).
Implement robust data quality checks (e.g., expectations/constraints, anomaly detection) and unit/integration tests.
Integrate Databricks with Amazon S3 (bronze/silver/gold zones), AWS Glue Data Catalog or Unity Catalog, and Lake Formation where applicable.
Configure IAM roles & instance profiles.
Orchestrate jobs and workflows using Databricks Workflows, AWS Step Functions, or Airflow.
Implement CI/CD using GitHub; manage repos with Git.
Set up observability using CloudWatch, Databricks audit logs, metrics, cost monitors, and alerting.

Requirements

Bachelor’s in Computer Science, Engineering, or related field (or equivalent experience).
3–7 years in data engineering, including 2+ years hands-on with Databricks on AWS.
Strong Apache Spark skills (optimizations, joins, partitioning, caching).
Solid experience with Delta Lake, S3, Glue Catalog / Unity Catalog, and Lakehouse design.
Proficiency with SQL, performance tuning, and cost optimization on Databricks.
Familiar with AWS services: S3, IAM, Glue, Lambda, CloudWatch, KMS, Step Functions, MSK/Kinesis, VPC.
Version control (Git) and CI/CD for Databricks (e.g., Repos, Databricks CLI, Terraform, GitHub Actions/CodePipeline).
Experience in Agile/Scrum delivery.

Applicant Tracking System Keywords

Hard Skills & Tools

ETL, ELT, PySpark, Spark SQL, Scala, Delta Lake, SQL, Apache Spark, CI/CD, Agile