Elfonze Technologies

Databricks Developer – AWS

Full-time

Location Type: Remote

Location: India

About the role

  • Design and implement scalable ETL/ELT pipelines in Databricks using PySpark/Spark SQL/Scala.
  • Build and manage Delta Lake tables with ACID transactions, schema evolution, time travel, Z-Ordering, and OPTIMIZE/VACUUM routines.
  • Develop batch and near-real-time pipelines (e.g., Structured Streaming, Kafka/MSK, Kinesis).
  • Implement robust data quality checks (e.g., expectations/constraints, anomaly detection) and unit/integration tests.
  • Integrate Databricks with Amazon S3 (bronze/silver/gold zones), AWS Glue Data Catalog or Unity Catalog, and Lake Formation where applicable.
  • Configure IAM roles & instance profiles.
  • Orchestrate jobs and workflows using Databricks Workflows, AWS Step Functions, and/or Airflow.
  • Implement CI/CD using GitHub; manage repos with Git.
  • Set up observability using CloudWatch, Databricks audit logs, metrics, cost monitors, and alerting.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 3–7 years of experience in data engineering, including 2+ years hands-on with Databricks on AWS.
  • Strong Apache Spark skills (optimizations, joins, partitioning, caching).
  • Solid experience with Delta Lake, S3, Glue Catalog / Unity Catalog, and Lakehouse design.
  • Proficiency with SQL, performance tuning, and cost optimization on Databricks.
  • Familiarity with AWS services: S3, IAM, Glue, Lambda, CloudWatch, KMS, Step Functions, MSK/Kinesis, VPC.
  • Experience with version control (Git) and CI/CD for Databricks (e.g., Repos, Databricks CLI, Terraform, GitHub Actions/CodePipeline).
  • Experience in Agile/Scrum delivery.

Benefits

  • Health insurance
  • Flexible work arrangements