
Senior Data Engineer

Zantech

full-time

Posted on:

Location: 🇺🇸 United States • Washington, District of Columbia


Job Level

Senior

Tech Stack

Apache, AWS, Distributed Systems, ETL, PySpark, Python, Spark, SQL

About the role

  • Develop Spark applications in AWS Databricks, using Python, PySpark, and SQL to meet project requirements and data processing needs.
  • Design and implement robust ETL pipelines using Apache Spark in Databricks, ensuring data integrity, efficiency, and scalability.
  • Collaborate with cross-functional teams to understand business requirements and design solutions that leverage structured, semi-structured, and unstructured data effectively.
  • Write high-quality code in a timely manner, adhering to coding standards, best practices, and established development processes.
  • Utilize version control systems like Git to manage codebase and ensure seamless collaboration within the team.
  • Merge and consolidate various data sets using PySpark code, enabling streamlined data processing and analysis (a minimal sketch follows this list).
  • Work with APIs to facilitate data ingestion from diverse sources and integrate data into the ecosystem.
  • Apply expertise in Databricks Delta Lake to optimize data storage, query performance, and overall data processing efficiency.
  • Demonstrate knowledge of application development life cycles and promote continuous integration/deployment practices for efficient project delivery.
  • Perform query tuning, performance tuning, troubleshooting, and debugging for Spark and other big data solutions to enhance system efficiency and reliability.
  • Exhibit expertise in database concepts and SQL to efficiently manipulate, process, and extract insights from complex datasets.
  • Apply database engineering and design principles to ensure data infrastructure meets high standards of scalability, reliability, and performance.
  • Leverage previous experience in handling large-scale distributed systems to deliver and operate data solutions efficiently.
  • Demonstrate a successful track record of extracting value from extensive, disconnected datasets to drive data-driven decision-making.
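
For illustration only, a minimal PySpark sketch of the kind of work described above: merging two source data sets and persisting the result to a Delta Lake table in Databricks. The table and column names (orders_raw, customers_raw, customer_id, orders_enriched) are hypothetical placeholders, not taken from the posting.

```python
# Illustrative sketch only: table and column names below are hypothetical,
# not taken from the job posting.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

# Read two source data sets (e.g., landed from an API pull or batch extract).
orders = spark.read.table("raw.orders_raw")
customers = spark.read.table("raw.customers_raw")

# Merge and consolidate: join on a shared key and derive a cleaned date column.
enriched = (
    orders.join(customers, on="customer_id", how="left")
          .withColumn("order_date", F.to_date("order_ts"))
)

# Persist to a Delta Lake table, partitioned to help downstream query performance.
(enriched.write
         .format("delta")
         .mode("overwrite")
         .partitionBy("order_date")
         .saveAsTable("curated.orders_enriched"))
```
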

Requirements

  • A minimum of 8 years of hands-on experience with Spark, with proficiency in Python or PySpark.
  • Databricks Certified Data Engineer Associate or Professional Certification preferred.
  • Strong knowledge of the Databricks platform and previous experience working with it.
  • Extensive experience with Apache Spark and a proven history of successful development in this environment.
  • Proficiency in at least one programming language (Python, PySpark).
  • Previous experience in ETL and data application development, coupled with expertise in version control systems like Git.
  • Ability to write PySpark code for data merging and transformation.
  • Experience working with APIs for data ingestion and integration.
  • Familiarity with Databricks Delta Lake and expertise in query optimization techniques.
  • Sound understanding of application development lifecycles and continuous integration/deployment practices.
  • Proven experience in query tuning, performance tuning, troubleshooting, and debugging Spark and other big data solutions (see the sketch after this list).
  • Solid knowledge of database concepts and SQL.
  • Strong background in handling large and complex datasets from various sources and databases.
  • Proficient understanding of database engineering and design principles.
  • Required Security Clearance: US Citizenship and the ability to obtain and maintain an active Public Trust or higher clearance, per contract requirements.
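
As a rough illustration of the query- and performance-tuning experience listed above, a short PySpark sketch using a broadcast join hint, plan inspection, and caching. The table and column names (orders_enriched, country_codes, country_code, amount) are hypothetical, not taken from the posting.

```python
# Illustrative sketch only: table and column names below are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

facts = spark.read.table("curated.orders_enriched")   # large fact table
dims = spark.read.table("ref.country_codes")          # small lookup table

# Broadcast the small dimension table so Spark avoids a shuffle-heavy sort-merge join.
joined = facts.join(F.broadcast(dims), on="country_code", how="left")

# Inspect the physical plan to confirm a broadcast hash join was selected.
joined.explain(mode="formatted")

# Cache an intermediate result that several downstream aggregations reuse.
joined.cache()
daily = joined.groupBy("order_date").agg(F.sum("amount").alias("daily_amount"))
daily.show()
```
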