eSimplicity

Staff Data Engineer

full-time

Location Type: Remote

Location: Remote • Maryland • 🇺🇸 United States

Salary

💰 $149,200 - $172,600 per year

Job Level

Lead

Tech Stack

Airflow • Amazon Redshift • AWS • Cassandra • Cloud • EC2 • ETL • Hadoop • Java • Kafka • NoSQL • Postgres • Python • Scala • Spark • SQL

About the role

  • Identifies and owns all technical solution requirements in developing enterprise-wide data architecture.
  • Creates project-specific technical design, product and vendor selection, application, and technical architectures.
  • Provides subject matter expertise on data and data pipeline architecture and leads the decision process to identify the best options.
  • Serves as the owner of complex data architectures, with an eye toward constant reengineering and refactoring to ensure the simplest and most elegant system possible to accomplish the desired need.
  • Ensures strategic alignment of technical design and architecture with business growth and direction, and stays on top of emerging technologies.
  • Develops and manages product roadmaps, backlogs, and measurable success criteria and writes user stories.
  • Expands and optimizes our data and data pipeline architecture, as well as data flow and collection for cross-functional teams.
  • Supports software developers, database architects, data analysts, and data scientists on data initiatives and ensures that the optimal data delivery architecture is consistent across ongoing projects.
  • Develops new pipelines and maintains existing ones; updates Extract, Transform, Load (ETL) processes; develops new ETL features; and builds proofs of concept (PoCs) with Redshift Spectrum, Databricks, etc. (a minimal pipeline sketch follows this list).
  • Implements, with the support of project data specialists, large-dataset engineering: data augmentation, data quality analysis, data analytics (anomalies and trends), data profiling, and data algorithms; measures and develops data maturity models; and develops data strategy recommendations.
  • Assembles large, complex data sets that meet functional and non-functional business requirements.
  • Identifies, designs, and implements internal process improvements, including re-designing data infrastructure for greater scalability, optimizing data delivery, and automating manual processes.
  • Builds the infrastructure required for optimal extraction, transformation, and loading of data from various data sources using AWS and SQL technologies.
  • Builds analytical tools that use the data pipeline to provide actionable insight into key business performance metrics, including operational efficiency and customer acquisition.
  • Works with data, design, product, and government stakeholders and assists them with data-related technical issues.
  • Writes unit and integration tests for all data processing code.
  • Works with DevOps engineers on continuous integration (CI), continuous delivery (CD), and infrastructure as code (IaC).
  • Reads specs and translates them into code and design documents.
  • Performs code reviews and develops processes for improving code quality.
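
For illustration only, here is a minimal sketch of the kind of orchestrated ETL pipeline these responsibilities describe, written as an Airflow DAG in Python. The DAG id, task names, and stubbed extract/transform/load steps are hypothetical and assume Airflow 2.4+ parameter names; this is not eSimplicity's actual pipeline.

```python
# Hypothetical sketch only: a three-step extract -> transform -> load DAG.
# Assumes Airflow 2.4+ (the `schedule` parameter); all names are illustrative.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Pull a batch of raw records from an upstream source (stubbed)."""
    return [{"id": 1, "amount": "42.0"}, {"id": 2, "amount": "17.5"}]


def transform(ti=None):
    """Validate and type-cast the extracted records (stubbed)."""
    rows = ti.xcom_pull(task_ids="extract")
    return [{**row, "amount": float(row["amount"])} for row in rows]


def load(ti=None):
    """Hand the cleaned rows to the warehouse, e.g. via a Redshift COPY (stubbed)."""
    print(ti.xcom_pull(task_ids="transform"))


with DAG(
    dag_id="example_daily_etl",        # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```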

Requirements

  • All candidates must pass a Public Trust clearance through the U.S. Federal Government.
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field; or, in lieu of a degree, 10 additional years of relevant professional experience and 8 years of specialized experience may be substituted.
  • 8+ years of total professional experience in the technology or data engineering field.
  • Extensive data pipeline experience using Python, Java, and cloud technologies.
  • Expert data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.
  • Self-sufficient and comfortable supporting the data needs of multiple teams, systems, and products.
  • Experienced in designing data architecture for shared services, scalability, and performance.
  • Experienced in designing data services including API, metadata, and data catalog.
  • Experienced in data governance processes to ingest (batch and stream), curate, and share data with upstream and downstream data users.
  • Ability to build and optimize data sets, ‘big data’ data pipelines, and architecture.
  • Ability to perform root cause analysis on external and internal processes and data to identify opportunities for improvement and answer questions.
  • Excellent analytic skills for working with unstructured datasets.
  • Ability to build processes that support data transformation, workload management, data structures, dependency, and metadata.
  • Demonstrated understanding of and experience with standard tooling: big data tools such as Kafka, Spark, and Hadoop; relational SQL and NoSQL databases, including Postgres and Cassandra; workflow management and pipeline tools such as Airflow, Luigi, and Azkaban; AWS cloud services, including Redshift, RDS, EMR, and EC2; stream-processing systems such as Spark Streaming and Storm; and object-oriented and functional scripting languages, including Scala, C++, Java, and Python (a brief data-profiling sketch follows this list).
  • Flexible and willing to accept a change in priorities as necessary.
  • Ability to work in a fast-paced, team-oriented environment.
  • Experience with Agile methodology, using test-driven development.
  • Experience with Atlassian Jira/Confluence.
  • Excellent command of written and spoken English.
  • Ability to obtain and maintain a Public Trust clearance; must reside in the United States.
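
As a companion to the tooling requirements above, here is a minimal PySpark sketch of the kind of first-pass data-quality profiling the posting mentions (null rates and duplicate keys). The S3 path and the claim_id column are hypothetical stand-ins, not details from the posting.

```python
# Hypothetical sketch only: first-pass data-quality profiling with PySpark.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_profile_example").getOrCreate()

df = spark.read.parquet("s3://example-bucket/claims/")  # hypothetical source
total = df.count()

# Null rate per column: a quick way to spot a broken upstream feed.
null_rates = df.select(
    [(F.count(F.when(F.col(c).isNull(), c)) / total).alias(c) for c in df.columns]
)
null_rates.show()

# Duplicate keys: rows sharing an id that the data contract says should be unique.
duplicates = df.groupBy("claim_id").count().filter(F.col("count") > 1)
print(f"duplicate claim_id values: {duplicates.count()} (of {total} rows)")
```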

Benefits

  • Full healthcare benefits

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
data architecture • data pipeline • ETL • data quality analysis • data analytics • data governance • data transformation • root cause analysis • big data • Agile methodology
Soft skills
self-sufficient • analytical skills • flexibility • team-oriented • communication
Certifications
Bachelor's degree in Computer Science • Public Trust clearance