
Staff Data Engineer
eSimplicity
full-time
Location Type: Remote
Location: Remote • Maryland • 🇺🇸 United States
Salary
💰 $149,200 - $172,600 per year
Job Level
Lead
Tech Stack
Airflow, Amazon Redshift, AWS, Cassandra, Cloud, EC2, ETL, Hadoop, Java, Kafka, NoSQL, Postgres, Python, Scala, Spark, SQL
About the role
- Identifies and owns all technical solution requirements in developing enterprise-wide data architecture.
- Creates project-specific technical design, product and vendor selection, application, and technical architectures.
- Provides subject matter expertise on data and data pipeline architecture and leads the decision process to identify the best options.
- Serves as the owner of complex data architectures, with an eye toward constant reengineering and refactoring to ensure the simplest and most elegant system that accomplishes the desired outcome.
- Ensures strategic alignment of technical design and architecture with business growth and direction, and stays current with emerging technologies.
- Develops and manages product roadmaps, backlogs, and measurable success criteria and writes user stories.
- Expands and optimizes our data and data pipeline architecture, and optimizes data flow and collection for cross-functional teams.
- Supports software developers, database architects, data analysts, and data scientists on data initiatives and ensures that the optimal data delivery architecture is consistent across ongoing projects.
- Develops new pipelines and maintains existing ones; updates Extract, Transform, Load (ETL) processes; develops new ETL features; builds PoCs with Redshift Spectrum, Databricks, etc. (a minimal sketch follows this list).
- Implements, with the support of project data specialists, large-dataset engineering: data augmentation, data quality analysis, data analytics (anomalies and trends), data profiling, data algorithms, and data maturity models (measurement and development), and develops data strategy recommendations.
- Assembles large, complex data sets that meet functional and non-functional business requirements.
- Identifies, designs, and implements internal process improvements, including re-designing data infrastructure for greater scalability, optimizing data delivery, and automating manual processes.
- Builds the infrastructure required for optimal extraction, transformation, and loading of data from various data sources using AWS and SQL technologies.
- Builds analytical tools that utilize the data pipeline to provide actionable insight into key business performance metrics, including operational efficiency and customer acquisition.
- Works with data, design, product, and government stakeholders and assists them with data-related technical issues.
- Writes unit and integration tests for all data processing code.
- Works with DevOps engineers on CI/CD and infrastructure as code (IaC).
- Reads specs and translates them into code and design documents.
- Performs code reviews and develops processes for improving code quality.
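The responsibilities above center on ETL pipeline development and orchestration. As a rough, hypothetical illustration (not part of the posting), a minimal daily ETL DAG of the kind this role would own might look like the sketch below, assuming Apache Airflow 2.4+; the DAG id, task logic, and sample rows are placeholders.

```python
# Minimal sketch of a daily ETL DAG, assuming Apache Airflow 2.4+.
# The DAG id, task logic, and sample rows are hypothetical placeholders.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def sample_etl():
    @task
    def extract() -> list[dict]:
        # In practice this would pull from a source database or an S3 landing zone.
        return [{"id": 1, "amount": 120.5}, {"id": 2, "amount": -3.0}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Basic data-quality filter standing in for real transformation logic.
        return [r for r in rows if r["amount"] >= 0]

    @task
    def load(rows: list[dict]) -> None:
        # A real pipeline would COPY into Redshift or write partitioned files to S3.
        print(f"loaded {len(rows)} rows")

    load(transform(extract()))


sample_etl()
```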
Requirements
- All candidates must pass a Public Trust clearance through the U.S. Federal Government.
- Bachelor’s degree in Computer Science, Engineering, or a related technical field; or, in lieu of a degree, 10 additional years of relevant professional experience and 8 years of specialized experience may be substituted.
- 8+ years of total professional experience in the technology or data engineering field.
- Extensive data pipeline experience using Python, Java, and cloud technologies.
- Expert data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up.
- Self-sufficient and comfortable supporting the data needs of multiple teams, systems, and products.
- Experienced in designing data architecture for shared services, scalability, and performance.
- Experienced in designing data services, including APIs, metadata, and data catalogs.
- Experienced in data governance processes for ingesting (batch and stream), curating, and sharing data with upstream and downstream data users.
- Ability to build and optimize data sets, ‘big data’ pipelines, and architecture.
- Ability to perform root cause analysis on external and internal processes and data to identify opportunities for improvement and answer questions.
- Excellent analytic skills associated with working on unstructured datasets.
- Ability to build processes that support data transformation, workload management, data structures, dependency management, and metadata.
- Demonstrated understanding of and experience with software and tools, including big data tools such as Kafka, Spark, and Hadoop; NoSQL and relational SQL databases, including Cassandra and Postgres; workflow management and pipeline tools such as Airflow, Luigi, and Azkaban; AWS cloud services including Redshift, RDS, EMR, and EC2; stream-processing systems such as Spark Streaming and Storm; and object-oriented/object function scripting languages including Scala, C++, Java, and Python.
- Flexible and willing to accept a change in priorities as necessary.
- Ability to work in a fast-paced, team-oriented environment.
- Experience with Agile methodology, using test-driven development.
- Experience with Atlassian Jira/Confluence.
- Excellent command of written and spoken English.
- Ability to obtain and maintain a Public Trust clearance; must reside in the United States.
Benefits
- Full healthcare benefits
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data architecture, data pipeline, ETL, data quality analysis, data analytics, data governance, data transformation, root cause analysis, big data, Agile methodology
Soft skills
self-sufficient, analytical skills, flexibility, team-oriented, communication
Certifications
Bachelor's degree in Computer Science, Public Trust clearance