Salary
💰 $123,656 per year
Tech Stack
AWS, Cloud, Docker, ETL, Java, Kafka, Python, SQL
About the role
- OVERVIEW: Design, build, and implement sophisticated, generalized, large-scale data pipelines using Apache NiFi for downstream analytics and data science for our Sport Performance products.
- Design and develop scalable NiFi ingestion pipelines within AWS cloud services to consume real-time and batch data from external sources.
- Design and create data models for use throughout the ETL system.
- Utilize Kafka to efficiently and effectively store and move data throughout the data pipeline and for downstream data science and analytics usage (see the Kafka producer sketch after this list).
- POSITION: Permits telecommuting from anywhere in the U.S.
- THE CHALLENGE: Ensure the seamless integration of AWS-based tools for data storage, processing, and analytics.
- Responsible for ETL development and warehousing using Python and Java.
- Create data pipeline triggers and filters within the ETL (extract, transform, load) process to optimize the flow of data through the system and resource usage.
- Implement monitoring and error handling for all new parts of the data pipeline to ensure observability and alerting are available.
- Establish rigorous unit testing across the data pipeline to ensure robustness of the system.
- Design data architecture and data models for both internal and external representations of data.
- Build the data transforms within the data pipeline to convert data from external to internal representations.
- Conduct data analytics and debug bad data by writing SQL queries (see the data-quality query sketch after this list).
- Build automated data cleaning to remove bad or unusable data before it reaches downstream consumers, with logging to understand the frequency and depth of the underlying issues (see the cleaning sketch after this list).
- Collaborate with other engineering teams to adopt standard methodologies, drive scalability, and increase consistency across systems.
- Maintain awareness of company standards and technology guidance; use JIRA, an Agile project management tool, to ensure efficient data development; collaborate with peers to align projects with overall direction.
- Follow best practices across Data Engineering to ensure scalable, consistent data architecture and system.
- Utilize Java to build data processors in the NiFi framework (see the processor sketch after this list).
- Utilize Docker to ensure a consistent, repeatable, and isolated environment for software development and testing.
- Work in a self-driven, independent fashion to meet Sport-driven deadlines.
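
To make the responsibilities above concrete, here are a few minimal sketches. All endpoints, topic names, table names, and field values in them are hypothetical illustrations, not details from this posting.

A Kafka producer publishing an event into the pipeline might look like this, assuming a JSON string payload and a hypothetical sport-events topic:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address, topic, key, and payload below are placeholders.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish one event; a real pipeline would batch and handle retries.
            producer.send(new ProducerRecord<>("sport-events", "athlete-42", "{\"hr\": 144}"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            // Surface failures so monitoring/alerting can pick them up.
                            exception.printStackTrace();
                        }
                    });
            producer.flush();
        }
    }
}
```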
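For the SQL analytics and debugging work, a data-quality query run over JDBC might look like the sketch below; the connection string, the events table, and the heart_rate range check are all assumptions made for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class BadDataReport {
    public static void main(String[] args) throws Exception {
        // Connection details and schema are hypothetical.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/warehouse", "analyst", "secret");
             Statement stmt = conn.createStatement();
             // Count rows that violate basic expectations, grouped by source,
             // to see where the bad data is coming from.
             ResultSet rs = stmt.executeQuery(
                 "SELECT source, COUNT(*) AS bad_rows " +
                 "FROM events " +
                 "WHERE heart_rate IS NULL OR heart_rate NOT BETWEEN 25 AND 250 " +
                 "GROUP BY source ORDER BY bad_rows DESC")) {
            while (rs.next()) {
                System.out.printf("%s: %d bad rows%n",
                        rs.getString("source"), rs.getLong("bad_rows"));
            }
        }
    }
}
```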
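Automated cleaning with logging could take a shape like this; the record structure and the heart_rate field are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;
import java.util.stream.Collectors;

public class RecordCleaner {
    private static final Logger LOG = Logger.getLogger(RecordCleaner.class.getName());

    /** Drops records with missing or out-of-range values, logging how many were removed and why. */
    public static List<Map<String, Object>> clean(List<Map<String, Object>> records) {
        AtomicLong missing = new AtomicLong();
        AtomicLong outOfRange = new AtomicLong();

        List<Map<String, Object>> kept = records.stream()
                .filter(r -> {
                    Object hr = r.get("heart_rate"); // field name is illustrative
                    if (hr == null) { missing.incrementAndGet(); return false; }
                    int v = ((Number) hr).intValue();
                    if (v < 25 || v > 250) { outOfRange.incrementAndGet(); return false; }
                    return true;
                })
                .collect(Collectors.toList());

        // Log counts so the frequency and depth of upstream issues can be tracked.
        LOG.info(String.format("cleaned batch: kept=%d missing=%d out_of_range=%d",
                kept.size(), missing.get(), outOfRange.get()));
        return kept;
    }
}
```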
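Finally, a bare-bones custom NiFi processor in Java, of the kind the role describes, might look like this sketch (the attribute name and value are placeholders; a production processor would also declare property descriptors and a failure relationship):

```java
import java.util.Set;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.AbstractProcessor;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.Relationship;
import org.apache.nifi.processor.exception.ProcessException;

public class TagSourceProcessor extends AbstractProcessor {

    static final Relationship REL_SUCCESS = new Relationship.Builder()
            .name("success")
            .description("FlowFiles tagged with their source system")
            .build();

    @Override
    public Set<Relationship> getRelationships() {
        return Set.of(REL_SUCCESS);
    }

    @Override
    public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
        FlowFile flowFile = session.get();
        if (flowFile == null) {
            return; // nothing queued on the incoming connection
        }
        // Attach a routing attribute; the name/value here are illustrative.
        flowFile = session.putAttribute(flowFile, "source.system", "external-feed");
        session.transfer(flowFile, REL_SUCCESS);
    }
}
```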
Requirements
- Master’s degree in computer science, computer engineering, or a closely related field
- 1 year of experience as a data engineer or in a related occupation
- Must possess 1 year of experience with: Python, Java, Kafka, AWS, and Docker
- ETL Development and Warehousing
- Analytics and debugging using SQL
- Agile development environment
- Designing data architecture