Salary
💰 $152,200 - $213,900 per year
Tech Stack
Airflow, Amazon Redshift, Android, Apache, AWS, BigQuery, Cloud, DynamoDB, Go, GraphQL, Kubernetes, Microservices, Python, Redis, Scala, Spark, SQL, Terraform
About the role
- Architect, design, and code shared libraries in Scala and Python that abstract complex business logic for data pipelines
- Build software for a large-scale data processing ecosystem supporting real-time and batch pipelines for analytics, data science, and operations
- Architect and build modularized tooling for environment simplification, migration, and maintenance
- Maintain software engineering and architecture best practices, standards, and a culture of quality, innovation, and experimentation
- Evangelize and evolve the platform, best practices, and data-driven decisions; identify use cases and drive adoption
- Build out observability, alerting, logging, and system control plane for diagnosing issues across data pipelines
- Maintain and expand existing software deployments while meeting strict uptime SLAs
- Develop and document internal and external standards and best practices for deployments, configurations, naming, and partitioning strategies
- Maintain detailed documentation to support data quality and data governance requirements
- Participate in agile/scrum ceremonies and collaborate with product managers, architects, and other engineers
- Focus on privacy solutions and data subject rights automation, working cross-functionally with analytics, data architecture, legal and security partners
Requirements
- 7+ years of software engineering experience in the data space
- Strong algorithmic problem-solving expertise
- Strong fundamental programming skills in Scala and Python
- Good understanding of AWS or other cloud provider resources (e.g., S3)
- Strong SQL skills and the ability to problem-solve creatively and dive deep into our data and software ecosystem
- Hands-on production environment experience with distributed processing systems such as Apache Spark
- Hands-on production experience with workflow orchestration systems such as Airflow for creating and maintaining data pipelines
- Scripting language experience (e.g., Bash, PowerShell)
- Experience with technologies such as OneTrust, Databricks, Jupyter, Snowflake, Redshift, Airflow, DynamoDB, Redis, Kubernetes, Kinesis, REST APIs, Terraform, Go
- Proficiency in SQL, Python, Scala, and other programming languages
- Willingness and ability to learn and pick up new skill sets
- Self-starting problem solver with an eye for detail and excellent analytical and communication skills
- Bachelor’s degree in Computer Science, Information Systems, Software, Electrical or Electronics Engineering, or comparable field of study, and/or equivalent work experience
- Master’s degree in Computer Science or Information Systems preferred
- Experience with at least one major Massively Parallel Processing (MPP) or cloud database technology (Snowflake, Redshift, BigQuery) preferred
- Experience in developing APIs with GraphQL preferred
- Experience in developing microservices preferred
- Deep understanding of AWS or other cloud providers, as well as infrastructure as code, preferred
- Familiarity with data modeling techniques and data warehousing standard methodologies and practices preferred
- Familiarity with Scrum and Agile methodologies preferred
- Familiarity with privacy regulations and/or data subject rights preferred