Be a key player in a cross-functional team of engineers who build software for a large-scale data processing ecosystem, supporting real-time and batch data pipelines for analytics, data science, and operations
Architect, design, and code shared libraries in Scala and Python that abstract complex business logic and provide consistent functionality across all data pipelines in the Data organization
Architect and build world-class, modular tooling to understand our complex environments and inform plans for simplification, migration, and maintenance
Maintain software engineering and architecture best practices and standards within the team and wider organization, along with a culture of quality, innovation, and experimentation
Evangelize and evolve the platform, best practices, and data-driven decision making; identify new use cases and features and drive adoption
Build out a robust observability, alerting, logging, and system control plane that allows easy diagnosis of any issues across all our data pipelines
Contribute to maintaining, updating, and expanding existing software deployments while maintaining strict uptime SLAs
Contribute to developing and documenting both internal and external standards and best practices for software deployments, configurations, naming conventions, partitioning strategies, and more
Maintain detailed documentation of your work and changes to support data quality and data governance requirements
Be an active participant in and advocate for agile/scrum ceremonies to collaborate and improve processes for our team
Collaborate with product managers, architects, and other engineers to drive the success of the Foundational Platform
Requirements
7+ years of software engineering experience in the data space
Strong fundamental programming skills in Scala and Python
Good understanding of AWS or other cloud provider resources (e.g., S3)
Strong SQL skills and the ability to creatively solve problems and dive deep into our data and software ecosystem
Hands-on production environment experience with distributed processing systems such as Apache Spark
Hands-on production experience with workflow orchestration systems such as Airflow for creating and maintaining data pipelines
Scripting language experience (e.g., Bash, PowerShell)
Experience with technologies such as OneTrust, Databricks, Jupyter, Snowflake, Redshift, Airflow, DynamoDB, Redis, Kubernetes, Kinesis, REST APIs, Terraform, Go, SQL, Python, Scala, and other programming languages
Willingness and ability to learn and pick up new skillsets
Self-starting problem solver with an eye for detail and excellent analytical and communication skills
Master’s degree in Computer Science or Information Systems preferred
Experience with at least one major Massively Parallel Processing (MPP) or cloud database technology (Snowflake, Redshift, BigQuery)
Experience in developing APIs with GraphQL
Experience in developing microservices
Deep understanding of AWS or other cloud providers, as well as infrastructure as code
Familiarity with Data Modeling techniques and Data Warehousing standard methodologies and practices
Familiarity with Scrum and Agile methodologies
Familiarity with privacy regulations and/or data subject rights
Benefits
A full range of medical, financial, and/or other benefits