Design, implement, and maintain high-quality data infrastructure services, including but not limited to Data Lake, Kafka, Amazon Kinesis, and data access layers.
Develop robust and efficient DBT models and jobs to support analytics reporting and machine learning modeling.
Collaborate closely with the Analytics team on data modeling, reporting, and data ingestion.
Create scalable real-time streaming pipelines and offline ETL pipelines.
Design, implement, and manage a data warehouse that provides secure access to large datasets.
Continuously improve data operations by automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
Create engineering documentation for design, runbooks, and best practices.
Requirements
A minimum of 8 years of industry experience in the data infrastructure/data engineering domain.
A minimum of 8 years of experience with Python and SQL. Java experience is a plus.
A minimum of 4 years of industry experience using DBT.
A minimum of 4 years of industry experience using Snowflake and its basic features.
A minimum of 4 years of industry experience using Infrastructure as Code tools, specifically CDK and Terraform.
Strong written and verbal communication skills for cross-team collaboration.
Familiarity with AWS services, with industry experience using Lambda, Step Functions, Glue, RDS, EKS, DMS, EMR, etc.
Industry experience with different big data platforms and tools such as Kafka, Hadoop, Hive, Spark, Cassandra, Airflow, etc.
Industry experience working with relational and NoSQL databases in a production environment.
Strong fundamentals in data structures, algorithms, and design patterns.
Benefits
Competitive pay
100% company-paid medical, dental, and vision
401(k) + company equity
Unlimited paid time off + 13 company-paid holidays
Parental leave
Flex Cares Program
Free Flex subscription
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, SQL, DBT, Snowflake, Infrastructure as Code, CDK, Terraform, data modeling, ETL, data structures