Salary
💰 $99,600 - $146,800 per year
Tech Stack
Airflow, Apache, AWS, Cloud, Cyber Security, Docker, Kafka, Kubernetes, NoSQL, Pandas, Postgres, Python, RabbitMQ, SQL, Terraform
About the role
- Design, build, and fine-tune data pipelines that load data into the data warehouse from sources such as databases, cloud storage, and streaming platforms.
- Analyze and refactor existing Python-based file ingestion processes.
- Troubleshoot pipeline failures and implement quality checks, retries, and safeguards to prevent recurrence (see the sketch after this list).
- Reduce manual intervention in data pipelines through automation.
- Create detailed runbooks documenting existing and new data platform processes.
- Add unit tests to improve reliability and maintainability.
- Design modular, reusable Python IaC code that deploys AWS resources and builds data platform components.
- Assist with building and validating data architecture.
- Contribute to the design, development, testing, deployment, and support of data pipelines and warehouses in a cloud environment.
- Build and maintain the data platform with a focus on scalability and validation.
- Advocate engineering best practices, including the use of design patterns, code review, and automated unit/functional testing.
- Collaborate effectively with product management, technical program management, operations, and other engineering teams.
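
To make the retry-and-safeguard responsibility concrete, here is a minimal Python sketch of the pattern, assuming a hypothetical load step passed in as a callable and simple exponential backoff (all names are illustrative, not from this role):

```python
import logging
import time
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def load_with_retries(
    load: Callable[[], int],  # hypothetical ingestion step returning a row count
    max_attempts: int = 3,
    base_delay: float = 2.0,
) -> int:
    """Retry a flaky load step with exponential backoff, failing hard at the end."""
    for attempt in range(1, max_attempts + 1):
        try:
            rows = load()
            if rows == 0:
                # quality check: treat an empty load as a failure worth retrying
                raise ValueError("no rows loaded")
            return rows
        except Exception as exc:
            log.warning("attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:
                # safeguard: surface the failure rather than masking it
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```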
Requirements
- 3+ years of related experience in cloud development, crafting and implementing solutions using AWS services, with an understanding of IAM, S3, ECS, Lambda, and SNS
- Strong programming skills in Python and libraries such as pandas, pydantic, and polars (a validation sketch follows this list)
- Experience using source control systems (GitLab, GitHub) and CI/CD pipelines
- Experience working with IaC tools such as Terraform, CloudFormation, or AWS CDK
- Experience in SQL (Postgres, Snowflake, etc.) and an understanding of trade-offs between data storage systems and architectures (data warehouses, SQL vs. NoSQL, partitioning, etc.)
- Experience with scripting in Bash or other shells
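
As a rough illustration of the pandas/pydantic requirement, a row-validation sketch that splits a frame into valid and rejected records (the `Order` schema and its fields are hypothetical, not part of this posting):

```python
import pandas as pd
from pydantic import BaseModel, ValidationError

class Order(BaseModel):
    """Hypothetical row schema; a real pipeline would define its own."""
    order_id: int
    amount: float
    currency: str

df = pd.DataFrame(
    {"order_id": [1, 2], "amount": [19.99, "oops"], "currency": ["USD", "USD"]}
)

valid, rejected = [], []
for record in df.to_dict(orient="records"):
    try:
        valid.append(Order(**record))  # pydantic coerces and validates each field
    except ValidationError as exc:
        rejected.append((record, exc.errors()))  # keep the reason for the runbook

print(f"{len(valid)} valid rows, {len(rejected)} rejected")
```

Run as-is, this reports one valid row and one rejected row (the non-numeric `amount`), the kind of quality gate the pipeline bullets above describe.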