Tech Stack
AWS, Cloud, Docker, Kubernetes, Python, RDBMS, Scala, SDLC, SQL
About the role
- Identify, analyze, organize, and store raw data from various sources (RDBMS, flat files, APIs, etc.)
- Develop, test, and deploy scalable data pipelines for critical business needs
- Evaluate, maintain, and enhance current data pipelines and architecture
- Maintain code repositories, adhering to proper branching flows
- Build and maintain CI/CD frameworks and architecture
- Deploy and maintain testing frameworks/suites for data pipelines and CI
- Develop, deploy, and maintain APIs and endpoints for analyst and system use
- Deploy and maintain Kubernetes cluster(s)
- Conduct peer code reviews
- Collaborate with business, development, and analytics teams to gather requirements
- Participate in the data governance and stewardship program to enhance data control and dissemination using best practices
- Develop and maintain technical documentation for pipeline architecture and tooling
Requirements
- Strong object-oriented/functional programming skills
- Python/Scala programming experience
- Proficient in ANSI SQL
- Familiarity with orchestration tools (Dagster preferred), Docker/Podman, CI/CD tools, and version control (git)
- Familiarity and/or experience with Lightweight Directory Access Protocol (LDAP)
- Experience with and/or knowledge of Agile development methodologies and the SDLC
- Excellent problem-solving and analytical skills
- Proven ability to extract data from a variety of sources (Relational/Non-Relational Databases, APIs, FTP/SFTP, etc.)
- Comfortable using cloud technology platforms (AWS preferred)
- Proven ability to take initiative and innovate
- College degree and/or 3-5 years of related experience