Tech Stack
BigQuery, Cloud, Google Cloud Platform, Java, Microservices, MySQL, NoSQL, Postgres, Python, SQL, Terraform
About the role
- Data Pipeline Architect & Builder: Spearhead the design, development, and maintenance of scalable data ingestion and curation pipelines from diverse sources; ensure data is standardized, high-quality, and optimized for analytical use; leverage Python, SQL, and DBT/Dataform (see the illustrative sketch after this list).
- End-to-End Integration Expert: Apply full-stack skills across end-to-end development, ensuring smooth and reliable data flow from source to insight.
- GCP Data Solutions Leader: Build and manage data platforms using GCP services (BigQuery, Dataflow, Pub/Sub, Cloud Functions) to meet business needs.
- Data Governance & Security Champion: Implement and manage data governance policies, access controls, and security best practices using GCP native security features.
- Data Workflow Orchestrator: Employ Astronomer and Terraform for data workflow management and cloud infrastructure provisioning.
- Performance Optimization Driver: Monitor and improve performance, scalability, and efficiency of data pipelines and storage, optimizing resource utilization and cost-effectiveness.
- Collaborative Innovator: Collaborate with data architects, application architects, service owners, and cross-functional teams to define best practices and design patterns.
- Automation & Reliability Advocate: Proactively automate data platform processes to improve reliability and data quality while minimizing manual intervention.
- Effective Communicator: Communicate complex technical decisions to both technical and non-technical stakeholders.
- Continuous Learner & Business Impact Translator: Stay current with industry trends and translate business requirements into optimized data asset designs and efficient code.
- Documentation & Knowledge Sharer: Develop comprehensive documentation for data engineering processes to promote knowledge sharing and maintainability.
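
For a rough sense of the pipeline work described above, the sketch below is a minimal, hypothetical Python example of a single ingestion step that loads a CSV drop from Cloud Storage into BigQuery with the google-cloud-bigquery client. The bucket, project, dataset, and table names are placeholders for illustration only and are not part of this posting.

```python
# Hypothetical sketch: load a CSV file from Cloud Storage into BigQuery.
# All bucket, project, dataset, and table names are placeholders.
from google.cloud import bigquery


def load_csv_to_bigquery(
    uri: str = "gs://example-landing-bucket/orders/2024-01-01.csv",
    table_id: str = "example-project.curated.orders",
) -> None:
    client = bigquery.Client()

    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # skip the header row
        autodetect=True,      # infer the schema from the file
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )

    # Start the load job and block until it completes.
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    load_job.result()

    table = client.get_table(table_id)
    print(f"Loaded {table.num_rows} rows into {table_id}")


if __name__ == "__main__":
    load_csv_to_bigquery()
```

In practice, steps like this would be orchestrated with Astronomer-managed workflows and the resulting tables curated downstream with DBT/Dataform, as described above.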
Requirements
- Bachelor's degree in Computer Science, Information Technology, Information Systems, Data Analytics, or a related field (or equivalent combination of education and experience).
- 5-7 years of experience in Data Engineering or Software Engineering, with at least 2 years of hands-on experience building and deploying cloud-based data platforms (GCP preferred).
- Strong proficiency in SQL, Java, and Python, with practical experience in designing and deploying cloud-based data pipelines using GCP services like BigQuery, Dataflow, and Dataproc.
- Solid understanding of Service-Oriented Architecture (SOA) and microservices, and their application within a cloud data platform.
- Experience with relational databases (e.g., PostgreSQL, MySQL), NoSQL databases, and columnar databases (e.g., BigQuery).
- Knowledge of data governance frameworks, data encryption, and data masking techniques in cloud environments.
- Familiarity with CI/CD pipelines, Infrastructure as Code (IaC) tools like Terraform and Tekton, and other automation frameworks.
- Excellent analytical and problem-solving skills, with the ability to troubleshoot complex data platform and microservices issues.
- Experience in monitoring and optimizing cost and compute resources for processes in GCP technologies (e.g., BigQuery, Dataflow, Cloud Run, Dataproc).
- A passion for data, innovation, and continuous learning.