Tech Stack
BigQuery, Cloud, Google Cloud Platform, Java, Microservices, MySQL, NoSQL, Postgres, Python, SQL, Terraform
About the role
- Design, build, and maintain scalable data ingestion and curation pipelines from diverse sources using Python, SQL, and DBT/Dataform.
- Lead end-to-end integration to ensure smooth and reliable data flow from source to insight.
- Build and manage GCP data platforms using BigQuery, Dataflow, Pub/Sub, Cloud Functions, DataProc, and Cloud Run.
- Implement and manage data governance policies, access controls, encryption, and data masking in cloud environments.
- Orchestrate data workflows with Astronomer and Terraform, applying Infrastructure as Code (IaC) best practices (a minimal sketch follows this list).
- Monitor and optimize performance, scalability, cost, and compute resources for GCP-based processes.
- Collaborate with data architects, application architects, service owners, and cross-functional teams to define best practices and design patterns.
- Automate platform processes to enhance reliability, improve data quality, and reduce manual intervention.
- Communicate complex technical decisions to technical and non-technical stakeholders.
- Maintain documentation and promote knowledge sharing to ensure long-term system maintainability.
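To make the pipeline and orchestration bullets above concrete, here is a minimal sketch of an Astronomer-managed Airflow DAG that loads raw files from Cloud Storage into BigQuery. The bucket, dataset, table, and schedule are hypothetical placeholders for illustration, not this team's actual configuration.

```python
# Minimal sketch: an Airflow DAG (as run on Astronomer) that appends raw
# JSON files from GCS into a BigQuery table. All names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

with DAG(
    dag_id="raw_events_ingestion",           # hypothetical pipeline name
    schedule="@daily",                       # placeholder cadence
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    load_raw_events = GCSToBigQueryOperator(
        task_id="load_raw_events",
        bucket="example-landing-bucket",     # hypothetical GCS bucket
        source_objects=["events/*.json"],
        source_format="NEWLINE_DELIMITED_JSON",
        destination_project_dataset_table="example_project.raw.events",
        write_disposition="WRITE_APPEND",    # append each daily batch
        autodetect=True,                     # infer schema from the files
    )
```

In practice the DAG itself would live in version control and the surrounding infrastructure (datasets, buckets, service accounts) would be declared in Terraform, which is what the IaC bullet refers to.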
Requirements
- Bachelor's degree in Computer Science, Information Technology, Information Systems, Data Analytics, or a related field (or equivalent combination of education and experience).
- Master of Science degree preferred.
- 7+ years of experience in Data Engineering or Software Engineering, with at least 2 years of hands-on experience building and deploying cloud-based data platforms (GCP preferred).
- Strong proficiency in SQL, Java, and Python, with practical experience in designing and deploying cloud-based data pipelines using GCP services like BigQuery, Dataflow, and DataProc.
- Solid understanding of Service-Oriented Architecture (SOA) and microservices, and their application within a cloud data platform.
- Experience with relational databases (e.g., PostgreSQL, MySQL), NoSQL databases, and columnar databases (e.g., BigQuery).
- Knowledge of data governance frameworks, data encryption, and data masking techniques in cloud environments.
- Familiarity with CI/CD pipelines (e.g., Tekton), IaC tools such as Terraform, and other automation frameworks.
- Excellent analytical and problem-solving skills, with the ability to troubleshoot complex data platform and microservices issues.
- Experience monitoring and optimizing cost and compute resources for processes running on GCP technologies (e.g., BigQuery, Dataflow, Cloud Run, DataProc); see the sketch after this list.
- A passion for data, innovation, and continuous learning.
- Legally authorized to work in the United States; verification of employment eligibility will be required at the time of hire.
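As an illustration of the cost-monitoring requirement above, the following hedged sketch uses the google-cloud-bigquery client to estimate a query's scan size with a dry run and to cap billable bytes on the real run. The project and table names are hypothetical placeholders.

```python
# Sketch: estimating and capping BigQuery query cost with the official
# google-cloud-bigquery client. Project and table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project

query = """
    SELECT user_id, event_ts
    FROM `example-project.raw.events`
    WHERE event_ts >= '2024-01-01'
"""

# Dry run: BigQuery reports the bytes the query would scan without running it.
dry_cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
dry_job = client.query(query, job_config=dry_cfg)
print(f"Estimated scan: {dry_job.total_bytes_processed / 1e9:.2f} GB")

# Real run: fail fast if the query would bill more than ~10 GB.
run_cfg = bigquery.QueryJobConfig(maximum_bytes_billed=10 * 10**9)
rows = client.query(query, job_config=run_cfg).result()
```

A dry run costs nothing and returns total_bytes_processed, so a pipeline can refuse to execute queries above a spend threshold before any compute is billed.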