
Mid-Level Data Engineer, GCP Cloud
CESAR
full-time
Location Type: Office
Location: 🇧🇷 Brazil
Job Level
Mid-Level, Senior
Tech Stack
Airflow, BigQuery, Cloud, Docker, Google Cloud Platform, Java, PySpark, Python, Spark, SQL, Terraform
About the role
- Develop data analytics products using Airflow, Dataproc, PySpark, and BigQuery on Google Cloud Platform, applying solid data warehouse principles (a minimal pipeline sketch appears after this list).
- Build data pipelines to monitor data quality and analytics model performance.
- Maintain the data platform infrastructure using Terraform and develop, review, and deliver code through CI/CD.
- Collaborate with stakeholders in data analysis to streamline the processes of data acquisition, processing, and presentation.
- Implement an enterprise data governance model and actively promote data protection, sharing, reuse, quality, and standards.
- Improve and maintain the data platform’s DevOps capabilities.
- Continuously optimize and improve existing data solutions (pipelines, products, infrastructure) for best performance, high security, low vulnerability, low cost, and high reliability.
- Work in an agile product team with frequent code deliveries using Test-Driven Development (TDD), continuous integration, and continuous deployment (CI/CD).
- Resolve code quality issues flagged by tools like SonarQube, Checkmarx, Fossa, and Cycode throughout the development cycle.
- Perform necessary activities for data mapping, data lineage, and documentation of information flows.
- Provide visibility into data/vehicle/resource quality issues and work with business owners to resolve them.
- Demonstrate technical knowledge and communication skills with the ability to argue for and defend well-designed solutions.
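The first responsibility above centers on Airflow-orchestrated Dataproc/PySpark jobs feeding BigQuery. As a rough illustration of that shape of pipeline, here is a minimal Airflow DAG sketch; the project, cluster, bucket, and table names are invented placeholders, not details from this posting:

```python
# Hypothetical daily pipeline: run a PySpark transform on Dataproc,
# then load its Parquet output from GCS into BigQuery.
# All IDs and paths below are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

PROJECT_ID = "example-project"    # placeholder
REGION = "us-central1"            # placeholder
CLUSTER_NAME = "example-cluster"  # placeholder

PYSPARK_JOB = {
    "reference": {"project_id": PROJECT_ID},
    "placement": {"cluster_name": CLUSTER_NAME},
    "pyspark_job": {"main_python_file_uri": "gs://example-bucket/jobs/transform.py"},
}

with DAG(
    dag_id="daily_analytics_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    transform = DataprocSubmitJobOperator(
        task_id="run_pyspark_transform",
        project_id=PROJECT_ID,
        region=REGION,
        job=PYSPARK_JOB,
    )
    load = GCSToBigQueryOperator(
        task_id="load_to_bigquery",
        bucket="example-bucket",
        source_objects=["output/daily_metrics/*.parquet"],
        source_format="PARQUET",
        destination_project_dataset_table=f"{PROJECT_ID}.analytics.daily_metrics",
        write_disposition="WRITE_TRUNCATE",
    )
    transform >> load
```

In a real deployment, the data-quality checks and model-performance monitors mentioned above would typically be additional tasks in the same DAG, with the infrastructure provisioned via Terraform and shipped through CI/CD.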
Requirements
- Advanced English for daily communication
- Experience in at least three of the following languages: Java, Python, Spark, SQL.
- Experience in cloud data/software engineering, building scalable, reliable, and cost-effective data pipelines with Google BigQuery.
- Experience with workflow orchestration tools such as Airflow.
- Experience with REST APIs for compute, storage, operations, and security.
- Experience with DevOps tools such as Tekton, GitHub Actions, Git, GitHub, Terraform, Docker, and Astronomer.
- Experience with project management tools such as Atlassian JIRA.
- Experience working on an implementation team from concept to operations, providing deep technical expertise for a successful deployment.
- Experience implementing methods to automate all parts of the pipeline to minimize work in development and production.
- Analytical skills for data profiling and troubleshooting in data pipelines/products (a profiling sketch appears after this list).
- Ability to simplify and clearly communicate complex data/software ideas/problems and work independently with cross-functional teams and all management levels.
- Strong results orientation and ability to multitask and work independently.
- Experience with performance optimization in SQL, Python, PySpark, or Java.
- Proven ability to document complex systems.
- Demonstrated commitment to quality and project deadlines.
- Code contributions to open-source data/software engineering projects.
- Experience in cloud infrastructure architecture and handling application migrations/updates (Astronomer).
- Understanding of GCP’s underlying architecture and hands-on experience with products such as BigQuery, Google Cloud Storage, Cloud SQL, Memorystore, Dataflow, Dataproc, Artifact Registry, Cloud Build, Cloud Run, Vertex AI, Pub/Sub, and GCP APIs.
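The data-profiling expectation above is easy to make concrete. A minimal PySpark sketch follows; the input path and the specific checks are illustrative assumptions, not requirements from the posting:

```python
# Quick data-profiling pass: row count, per-column null ratios, numeric summary.
# The input path and column handling are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("profiling-example").getOrCreate()

df = spark.read.parquet("gs://example-bucket/output/daily_metrics/")  # placeholder

# Per-column null ratio: a fast first check when a pipeline starts misbehaving.
total = df.count()
df.select(
    [(F.count(F.when(F.col(c).isNull(), c)) / total).alias(c) for c in df.columns]
).show(truncate=False)

# Basic numeric summary (count/mean/stddev/min/max) to spot outliers or drift.
df.describe().show(truncate=False)
```

Checks like these are often the fastest way to localize a broken upstream feed before digging into job logs or query plans.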
Benefits
- Health and dental insurance;
- Meal/food allowance;
- Language allowance;
- Childcare assistance;
- Contact lens allowance;
- Life insurance;
- Discounts on CESAR School courses;
- Day off (in your birthday month);
- Wellhub;
- Moodar;
- Cíngulo.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, Java, SQL, PySpark, DataProc, BigQuery, Terraform, Test-Driven Development, data profiling, performance optimization
Soft skills
communication skills, analytical skills, results orientation, multitasking, independent work, cross-functional collaboration, problem-solving, commitment to quality, ability to simplify complex ideas, stakeholder collaboration