
Data Engineer
Robot.com
Full-time
Location Type: Hybrid
Location: San Francisco • California • United States
About the role
- Design, develop, and maintain scalable, reliable, and efficient ETL/ELT pipelines for batch and real-time data processing (e.g., MQTT/Kafka data ingestion).
- Manage and optimize our data warehousing solutions, primarily Google BigQuery, ensuring efficient data storage, querying, and cost-effectiveness.
- Implement and maintain data quality assertions across all data pipelines to ensure data integrity from source to consumption.
- Develop and integrate new data sources into our existing data ecosystem.
- Troubleshoot and resolve data pipeline issues, ensuring minimal disruption to data availability.
- Collaborate closely with company teams to understand their data needs and develop tailored data solutions.
- Design and implement data pipelines to support machine learning workflows.
- Contribute to the development of data-driven insights that improve robot autonomy and performance.
- Contribute to the design and evolution of our overall data architecture, ensuring scalability, performance, and maintainability.
- Implement and adhere to best practices for data modeling, schema design, and data governance.
- Work with cloud infrastructure (GCP preferred) to deploy and manage data services.
- Develop and maintain monitoring solutions for data pipeline health and performance.
- Ensure data consistency and accuracy in reporting tools.
- Mentor junior team members and contribute to a culture of continuous learning and knowledge sharing within the data team.
Requirements
- Strong proficiency in Python for data engineering and scripting.
- Extensive experience with SQL and relational databases (PostgreSQL preferred).
- Proven expertise with Google Cloud Platform (GCP) services, especially BigQuery, Cloud Storage, Cloud Functions.
- Experience designing, building, and maintaining robust ETL/ELT data pipelines.
- Familiarity with data orchestration tools (e.g., Apache Airflow).
- Experience with real-time data processing technologies (e.g., Kafka, MQTT).
- Understanding of data modeling techniques (e.g., dimensional modeling, Kimball).
- Familiarity with version control systems (Git/GitHub).
Benefits
- Health insurance
- Flexible work arrangements
- Professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, SQL, ETL, ELT, data modeling, dimensional modeling, data quality assertions, data governance, real-time data processing, data orchestration
Soft Skills
collaboration, mentoring, problem-solving, communication, continuous learning