Responsibilities
Architect, build, and maintain scalable, reliable data pipelines (batch & streaming) to ingest, transform, and deliver data for analytics, reporting, and ML use cases.
Architect and test software applications, and build automated tools.
Translate complex functional and technical requirements into architectural designs and high-performing software solutions.
Select appropriate software for data solutions and define hardware requirements to support performance and scalability.
Develop and implement standards and processes for data integration projects and initiatives.
Lead the design, development, and testing of software applications and supporting tooling.
Optimize SQL queries (joins, window functions, aggregations, partitioning, indexing) and data schema performance.
Design data models, schemas, and data warehouses/data lakes (dimensional modeling with star and snowflake schemas, normalization/denormalization).
Ensure data quality, correctness, and consistency across datasets (validation, anomaly detection, reconciliation).
Ensure database changes are reviewed and approved according to standards.
Monitor, troubleshoot, and tune performance of pipelines, databases, and workloads.
Drive adoption of engineering best practices: version control, CI/CD, testing (unit and integration for data pipelines), documentation, and code reviews.
Collaborate with software engineers to integrate data systems into production environments.
Provide technical assistance to junior team members and colleagues across the company.
Mentor and coach junior and mid-level engineers, promoting engineering discipline across the team.
Evaluate and propose new tools, frameworks, and technologies for the data platform.
Ensure data security, governance, access control, lineage, and compliance (e.g., GDPR, CCPA, internal standards).
Requirements
Bachelor’s degree in Computer Science or a related technical discipline (Master’s preferred)
5+ years of professional experience in data engineering, software engineering, or data science
Expert-level SQL, including query optimization, advanced joins, windowing, partitioning, and indexing
Proven expertise in Snowflake for data warehousing and advanced analytics
Strong background in data modeling, data engineering best practices, and distributed systems (Spark, Hadoop, Hive, Presto)
Hands-on experience designing and maintaining ETL/ELT pipelines, data integration (APIs, event streams, logs), and workflow orchestration (Airflow or Astronomer required)
Proficiency with modern data stack tools, including dbt for transformation and modeling
Experience with AWS cloud services for data engineering and infrastructure management