Senior Data Engineer

Affinity.co

Full-time

Location: 🇺🇸 United States (California, New York)

Salary

💰 $106,200 - $200,000 per year

Job Level

Senior

Tech Stack

Amazon Redshift, Apache, AWS, Azure, Cloud, ETL, Google Cloud Platform, Hadoop, Kafka, Python, Scala, Spark, SQL, Terraform

About the role

  • Affinity stitches together billions of data points to build a professional relationship intelligence graph.
  • This role sits on the AI Insights team, which extracts and retrieves information from billions of structured and unstructured data points.
  • Collaborate with machine learning engineers, software engineers, and product managers to shape Affinity's CRM platform.
  • Design scalable and reliable data pipelines that consume, integrate, and analyze large volumes of complex data from different sources.
  • Help define the data roadmap and use data to shape product development.
  • Build and maintain frameworks for measuring and monitoring data quality and integrity.
  • Establish and optimize CI/CD processes, test frameworks, and infrastructure-as-code tooling.
  • Build robust data solutions using Spark, Python, Databricks, Kafka, and the AWS ecosystem (S3, Redshift, EMR, Athena, Glue); a sketch of such a pipeline follows this list.
  • Identify skill and process gaps, and develop practices that drive team effectiveness.
  • Articulate trade-offs of different approaches to building ETL pipelines and storage solutions.
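
To make the pipeline responsibility concrete, here is a minimal sketch of the kind of system the role describes: consuming events from Kafka with Spark Structured Streaming and landing them in S3. The posting does not prescribe an implementation; the broker address, topic name, bucket, and event schema below are all hypothetical.

```python
# Illustrative sketch only -- broker, topic, bucket, and schema are hypothetical.
# Requires the spark-sql-kafka connector package on the Spark classpath.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("relationship-events").getOrCreate()

# Hypothetical schema for a relationship-intelligence interaction event.
schema = StructType([
    StructField("person_id", StringType()),
    StructField("event_type", StringType()),
    StructField("occurred_at", TimestampType()),
])

# Consume raw events from a Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "interaction-events")
    .load()
)

# Parse the JSON payload into typed columns; rows that fail to parse
# come back as null structs and are filtered out with the null ids.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .filter(col("person_id").isNotNull())
)

# Land the cleaned stream in S3 as Parquet for downstream Redshift/Athena jobs.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://example-bucket/events/")
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/events/")
    .trigger(processingTime="1 minute")
    .start()
)
```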

Requirements

  • 5+ years of experience as a Data Engineer or Data Platform Engineer, working on complex, sometimes ambiguous engineering projects across team boundaries.
  • Proficiency in data modeling, data warehousing, and ETL pipeline development.
  • Proven hands-on experience building scalable data platforms and reliable data pipelines using Spark and Databricks, and familiarity with Hadoop, AWS SQS, AWS Kinesis, Kafka, or similar technologies.
  • Comfortable working with large datasets and with high-scale ingestion, transformation, and distributed-processing tools such as Apache Spark (Scala or Python).
  • Strong proficiency in SQL.
  • Familiar with industry-standard databases and analytics technologies, including data warehouses and data lakes.
  • Experience with cloud platforms and related technologies such as AWS, Databricks, GCP, or Azure.
  • Familiar with CI/CD processes and test frameworks.
  • Comfortable partnering with product and machine learning teams on large, strategic data projects.
  • Nice to have: Hands-on experience with both relational and non-relational databases and data stores, including vector databases (e.g., Weaviate, Milvus), graph databases, and text search engines (e.g., OpenSearch, Vespa), with a focus on indexing and query optimization.
  • Nice to have: Experience with Infrastructure as Code (IaC) tools, such as Terraform.
  • Nice to have: Experience implementing data consistency measures using validation and monitoring tools; a sketch of one such check follows this list.
  • Please include your favorite programming language at the very end of your resume, outside of your skills section, with the word '#filter' next to it.
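
On the data consistency point above, this is a minimal sketch of what a validation gate on a batch dataset might look like, again in PySpark for consistency with the stack the posting names. The table location, key column, and thresholds are hypothetical.

```python
# Illustrative sketch only -- path, column, and thresholds are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, count, sum as spark_sum, when

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

events = spark.read.parquet("s3a://example-bucket/events/")

# One pass over the data: total row count and the null rate of a key column.
stats = events.agg(
    count("*").alias("total_rows"),
    spark_sum(when(col("person_id").isNull(), 1).otherwise(0)).alias("null_ids"),
).first()

null_rate = stats["null_ids"] / max(stats["total_rows"], 1)

# Fail the run (or page an on-call) when a quality gate is breached.
assert stats["total_rows"] > 0, "empty dataset: upstream ingestion may have stalled"
assert null_rate < 0.01, f"person_id null rate {null_rate:.2%} exceeds the 1% gate"
```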