
Lead Data Architect
Henry Schein
Full-time
Location Type: Remote
Location: United States
Salary
💰 $181,026 - $259,094 per year
About the role
- Define and implement a scalable, enterprise-wide data architecture aligned with business and technology goals.
- Develop a data strategy roadmap, ensuring long-term sustainability, scalability, and efficiency.
- Partner with executive leadership, product teams, and engineering to ensure data initiatives drive business value.
- Establish enterprise data governance, security, and compliance frameworks leveraging tools like Collibra or Alation.
- Oversee the design and evolution of data lakes, data warehouses, and cloud-based analytics platforms using Databricks, Snowflake, BigQuery, or Redshift.
- Lead the adoption of modern data architecture patterns, including event-driven architectures, real-time data streaming (Kafka, Pulsar), and AI-driven analytics.
- Provide guidance on database optimization, indexing, partitioning, and storage strategies for tools like PostgreSQL, MySQL, and NoSQL solutions like MongoDB or Cassandra.
- Evaluate emerging technologies, making recommendations for tools and platforms that enhance data capabilities.
- Direct ETL/ELT strategies, ensuring seamless data flow across systems with Python, Apache Airflow, dbt, or Informatica.
- Architect cloud-based solutions (AWS, Azure, or GCP) using services such as AWS Glue, Azure Synapse, and Google Cloud Dataflow to support analytics, AI, and operational use cases.
- Ensure API-first design for data integration using GraphQL, RESTful APIs, or event-driven architectures (Kafka, AWS Kinesis, Pub/Sub).
- Define and oversee data quality, lineage, and cataloging efforts using Great Expectations, Monte Carlo, or DataHub.
- Develop policies for data privacy, access control, and encryption, ensuring compliance with GDPR, CCPA, HIPAA, or other relevant regulations.
- Implement enterprise-wide metadata management and data lineage tracking using Collibra, Alation, or Data Catalog solutions.
- Drive best practices for data security and compliance audits, leveraging IAM tools and cloud security solutions.
- Lead a team of data architects, engineers, and analysts, mentoring them on best practices.
- Act as a liaison between business and technical teams, translating business needs into scalable data solutions.
- Champion a culture of innovation, ensuring the data team is adopting cutting-edge methodologies.
Requirements
- 10+ years of experience in data architecture, data engineering, or related fields.
- Bachelor’s degree (Master’s preferred) in Computer Science, Applied Mathematics, Statistics, Machine Learning, or a closely related field (or foreign equivalent).
- Proven track record in designing large-scale, enterprise data architectures.
- Expertise in SQL, NoSQL, and distributed database technologies such as Snowflake, Databricks, BigQuery, Redshift, PostgreSQL, MongoDB, and Cassandra.
- Strong experience with cloud-based data platforms (AWS, Azure, GCP) and services like AWS Glue, Azure Data Factory, and Google Dataflow.
- Deep understanding of data modeling, ETL/ELT processes, and data pipeline optimization using dbt, Apache Airflow, Informatica, or Talend.
- Experience with real-time streaming technologies (Kafka, Spark Streaming, Apache Flink, AWS Kinesis).
- Strong knowledge of data security, governance, and compliance frameworks.
- Excellent verbal and written communication skills, with the ability to resolve disputes effectively and efficiently.
- Outstanding presentation and public speaking skills.
- Mastery of independent decision-making, analysis, and problem-solving skills.
- Ability to quickly understand and assess complex projects, systems, and ecosystems, and to identify relevant relationships and connections between them.
- Mastery of planning and organizational skills and techniques.
- Ability to communicate effectively with senior management and key stakeholders.
- Ability to influence, build relationships, understand organizational complexities, manage conflict and navigate politics
- Familiarity with the healthcare data domain and previous experience working with healthcare datasets is a plus.
- Strong Python programming skills, with expertise in data manipulation and pipeline development using Pandas, PySpark, NumPy, and SQLAlchemy.
- Experience with AI/ML-driven analytics architectures and MLOps frameworks like MLflow or SageMaker.
- Hands-on experience with Infrastructure as Code (Terraform, CloudFormation).
- Familiarity with Graph databases and knowledge graphs (Neo4j, Amazon Neptune).
- Certifications in cloud data services (AWS Certified Data Analytics, Google Professional Data Engineer, Databricks Certified Data Engineer).
Benefits
- Medical, Dental and Vision Coverage
- 401K Plan with Company Match
- PTO
- Paid Parental Leave
- Income Protection
- Work Life Assistance Program
- Flexible Spending Accounts
- Educational Benefits
- Worldwide Scholarship Program
- Volunteer Opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
data architecture, data engineering, SQL, NoSQL, ETL, ELT, data modeling, data pipeline optimization, real-time data streaming, cloud-based data platforms
Soft Skills
communication skills, presentation skills, problem-solving skills, organizational skills, relationship building, influence, conflict management, decision making, mentoring, innovation
Certifications
AWS Certified Data Analytics, Google Professional Data Engineer, Databricks Certified Data Engineer