Design, develop, and maintain scalable, secure, and efficient data pipelines and infrastructure that support analytics, reporting, and advanced data science initiatives.
Responsibilities
Design, build, and optimize ETL/ELT processes for structured and unstructured data.
Develop and maintain data pipelines that integrate multiple internal and external data sources.
Implement data quality, validation, and monitoring processes to ensure trust in enterprise data assets.
Support real-time (streaming) and batch processing pipelines.
Collaborate with data architects, analysts, and scientists to deliver scalable data solutions.
Implement best practices in data security, governance, and compliance.
Contribute to continuous improvement efforts by evaluating and recommending new data engineering tools, techniques, and platforms.
Enable data-driven decision-making by ensuring data is reliable, high-quality, and accessible to clients.
Requirements
Proficiency in programming languages such as Python, Java, or Scala.
Advanced SQL skills for data transformation and performance optimization.
Hands-on experience with data pipeline tools (Airflow, dbt, Kafka, or equivalent).
Strong knowledge of big data processing frameworks (Apache Spark, Databricks, Flink, etc.).
Experience with cloud computing platforms (AWS, Azure, Google Cloud).
Familiarity with modern data architectures (data lakes, lakehouses, warehouses).
Exposure to containerization and orchestration tools (Docker, Kubernetes).
Understanding of data modeling, metadata management, and data lineage.
Experience implementing CI/CD pipelines for data workflows.
Familiarity with modern storage and query engines (Snowflake, Redshift, BigQuery, Delta Lake).
Strong analytical and problem-solving abilities; ability to work with large, complex datasets.
Excellent verbal and written communication skills; ability to explain technical concepts to non-technical stakeholders.
Collaborative mindset with the ability to work in cross-functional teams.
Preferred: Experience with infrastructure-as-code (Terraform, CloudFormation).
Preferred: Knowledge of data governance frameworks and compliance requirements (GDPR, HIPAA, FedRAMP).
Preferred: Familiarity with machine learning data preparation pipelines (feature engineering, MLflow).
Preferred: Background in federal or highly regulated environments is a plus.
Benefits
401(k) Plan (35% employer match per dollar up to 10% employee contribution)
Medical Coverage (3 platforms: UnitedHealthcare; Reference Based Pricing, which includes comprehensive member advocacy; and Kaiser)
HSA with Employer Contribution
In-vitro Fertilization (treatment coverage)
Dental
Vision (2 plans: 12-month and 24-month frames allowance)
FSA Plans (Healthcare, Dependent care and Limited Purpose)
Pre-tax Commuter Plans
Employer-paid Life Insurance
Employer-paid Short-Term Disability
Long-Term Disability (2 plans: Employer-paid and Self-paid with non-taxable claim payments)
Paid Parental Leave (4 weeks at 100%)
Employee Assistance Plan
Voluntary Life Insurance
Legal/ID Theft Plans
TeleHealth Options
Wellness via Omada Health (healthy living solution)
Travel Assistance
Business Travel Accident Coverage
Employer-paid Pet Telehealth
Accident Insurance
Critical Illness Insurance
Hospital Indemnity Insurance
Volunteer Time Off
On-Demand Pay (DailyPay)