Salary
💰 $190,000 - $284,000 per year
Tech Stack
Airflow, Apache, AWS, EC2, Java, Scala, SDLC, Spark, SQL
About the role
- Lead initiatives to build, expand, and improve real-world entity identification datasets
- Coordinate with downstream stakeholders with dependencies on identification datasets
- Design and build new pipelines to increase identification coverage and detect errors
- Collaborate with a skilled data science team to enable new ML/AI model development
- Provide insights into optimizing existing pipelines for performance and cost-efficiency
- Create and document descriptive plans for new feature implementation
Requirements
- Bachelor’s degree in computer science, engineering, mathematics, or related field
- 8+ years of relevant experience
- Progressive experience in the following areas:
  - Object-oriented / strongly typed programming (Scala, Java, etc.)
  - Productionizing and deploying Spark pipelines
  - Complex SQL
  - Apache Airflow or similar orchestration tools
- Strong SDLC principles (CI/CD, unit testing, Git process, etc.)
- Solid understanding of AWS services (IAM, EC2, S3)
- An interest in data science
Benefits
- Up to 100% paid premiums for medical and vision coverage
- Mental wellness resources, including access to Modern Health
- Flexible PTO policy
- 15 paid holidays in 2025
- No Internal Meetings Fridays
- Competitive 401(k) plan
- Short-term and long-term disability coverage
- Life insurance
- Other valuable benefits to ensure your financial peace of mind
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Scala, Java, Spark, SQL, Apache Airflow, CI/CD, unit testing, Git, AWS, data science
Soft skills
leadership, collaboration, communication, organizational, problem-solving