Salary
💰 $115,000 - $125,000 per year
Tech Stack
AWS, Azure, Cloud, Google Cloud Platform, Greenplum, NumPy, pandas, Python, scikit-learn, SQL
About the role
- Write, maintain, and optimize production-level Python and SQL code for data pipelines, MLOps workflows, and related systems.
- Analyze structured and unstructured datasets to identify trends, patterns, and opportunities for improvement.
- Design, implement, and maintain automated data ingestion, transformation, and validation pipelines.
- Contribute to the design, testing, and deployment of predictive and prescriptive models.
- Support deployment of pipelines and ML models, including standing up and managing relevant cloud infrastructure.
- Collaborate with engineering, product, and business teams to translate requirements into scalable, code-driven solutions.
- Apply rigorous statistical and software engineering best practices to ensure accuracy, reproducibility, and reliability.
- Continuously evaluate and integrate tools, frameworks, and methods that improve efficiency, scalability, and maintainability.
- Communicate results and recommendations clearly to both technical and non-technical audiences.
- Adhere to data governance, security, and privacy standards.
Requirements
- Master’s degree in Computer Science, Data Science, Statistics, Mathematics, or a related field (a Bachelor’s degree with significant relevant experience will be considered).
- Proven track record of writing production-ready Python and SQL code.
- Familiarity with common data and ML libraries (e.g., dbt, pandas, NumPy, scikit-learn).
- Strong SQL skills and experience with large, complex datasets.
- Experience in end-to-end data project delivery, from code development to deployment.
- Familiarity with version control (Git) and collaborative coding workflows.
- Strong understanding of software engineering principles in a data science context.
- Experience with statistical modeling, machine learning, and A/B testing.
- Ability to communicate technical concepts clearly and effectively.
- Commitment to producing high-quality, maintainable, and scalable code.