Tech Stack
Azure, Cloud, PySpark, Python, Scikit-learn, SQL, TensorFlow
About the role
- Design, build, and deploy scalable data pipelines using Databricks and PySpark.
- Design and implement machine learning models with Python libraries (e.g., Scikit-learn, TensorFlow).
- Partner with cross-functional teams to define problems and deliver data-driven solutions.
- Perform data cleansing and feature engineering to ensure accuracy and consistency.
- Document technical processes for pipelines, models, and workflows.
- Collaborate with stakeholders to prioritize data science initiatives.
- Create data visualizations and dashboards to communicate insights effectively.
- Stay current with emerging trends in data science, AI, and machine learning.
- Work alongside data engineers to uphold data quality standards.
- Participate in code reviews and help establish best practices for data science.
Requirements
- 5+ years in data science or a related field.
- 3+ years working with Databricks and PySpark.
- 2+ years with Python and ML libraries (Scikit-learn, TensorFlow).
- 2+ years with SQL and data warehousing.
- Hands-on experience in machine learning model development and deployment.
- Solid knowledge of data cleansing and feature engineering.
- Excellent collaboration and communication abilities.
- Experience with cloud-based platforms such as Azure (preferred).
- Familiarity with Agile methodologies (preferred).
- Professional certifications such as Certified Data Scientist or Certified Analytics Professional (preferred).
- Healthcare domain knowledge (preferred).
ATS Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data pipelines, Databricks, PySpark, machine learning models, Python, Scikit-learn, TensorFlow, SQL, data cleansing, feature engineering
Soft skills
collaboration, communication
Certifications
Certified Data Scientist, Certified Analytics Professional