Tech Stack
AWS, Hadoop, Python, Spark, SQL
About the role
- Build data infrastructure and analytics software tools for Data Science
- Guide Data Scientists on best practices in software engineering
- Modernize the existing SAS and Hive SQL codebase into higher-performance structures that leverage parallel computing capabilities
- Instrument data feeds for observability, profiling, and monitoring
- Develop shared resources to support Data Science work such as automated profiling and analysis tools
- Write detailed, complete, and enduring documentation to enable long-term support
Requirements
- Bachelor’s degree
- Spark experience
- Hive SQL knowledge
- Software/data engineering in Python
- Machine Learning toolkits in Python/Spark
- Building reusable frameworks for data ingestion and storage in Hadoop or AWS/S3
- Relational databases knowledge including optimizing/tuning queries, indexing, partitioning, etc.
- SAS knowledge and ability to convert from SAS to Python
- Comfort with a hybrid work model that supports a collaborative environment
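The SAS-to-Python conversion called out in the requirements can be illustrated with a small, hypothetical example: a SAS DATA step that filters rows and derives a column, rewritten with pandas. All dataset and column names here are made up for illustration.

```python
import pandas as pd

# Hypothetical SAS DATA step being converted:
#   data high_value;
#       set transactions;
#       where amount > 100;
#       fee = amount * 0.02;
#   run;
transactions = pd.DataFrame({
    "account": ["A1", "A2", "A3"],
    "amount": [50.0, 150.0, 300.0],
})

# Equivalent pandas: filter with a boolean mask, then derive the new column.
high_value = transactions[transactions["amount"] > 100].copy()
high_value["fee"] = high_value["amount"] * 0.02
print(high_value)
```

The `.copy()` avoids pandas' chained-assignment warning when adding the derived column to the filtered frame.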
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, Spark, Hive SQL, Machine Learning, Hadoop, AWS, S3, SAS, data engineering, relational databases
Soft skills
guidance, best practices, documentation
Education
Bachelor’s degree