Salary
💰 $107,120 - $160,680 per year
Tech Stack
Apache, AWS, Cloud, Docker, Hadoop, HDFS, Java, Kafka, Kubernetes, MapReduce, PySpark, Python, Scala, Spark
About the role
- Design and develop Big Data solutions and pipelines in Hadoop or Snowflake
- Partner with domain experts, product managers, analysts, and data scientists to develop Big Data pipelines
- Deliver a data-as-a-service framework and migrate legacy workloads to the cloud platform
- Discuss and capture requirements with stakeholders and document data flows
- Recommend and implement access-rights solutions and best practices
- Build client pipelines using heterogeneous sources and provide engineering services for data science applications
- Research and integrate open-source technologies into design and implementation
- Mentor team members on Big Data and Cloud tech stacks
- Define maintainability, testability, performance, security, quality, and usability needs for the data platform
- Drive implementation of consistent patterns, reusable components, and coding standards
- Convert SAS-based pipelines to PySpark/Scala for Hadoop and non-Hadoop ecosystems
- Tune Big Data applications for optimal performance
- Evaluate IT developments and recommend system enhancements
- Supervise day-to-day staff management including resource management, work allocation, mentoring/coaching
Requirements
- 5+ years of experience with Hadoop/Big Data technologies
- 3+ years of experience in PySpark
- 2+ years of experience with Snowflake preferred
- 2+ years of experience developing data solutions on Google Cloud or AWS
- Certifications preferred
- Hands-on experience with Python/PySpark/Scala and basic machine learning libraries
- Experience with containerization and related technologies (e.g. Docker, Kubernetes) is a plus
- 1 year of Hadoop administration experience preferred
- 1+ year of SAS experience preferred
- Comprehensive knowledge of the principles of software engineering and data analytics
- Advanced knowledge of the Hadoop ecosystem and Big Data technologies
- Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr)
- Knowledge of Agile (Scrum) development methodology is a plus
- Strong development/automation skills
- Proficient in programming in Java or Python with prior Apache Beam/Spark experience a plus
- System-level understanding of data structures, algorithms, and distributed storage and compute
- Bachelor’s degree/University degree or equivalent experience
Benefits
- Discretionary and formulaic incentive and retention awards
- Medical, dental & vision coverage
- 401(k)
- Life, accident, and disability insurance
- Wellness programs
- Paid time off packages (planned time off/vacation, unplanned time off/sick leave, paid holidays)
- Other competitive employee benefits
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Hadoop, Snowflake, PySpark, Scala, Python, SAS, Docker, Kubernetes, MapReduce, Spark
Soft skills
mentoring, stakeholder communication, resource management, work allocation, coaching
Certifications
Big Data certifications