Salary
💰 $258,000 per year
Tech Stack
Airflow, AWS, Cloud, Docker, Java, Jenkins, Kafka, Microservices, NoSQL, Pulsar, Python, Scala, Spark
About the role
- Define and drive software architecture development across engineering teams, redesigning data-pipeline software that processes over 200 million records daily to increase efficiency and responsiveness to user needs, ensuring scalable, high-performance, and maintainable software products.
- Drive technical direction for data engineering on product teams, ensuring alignment with business goals, and fostering best practices in software development.
- Develop and maintain the data engineering technical strategy and roadmap for key Demandbase software products, aligning technical strategy with the goal of improving data quality by 20% and reducing process latency by 20%.
- Lead the integration of data pipelines and workflows, delivering business outcomes autonomously.
- Work with engineering managers, peer engineers, and product managers to ensure seamless execution of technical initiatives.
- Introduce and advocate for best software engineering practices, including software design principles, code quality, security, and cloud scalability.
- Act as a mentor and role model, helping to grow and develop engineering talent within the organization.
- Work closely with product managers to break down product initiatives into deliverable iterations while balancing technical and business needs.
- Contribute to code reviews, proofs of concept, and complex system designs when needed.
- Lead design and implementation of robust data solutions and microservices to meet real-time and batch requirements, including development and optimization of large-scale distributed microservices and APIs using Scala/Java/Python.
- Lead the development and scaling of data pipelines using Spark, incorporating NoSQL and relational databases, and handle data aggregation, warehousing, and processing of at least 200 million records daily.
- Consume and produce data using event-driven systems like Pulsar and Kafka (a minimal sketch of this kind of streaming pipeline follows this list).
- Lead automation and streamlining of deployments, maintaining and creating GitLab pipelines to automate build, test, and deployment on AWS Cloud using GitLab CI/CD.
- Lead orchestration of data pipelines, including scheduling, monitoring, and managing high-volume data workflows using Astronomer deployed via CI/CD.
- Use Docker for containerization.
- Create, maintain, and review data models to suit business requirements while ensuring efficient solutions.
- 100% remote. May be located anywhere in the continental United States. No travel required.
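
To illustrate the kind of streaming pipeline work described above, here is a minimal PySpark sketch that consumes records from Kafka and aggregates them in one-minute windows. The broker address, topic name, and console sink are illustrative placeholders, not details taken from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window

# Requires the spark-sql-kafka connector package on the Spark classpath.
spark = SparkSession.builder.appName("record-ingest").getOrCreate()

# Read raw events from a Kafka topic; broker and topic names are placeholders.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "raw-records")
    .load()
    .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
)

# Count records per one-minute window; the watermark bounds state kept for late data.
counts = (
    events
    .withWatermark("timestamp", "5 minutes")
    .groupBy(window(col("timestamp"), "1 minute"))
    .count()
)

# Write the aggregates out; a production pipeline would target a warehouse or NoSQL store.
query = counts.writeStream.outputMode("update").format("console").start()
query.awaitTermination()
```
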
Requirements
- Bachelor’s degree (or foreign equivalent) in Computer Science or Computer Engineering.
- 60 months (5 years) of progressive, post-bachelor’s experience as a software engineer or in any occupation in which the required experience was gained.
- 5 years of experience developing and optimizing large-scale, distributed microservices and APIs for real-time and batch requirements using Scala, Java, and/or Python.
- 5 years of experience using Spark (or similar tools) to develop and scale data pipelines that incorporate NoSQL and relational databases.
- 5 years of experience setting up automated build, test, and deployment pipelines with CI/CD using GitHub, GitLab, and/or Jenkins.
- 5 years of experience working with Big Data/Cloud technologies.
- 5 years of experience performing data modeling.
- 5 years of experience working on the orchestration of data pipelines, including scheduling, monitoring, and managing high-volume data workflows using tools such as Astronomer or Airflow (a minimal DAG sketch follows this list).
- Experience using Docker for containerization.
- Experience with data aggregation, warehousing, and processing of at least 100 million records daily.
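
As a companion to the orchestration requirement above, here is a minimal Airflow DAG sketch (assuming Airflow 2.4 or later); the DAG id, schedule, and task commands are hypothetical placeholders rather than details from this posting.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",      # placeholder team name
    "retries": 2,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="daily_record_pipeline",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args=default_args,
) as dag:
    # Submit the Spark batch job; the script path is a placeholder.
    ingest = BashOperator(
        task_id="ingest_records",
        bash_command="spark-submit /opt/jobs/ingest_records.py",
    )

    # Load the processed output into the warehouse; the command is a placeholder.
    load = BashOperator(
        task_id="load_to_warehouse",
        bash_command="python /opt/jobs/load_to_warehouse.py",
    )

    ingest >> load
```
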