Tech Stack
Cloud, Django, ETL, Flask, Java, MySQL, Postgres, Python, Scala
About the role
- Implement data storage solutions and architectures for AI/ML workflows.
- Develop integration patterns between various data platforms and systems.
- Build scalable data processing pipelines using distributed computing frameworks.
- Create data quality and validation frameworks for ML pipelines.
- Design and implement ETL processes for model training data preparation.
- Support performance monitoring and optimization of data infrastructure.
- Implement security configurations and access controls for data platforms.
- Develop data lineage tracking and metadata management solutions.
- Build event streaming integrations for real-time data processing.
- Create automated testing frameworks for data pipeline validation.
- Document technical implementations and integration patterns.
- Collaborate with data scientists on data access and processing requirements.
Requirements
- Bachelor’s degree in computer science or a related field.
- 4+ years of software engineering experience with a data platform focus.
- Strong knowledge of database systems and data storage architectures.
- Experience with distributed data processing frameworks.
- Proficiency in data pipeline design and ETL development.
- Understanding of event streaming systems and real-time processing.
- Experience with bulk data manipulation and optimization techniques.
- Strong programming skills in Python, Java, or Scala.
- Knowledge of data security and access control patterns.
- Experience with cloud data platforms and services.
- Understanding of data quality and validation methodologies.
Benefits
- Competitive salary
- Flexible working hours
- Professional development budget
- Home office setup allowance
- Global team events
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
data storage solutions, AI/ML workflows, integration patterns, data processing pipelines, distributed computing frameworks, ETL processes, data quality frameworks, automated testing frameworks, bulk data manipulation, programming in Python
Soft skills
collaboration, documentation, performance monitoring, optimization
Certifications
Bachelor’s degree in computer science