Tech Stack
Airflow, Amazon Redshift, Apache, AWS, Azure, Cloud, Google Cloud Platform, Kafka, Prometheus, Python, Spark, SQL
About the role
- Architect and lead the development of scalable, secure, and high-performance data platforms across cloud environments (Azure, AWS, GCP).
- Design and optimize data pipelines for batch and real-time processing, including ingestion, transformation, and delivery (see the PySpark sketch after this list).
- Implement and maintain robust data warehousing solutions using Snowflake, Amazon Redshift, and Azure Synapse Analytics.
- Drive initiatives around data segregation, de-duplication, cleanup, persistence, and lifecycle management.
- Ensure data is organized, exposed, and secured effectively for downstream analytics, reporting, and ML workflows.
- Collaborate with cross-functional teams to integrate OLTP, OLAP, and time-series database systems into unified data platforms.
- Champion best practices in data governance, security, and compliance (e.g., GDPR, HIPAA, SOC 2).
- Lead technical evaluations of emerging tools and technologies in the data ecosystem.
- Mentor junior engineers and contribute to the technical growth of the team.
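To make the batch/real-time pipeline responsibility concrete, here is a minimal PySpark Structured Streaming sketch covering ingestion, transformation, and delivery. The broker address, topic name, and output paths are placeholder assumptions, not details from this posting.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-pipeline").getOrCreate()

# Ingestion: subscribe to a (hypothetical) Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
)

# Transformation: decode the payload and derive a partition column.
events = (
    raw.selectExpr("CAST(value AS STRING) AS payload", "timestamp")
    .withColumn("event_date", F.to_date("timestamp"))
)

# Delivery: write date-partitioned Parquet with checkpointed progress.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://bucket/events/")            # placeholder path
    .option("checkpointLocation", "s3a://bucket/_chk/")
    .partitionBy("event_date")
    .start()
)
query.awaitTermination()
```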
Requirements
- Strong knowledge of Big Data, OLTP, OLAP, and time-series database architectures and use cases.
- Proficiency in building and managing data pipelines using tools like Apache Spark, Kafka, Flink, Airflow, and dbt (an illustrative Airflow DAG follows this list).
- Advanced SQL and Python skills, with experience in distributed computing and performance optimization.
- Experience implementing data security protocols, encryption standards, and access control mechanisms.
- Strong understanding of data modelling, metadata management, and data cataloguing tools.
- Experience with time-series data platforms (e.g., InfluxDB, TimescaleDB, Prometheus); see the Prometheus sketch below.
- Background in building robust data platforms for AI/ML applications, including LLM-based data retrieval and retrieval-augmented generation (RAG).
- Familiarity with MLOps platforms and workflows (e.g., MLflow, Kubeflow, SageMaker, Vertex AI).
- Proficiency in T-SQL, SQL query optimization, Python, and distributed computing frameworks.
- Excellent communication and collaboration skills, with the ability to influence technical direction across teams.
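As a companion to the orchestration requirement above, here is a minimal Airflow DAG sketch. The DAG id, task bodies, and schedule are illustrative assumptions only, not this employer's actual pipeline.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**_):
    """Pull raw records from a source system (stubbed for illustration)."""


def transform(**_):
    """Clean and de-duplicate the extracted records (stubbed)."""


def load(**_):
    """Load the transformed records into the warehouse (stubbed)."""


with DAG(
    dag_id="daily_events_load",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Ingestion -> transformation -> delivery, mirroring the pipeline bullet.
    extract_task >> transform_task >> load_task
```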
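For the time-series requirement, a small sketch of exposing pipeline metrics with the prometheus_client library, which Prometheus then scrapes as time series. The metric names and port are made-up examples.

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

# Hypothetical pipeline metrics.
ROWS = Counter("pipeline_rows_total", "Rows processed by the pipeline")
BATCH_SECONDS = Histogram("pipeline_batch_seconds", "Per-batch processing latency")

if __name__ == "__main__":
    start_http_server(8000)  # exposes /metrics on an assumed port
    while True:
        with BATCH_SECONDS.time():  # observe batch duration
            time.sleep(random.random())  # stand-in for real batch work
        ROWS.inc(100)  # count rows handled this batch
```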
Benefits
- Competitive base pay
- Medical benefits
- Discretionary incentive plan based on individual and company performance
- Focus on development: access to a learning & development platform and the opportunity to own your career pathway
- Flexible work policy
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Big Data, data pipelines, SQL, Python, data security protocols, Data Modelling, Metadata Management, data platforms, distributed computing, performance optimization
Soft skills
communication, collaboration, mentoring, influencing