
Data Engineer
FCamara Consulting & Training
Full-time
Location Type: Hybrid
Location: São Paulo • Brazil
About the role
- Build batch and streaming ingestion pipelines
- Design and structure data lakes and data warehouses
- Create ML-optimized datasets
- Implement embedding pipelines
- Build vector indexing for RAG
- Ensure data quality, governance, and security
- Optimize storage and processing costs
- Work with AI Engineers to structure feature stores
Requirements
- Recommended technology stack:
  - Languages: Python, advanced SQL, Scala (optional)
  - Orchestration & transformation: Apache Airflow, dbt, Prefect
  - Data processing: Spark, Pandas, PySpark
  - Storage: data lakes (S3, GCS, Azure Blob), data warehouses (BigQuery, Snowflake, Redshift), NoSQL databases, vector databases (Pinecone, Weaviate, FAISS)
  - Streaming: Kafka, Pub/Sub
  - Infrastructure: Docker, Kubernetes, cloud (AWS, GCP, or Azure)
- The following hard skills are desirable:
  - Data modeling
  - ETL / ELT
  - Distributed processing
  - Advanced SQL
  - Scalable data architecture
  - Experience with unstructured data (text, logs, PDFs)
  - DataOps concepts
  - Data versioning and data quality
- Required soft skills:
  - Systems thinking
  - Strong organization
  - Attention to detail
  - Scalability mindset
  - Collaboration with ML teams
  - Proactivity in preventing bottlenecks
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, advanced SQL, Scala, Apache Airflow, dbt, Prefect, Spark, Pandas, PySpark, ETL
Soft Skills
systems thinking, strong organization, attention to detail, scalability mindset, collaboration, proactivity