Tech Stack
Airflow, AWS, Azure, Cloud, Docker, ETL, Flask, Google Cloud Platform, Kafka, Kubernetes, Microservices, Python, Spark
About the role
- Design, build, and maintain scalable data pipelines and systems that support agentic applications.
- Develop backend services and APIs that handle data ingestion, transformation, and access (a minimal sketch follows this list).
- Work closely with the Data Engineering lead to define architecture and technical direction.
- Integrate structured and unstructured data sources into agent workflows.
- Ensure high performance and reliability of data systems in production environments.
- Collaborate with AI researchers and product teams to deliver intelligent, data-driven features.
- Contribute to best practices in data governance, quality, and security.
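
To illustrate the backend side of the role, here is a minimal sketch of the kind of data-ingestion endpoint described above, using FastAPI (one of the frameworks named under Requirements). The `/ingest` route and `Event` model are hypothetical placeholders for illustration, not part of any actual codebase.

```python
# Illustrative only: a minimal ingestion endpoint; route and model are hypothetical.
from datetime import datetime, timezone

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()


class Event(BaseModel):
    source: str
    payload: dict


@app.post("/ingest")
def ingest(event: Event) -> dict:
    # A real pipeline would validate, enrich, and hand the record off to a
    # queue or warehouse; this sketch just returns an acknowledgement.
    return {
        "status": "accepted",
        "source": event.source,
        "received_at": datetime.now(timezone.utc).isoformat(),
    }
```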
Requirements
- 4+ years of experience in software engineering or data engineering roles.
- Proficiency in Python (or another high-level language) and experience with backend frameworks (e.g., FastAPI, Flask).
- Strong experience with data engineering tools and systems such as Airflow, Spark, dbt, or similar (see the DAG sketch after this list).
- Hands-on experience with cloud platforms (GCP, AWS, or Azure), especially in data infrastructure.
- Solid understanding of ETL/ELT pipelines, APIs, and microservices.
- Familiarity with containerization and orchestration (Docker, Kubernetes).
- Excellent problem-solving and debugging skills.
- Bachelor’s degree in Computer Science, Data Engineering, or a related technical field.
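
To ground the pipeline tooling listed above, here is a hedged sketch of a minimal Airflow DAG expressing the extract/transform/load shape via the TaskFlow API. The DAG id, daily schedule, and task bodies are assumed placeholders, not a prescribed implementation.

```python
# Illustrative only: DAG id, schedule, and task logic are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_etl():
    @task
    def extract() -> list[dict]:
        # Stand-in for pulling records from a source system.
        return [{"id": 1, "value": "raw"}]

    @task
    def transform(records: list[dict]) -> list[dict]:
        # Stand-in for cleaning / enriching the extracted records.
        return [{**r, "value": r["value"].upper()} for r in records]

    @task
    def load(records: list[dict]) -> None:
        # Stand-in for writing to a warehouse or downstream store.
        print(f"loading {len(records)} records")

    load(transform(extract()))


example_etl()
```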
Preferred Experience
- Experience building systems that interact with LLMs, RAG pipelines, or agentic architectures.
- Familiarity with graph databases, vector stores, or semantic search technologies.
- Experience working in startups or fast-moving product teams.
- Exposure to DevOps and CI/CD practices in cloud-native environments.
- Understanding of real-time data streaming technologies (Kafka, Pub/Sub, etc.), as sketched below.
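
For the streaming item above, a minimal consumer sketch using the kafka-python client might look like the following. The topic name, broker address, and group id are placeholders, not values from any actual deployment.

```python
# Illustrative only: topic, broker, and group id are placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "events",                            # hypothetical topic
    bootstrap_servers="localhost:9092",  # placeholder broker address
    group_id="example-consumer",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    auto_offset_reset="earliest",
)

for message in consumer:
    # Each message carries the deserialized event payload; a real consumer
    # would route it into downstream storage or an agent workflow.
    print(message.topic, message.offset, message.value)
```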