Tech Stack
Amazon Redshift, AWS, BigQuery, Cloud, Distributed Systems, Elasticsearch, ETL, Google Cloud Platform, Python, PyTorch, Scikit-learn, SQL, TensorFlow
About the role
- Own and ensure the health, stability, and performance of AI-driven data platforms, pipelines, and infrastructure
- Monitor, troubleshoot, and optimize complex ETL/ELT workflows, ensuring data quality and availability
- Develop Python-based scripts and tools to automate deployments, workflows, and system maintenance
- Collaborate with Engineering, Product, and Customer Success to resolve complex operational issues and ensure seamless data delivery
- Build and maintain detailed operational runbooks, incident playbooks, and system guides
- Work in an Agile environment with engineers, product managers, and data scientists to operationalize analytics for Life Sciences & Healthcare datasets
- Operationalize data architectures, optimize pipelines, and deliver robust solutions that drive internal efficiency and customer satisfaction
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 4+ years in Technical Operations, DevOps, or SRE, with a focus on data platforms
- Proven experience managing enterprise-grade data services (data pipelines, data lakes, data warehouses)
- Expert Python skills for automation and operational tooling
- Strong cloud experience (AWS or GCP) including compute, storage, databases, containerization, orchestration
- SQL proficiency with BigQuery, Redshift, or Snowflake
- Deep knowledge of ETL/ELT best practices, data governance, and compliance
- Ability to diagnose complex issues in distributed systems, with strong root-cause analysis (RCA) skills
- Excellent communication (verbal & written) and experience creating technical documentation
- Collaborative, proactive mindset with strong ownership
- Ability to work with stakeholders in the EST time zone
- Experience in regulated industries (healthcare, finance) and with compliance frameworks such as HIPAA
Nice to have
- Experience in the Life Sciences / Healthcare data domain
- Knowledge of MLOps and deploying AI/ML models
- Familiarity with data visualization tools (Looker, Power BI)
- Experience with Elasticsearch or search technologies
- Understanding of ML frameworks (TensorFlow, PyTorch, Scikit-learn, MLflow)