Tech Stack
Airflow, Amazon Redshift, Apache, AWS, Cloud, DynamoDB, Hadoop, Kafka, Python, Spark, Terraform
About the role
- Plan, build, and deploy software systems and data solutions end to end on AWS for customer engagements
- Write well-constructed, testable code and serve as a deep technical resource for customers
- Help customers shape their cloud adoption journey and provide technical and strategic guidance along the way
- Consult on, plan, design, and implement cloud data solutions for customers, including AWS data lake implementations and data quality/governance solutions
- Develop high-quality technical content such as automation tools, reference architectures, and white papers
- Innovate on behalf of customers and translate ideas into measurable results; feed customer needs and feedback into technology roadmaps
- Assist with technical briefs and reference architecture implementations
- Support internal and external brand development through thought leadership (work with Marketing/Alliances on blog posts and case studies)
- Assist customers with building frameworks for data collection, ingestion, cataloging, storage, and serving, including S3 bucket strategies, encryption, partitioning, lifecycle, and cross-region replication (a minimal storage sketch follows this list)
- Build reusable ingestion frameworks, serverless ingestion where applicable, and orchestration/state-management for lambda architectures
- Advise customers on AWS service choices for ingestion, orchestration, and analytics, and implement monitoring, logging, and alerting strategies to prevent data loss and handle exceptions
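For illustration, here is a minimal AWS CDK (Python, CDK v2) sketch of the kind of S3 bucket strategy referenced above: an encrypted, versioned raw-zone bucket with lifecycle tiering and public access blocked. The stack name, bucket logical ID, retention windows, and storage classes are illustrative assumptions, not prescribed values; partitioning conventions and cross-region replication would be layered on per engagement.

```python
from aws_cdk import App, Stack, Duration, RemovalPolicy
from aws_cdk import aws_s3 as s3
from constructs import Construct


class DataLakeStorageStack(Stack):
    """Hypothetical storage stack for a data lake raw zone."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Encrypted, versioned bucket; objects tier to Intelligent-Tiering after
        # 30 days and old object versions expire after 90 days (illustrative numbers).
        s3.Bucket(
            self,
            "RawZoneBucket",
            encryption=s3.BucketEncryption.S3_MANAGED,
            versioned=True,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            removal_policy=RemovalPolicy.RETAIN,
            lifecycle_rules=[
                s3.LifecycleRule(
                    transitions=[
                        s3.Transition(
                            storage_class=s3.StorageClass.INTELLIGENT_TIERING,
                            transition_after=Duration.days(30),
                        )
                    ],
                    noncurrent_version_expiration=Duration.days(90),
                )
            ],
        )


app = App()
DataLakeStorageStack(app, "DataLakeStorageStack")
app.synth()
```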
Requirements
- Professional experience architecting and operating data/DevOps solutions built on AWS
- Customer-facing experience in the software/technology industry
- Must be legally authorized to work in the United States without the need for employer sponsorship, now or at any time in the future
- Familiarity with data lake design and implementation, data quality and governance
- Experience building data ingestion, transformation, orchestration, and serving layers
- Experience with Amazon VPC, S3 bucket strategy, and lambda/big-data architectures
- Experience implementing monitoring, logging, alerting, retry, and error-handling mechanisms for data pipelines (see the orchestration sketch at the end of this posting)
- Must Have: Apache Iceberg, SageMaker, SageMaker Lakehouse, SageMaker Catalog, Terraform
- Should Have: Athena, Redshift, EMR, Glue, Databricks, and/or Snowflake
- Primary language: Python
- Tooling/Services/Libraries: Airflow, Kafka, Parquet, Spark, Metaflow, Git, Hadoop
- AWS tooling: CloudFormation, AWS CLI, AWS CDK, IAM, CloudTrail, Service Catalog, Step Functions, Data Pipeline
- Experience advising on and selecting AWS analytics and storage services (S3, Lambda, EMR, Kinesis, Glue, Lake Formation)
- Mid-level role (as listed in the posting)
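For illustration, a minimal Apache Airflow (2.x) sketch of the pipeline orchestration, retry, and alerting behavior referenced in the requirements. The DAG name, schedule, and ingestion/notification stubs are illustrative assumptions; in practice the failure callback would publish to SNS, Slack, or a similar channel rather than logging.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def notify_on_failure(context):
    # Illustrative alerting hook: log the failed task so the example stays
    # self-contained; a real pipeline would publish to SNS/Slack/PagerDuty.
    ti = context["task_instance"]
    print(f"ALERT: task {ti.task_id} failed in DAG {ti.dag_id}")


def ingest_batch(**_):
    # Placeholder for the actual ingestion step (e.g. an S3 copy or Glue job trigger).
    print("ingesting batch...")


with DAG(
    dag_id="example_ingestion_pipeline",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                    # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={
        "retries": 3,                              # retry transient failures
        "retry_delay": timedelta(minutes=5),
        "on_failure_callback": notify_on_failure,  # alert once retries are exhausted
    },
) as dag:
    PythonOperator(task_id="ingest_batch", python_callable=ingest_batch)
```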