Design and implement scalable, secure, and maintainable data platforms on Databricks and AWS cloud infrastructure.
Provide architectural leadership across engineering domains, ensuring consistency, scalability, and resilience.
Architect distributed data processing systems using Apache Spark and optimize for performance and scalability.
Lead the development of reusable data pipelines, orchestrated with Databricks Workflows.
Translate business objectives into platform capabilities in collaboration with Product Managers and cross-functional teams.
Support AI/ML initiatives through robust data engineering, including feature engineering and model deployment.
Champion best practices in ETL/ELT, data quality, monitoring, observability, and Agile development.
Drive adoption of data governance standards: access control, metadata management, lineage, and compliance.
Establish and maintain CI/CD pipelines and DevOps automation for data infrastructure.
Evaluate and integrate emerging technologies to enhance development, testing, deployment, and monitoring.
Requirements
15+ years of experience in software development, covering the full SDLC: ideation, design, development, testing, deployment, and support.
Strategic mindset, capable of aligning technical decisions with business goals and driving architectural vision.
Excellent collaboration and communication skills across functions, including data scientists, MLOps engineers, and product managers.
Advanced proficiency in Python and SQL, with a strong foundation in software engineering principles.
Extensive experience with distributed computing frameworks, particularly Apache Spark, including performance tuning and scalability.
Hands-on experience with Databricks is required, including Unity Catalog, Feature Store, and Delta Live Tables.
Proficiency in data pipeline orchestration tools such as Databricks Workflows, Airflow, or AWS Glue; Databricks Workflows experience is required.
Strong command of data engineering best practices and tools, including ETL/ELT design, data quality validation, monitoring, and observability.
Experience with Monte Carlo (data observability platform) is preferred.
Proven ability to build scalable, cloud-native data architectures on AWS, Azure, or GCP; experience with Databricks is required.
Experience enabling AI/ML workloads including feature engineering, model deployment, and real-time processing.
Strong understanding of data governance and security, including access control, data lineage, compliance, and metadata management.
Benefits
Competitive benefits package, including a range of financial, health, and lifestyle options to choose from
Flexible working options, including home working, flexible hours, and part-time arrangements, depending on the role requirements – please ask!
Competitive annual leave, floating holidays, volunteering days and a day off for your birthday!
Learning and development tools to support your career growth
Work with industry-leading Subject Matter Experts and specialist products
Regular social events and networking opportunities
Collaborative, supportive culture, including an active DE&I program
Employee Assistance Program, which provides expert third-party advice on wellbeing, relationships, and legal and financial matters, as well as access to counselling services