Tech Stack
Airflow, Apache, AWS, Cloud, ETL, Jenkins, Kafka, PySpark, Python, Scala, Spark, Terraform
About the role
- Provide technical vision and leadership for a major business component of LG Ad Solutions’ data platform.
- Manage end-to-end data processes at high scale, handling several petabytes of data daily.
- Architect, design, develop, and deploy ETL pipelines, data warehousing, data architecture, data cataloguing, and data delivery mechanisms (a minimal pipeline sketch follows this list).
- Develop and execute the strategic vision for the data platform and oversee architecture and management of data systems.
- Collaborate cross-functionally with product managers, engineering teams, data scientists, and other stakeholders to design and implement data integration and hand-off mechanisms.
- Identify and address performance bottlenecks; implement strategies to enhance system performance, scalability, and reliability.
- Implement and maintain data governance frameworks to ensure data quality, accuracy, consistency, and security.
- Mentor data engineering teams, provide technical leadership, and foster continuous learning.
- Lead code and architecture reviews to maintain high standards of code quality and system design.
- Stay up to date with emerging technologies; evaluate and recommend new solutions to enhance the data platform's capabilities.
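To give a concrete flavor of the pipeline work described above, here is a minimal PySpark ETL sketch. The paths, table layout, and column names (impressions, event_id, campaign_id, device_id) are hypothetical illustrations, not details of LG Ad Solutions' actual platform.

```python
# Hypothetical daily rollup job: extract raw impression events, deduplicate,
# aggregate per campaign, and write the result for downstream consumers.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-impressions-rollup").getOrCreate()

# Extract: one day of raw impression events (illustrative S3 path and schema).
impressions = spark.read.parquet("s3://ads-data/impressions/dt=2024-01-01/")

# Transform: drop duplicate events, then aggregate per campaign.
daily_rollup = (
    impressions
    .dropDuplicates(["event_id"])
    .groupBy("campaign_id")
    .agg(
        F.count("*").alias("impression_count"),
        F.countDistinct("device_id").alias("unique_devices"),
    )
)

# Load: write the rollup, partition-dated for downstream delivery.
daily_rollup.write.mode("overwrite").parquet(
    "s3://ads-data/rollups/daily/dt=2024-01-01/"
)
```

In practice a job like this would be parameterized by run date and orchestrated by an Airflow DAG rather than hard-coding the partition path.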
Requirements
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Over 5 years of software development experience, with at least 2 years in a technical leadership role overseeing data engineering or data platform teams.
- Strong proficiency in big data technologies, including Apache Spark/PySpark, Apache Airflow, and Apache Kafka.
- Experience with programming languages such as Scala and Python for developing robust data pipelines.
- In-depth knowledge of both relational and non-relational databases, with the ability to design and optimize complex queries.
- Hands-on experience with data visualization tools and techniques to present data insights effectively.
- Proficiency in developing and maintaining CI/CD pipelines using tools like Jenkins and GitHub Actions.
- Experience with Infrastructure as Code (IaC) tools such as Pulumi and Terraform for managing cloud resources.
- Strong experience in writing and maintaining unit, integration, and end-to-end tests to ensure data pipeline reliability and accuracy (see the test sketch after this list).
- Excellent problem-solving abilities and keen attention to detail.
- Strong communication skills, with the ability to convey complex technical concepts to diverse audiences.
- Proven ability to collaborate cross-functionally with product managers, data scientists, and other engineering teams.
- Familiarity with data governance and security practices to ensure compliance and data integrity.
- Preferred: Prior experience working with Databricks and AWS cloud services.
- Preferred: Experience with Agile development methodologies and leading Agile teams.
- Preferred: Experience working with distributed graph databases.
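As an illustration of the testing requirement above, here is a hedged sketch of a unit test for a small PySpark transform, runnable with pytest against a local SparkSession. The function dedupe_impressions and its schema are hypothetical, not part of any stated codebase.

```python
# Unit test for an illustrative deduplication transform, using a local Spark session.
from pyspark.sql import DataFrame, SparkSession


def dedupe_impressions(df: DataFrame) -> DataFrame:
    """Drop duplicate events by event_id (hypothetical transform under test)."""
    return df.dropDuplicates(["event_id"])


def test_dedupe_impressions_removes_duplicates():
    spark = (
        SparkSession.builder.master("local[1]").appName("pipeline-tests").getOrCreate()
    )
    df = spark.createDataFrame(
        [("e1", "c1"), ("e1", "c1"), ("e2", "c1")],  # "e1" appears twice
        ["event_id", "campaign_id"],
    )
    result = dedupe_impressions(df)
    assert result.count() == 2  # the duplicate "e1" row is removed
    spark.stop()
```

Integration and end-to-end tests would extend the same idea to full pipeline runs against staged data.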