Pythian

Data Engineer

Full-time

Location Type: Remote

Location: India


About the role

  • Design and develop end-to-end cloud-based solutions with a strong emphasis on data applications and infrastructure.
  • Lead discovery and design sessions with customers to gather requirements and translate functional needs into detailed designs.
  • Create and contribute to technical design documents and other project-related documentation.
  • Work with stakeholders to identify technical and business requirements, and apply best practices and standards to achieve successful project outcomes.
  • Regularly demonstrate proficiency in established practices and standards for cloud solutions.
  • Write high-performance, reliable, and maintainable code.
  • Develop test automation frameworks and associated tooling to ensure project success.
  • Handle complex and diverse cloud-based projects, including tasks such as collecting, managing, analyzing, and visualizing very large datasets.
  • Build efficient and scalable data pipelines for batch and real-time use cases across various source and target systems.
  • Optimize ETL/ELT pipelines, troubleshoot pipeline issues, and enhance observability dashboards.
  • Execute data pipeline-specific DevOps activities, such as infrastructure-as-code (IaC) provisioning, data security implementation, and automation.
  • Analyze potential issues, perform root cause analyses, and resolve technical challenges.
  • Review bug descriptions, functional requirements, and design documents to ensure comprehensive testing plans and cases.
  • Tune the performance of batch and real-time data processing pipelines.
  • Ensure security best practices are followed when working on internal and customer-facing cloud data platforms.
  • Build foundational CI/CD pipelines for all infrastructure components, data pipelines, and custom data applications.
  • Develop observability and data quality solutions for data platforms, including ML and AI applications.
  • Act as a trusted advisor for customers, addressing technical queries and providing support.
  • Engage in thought leadership activities such as whitepaper authoring, conference presentations, and podcasting.
  • Suggest and implement ways to improve project progress and efficiency.
  • Participate in pre-sales activities when required.
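Several of the responsibilities above center on batch ETL/ELT pipelines with built-in data quality checks and observability. As a purely illustrative sketch (the record shape, validation rules, and function names are hypothetical, and a production pipeline would use Spark, BigQuery, or similar rather than plain Python), an extract-transform-load flow with a dead-letter path for bad records might look like:

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    amount: float
    currency: str


def extract() -> list[dict]:
    # Stand-in for reading from a source system (e.g., object storage or a queue).
    return [
        {"order_id": 1, "amount": "19.99", "currency": "usd"},
        {"order_id": 2, "amount": "-5.00", "currency": "USD"},  # fails validation
        {"order_id": 3, "amount": "42.50", "currency": "eur"},
    ]


def transform(rows: list[dict]) -> tuple[list[Order], list[dict]]:
    """Normalize rows; route invalid records to a dead-letter list for observability."""
    good, bad = [], []
    for row in rows:
        try:
            order = Order(int(row["order_id"]), float(row["amount"]),
                          row["currency"].upper())
            if order.amount < 0:
                raise ValueError("negative amount")
            good.append(order)
        except (KeyError, ValueError) as exc:
            bad.append({"row": row, "error": str(exc)})
    return good, bad


def load(orders: list[Order]) -> dict[str, float]:
    # Stand-in for writing to a warehouse table; here, aggregate totals per currency.
    totals: dict[str, float] = {}
    for order in orders:
        totals[order.currency] = totals.get(order.currency, 0.0) + order.amount
    return totals


good, bad = transform(extract())
print(load(good))  # per-currency totals
print(len(bad))    # records routed to the dead-letter list
```

Routing bad records to a dead-letter path instead of failing the whole batch is one common way to keep pipelines resilient while still surfacing data quality issues on a dashboard.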

Requirements

  • Experience in implementing complex data architecture, data modeling, data design, and persistence (e.g., warehousing, data marts, data lakes).
  • Proficiency in a programming language such as Python, Java, Go, or Scala.
  • Experience with big data cloud technologies like Microsoft Fabric, Databricks, EMR, Athena, Glue, BigQuery, Dataproc, and Dataflow.
  • Ideally, strong hands-on experience with Google Cloud Platform data technologies: Google BigQuery, Google Dataflow, and running PySpark and SparkSQL code on Dataproc.
  • Solid understanding of Spark (PySpark or SparkSQL), including using the DataFrame Application Programming Interface as well as analyzing and performance tuning Spark queries.
  • Strong experience in data orchestration using Apache Airflow.
  • Highly proficient in SQL.
  • Strong experience using code repositories such as GitHub, with demonstrable GitOps best practices.
  • Good knowledge of popular database and data warehouse technologies and concepts from Google, Amazon, or Microsoft (cloud and conventional RDBMS), such as BigQuery, Redshift, Azure SQL Data Warehouse, Snowflake, etc.
  • Knowledge of how to design distributed systems and the trade-offs involved.
  • Strong knowledge of CI/CD tools and frameworks such as Jenkins and GitLab for implementing DevOps pipelines.
  • Proficiency in using GenAI tools for productivity (e.g., GitHub Copilot).
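To illustrate the kind of SQL proficiency the requirements above call for, here is an analytical query using a window function, a staple pattern in BigQuery, Redshift, and Snowflake alike. It is run against SQLite purely for portability, and the table and column names are made up for the example:

```python
import sqlite3

# In-memory database standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('EMEA', 100), ('EMEA', 300), ('APAC', 250), ('APAC', 50);
""")

# Rank each sale within its region by amount, largest first.
rows = conn.execute("""
    SELECT region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM sales
    ORDER BY region, rnk
""").fetchall()

for region, amount, rnk in rows:
    print(region, amount, rnk)
```

The same `RANK() OVER (PARTITION BY ...)` syntax works unchanged on most cloud warehouses, which is why window functions are a common screening topic for data engineering roles.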

Benefits

  • Competitive total rewards package
  • Blog during work hours; take a day off and volunteer for your favorite charity.
  • Work remotely from your home with full flexibility; there's no daily commute to an office, and all you need is a stable internet connection!
  • Collaborate with some of the best and brightest in the industry!
  • Hone your skills or learn new ones with our substantial training allowance; participate in professional development days, attend training, become certified, whatever you like!
  • We give you all the equipment you need to work from home including a laptop with your choice of OS, and an annual budget to personalize your work environment!
  • You will have an annual wellness budget to make yourself a priority (use it on gym memberships, massages, fitness and more).
  • A generous amount of paid vacation and sick days.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, Java, Go, Scala, SQL, Apache Airflow, Spark, PySpark, SparkSQL, ETL
Soft Skills
leadership, communication, problem-solving, collaboration, customer support, thought leadership, project management, analytical skills, adaptability, creativity