Cummins Inc.

Data Engineer

Full-time

Location: 🇺🇸 United States • Tennessee

Job Level

Mid-Level • Senior

Tech Stack

Cassandra • Cloud • DynamoDB • ETL • Hadoop • HBase • IoT • Java • Kafka • MongoDB • Open Source • Scala • SDLC • Spark • SQL

About the role

  • Support, develop and maintain a data and analytics platform.
  • Effectively and efficiently process, store and make data available to analysts and other consumers.
  • Work with Business and IT teams to understand requirements and leverage technologies for agile data delivery at scale.
  • Implement and automate deployment of distributed systems for ingesting and transforming data from relational, event-based, and unstructured sources.
  • Implement continuous monitoring and troubleshooting of data quality and data integrity issues.
  • Implement data governance processes and methods for managing metadata, access, and retention for internal and external users.
  • Develop reliable, efficient, scalable, high-quality data pipelines with monitoring and alerting mechanisms using ETL/ELT tools or scripting languages.
  • Develop physical data models and implement data storage architectures per design guidelines.
  • Analyze complex data elements and systems to contribute to conceptual, physical and logical data models.
  • Participate in testing and troubleshooting of data pipelines.
  • Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, etc.).
  • Use agile development practices including DevOps, Scrum, Kanban and continuous improvement cycles for data-driven applications.

Requirements

  • College, university, or equivalent degree in relevant technical discipline, or relevant equivalent experience required.
  • This position may require licensing for compliance with export controls or sanctions regulations.
  • Relevant experience preferred, such as temporary student employment, internships, co-ops, or other extracurricular team activities.
  • Exposure to Big Data open source technologies.
  • Experience with or exposure to Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka (or equivalent college coursework).
  • Proficiency with SQL query language.
  • Experience implementing clustered compute in cloud-based environments.
  • Familiarity with developing applications that require large-scale file movement in a cloud-based environment.
  • Exposure to Agile software development (DevOps, Scrum, Kanban).
  • Exposure to building analytical solutions.
  • Exposure to IoT technology.
  • Skills in ETL/ELT, data extraction, data quality, data governance, metadata management, retention and access controls.
  • Programming skills: writing and testing code, test scripts, and build scripts; version control; build and test automation.
  • Ability to analyze complex data elements, data flow, dependencies, and relationships.
  • Problem-solving skills and the ability to apply quality assurance metrics.