Tech Stack
Cassandra, Cloud, DynamoDB, ETL, Hadoop, HBase, IoT, Java, Kafka, MongoDB, Open Source, Scala, SDLC, Spark, SQL
About the role
- Support, develop and maintain a data and analytics platform.
- Effectively and efficiently process, store and make data available to analysts and other consumers.
- Work with business and IT teams to understand requirements and apply appropriate technologies for agile data delivery at scale.
- Implement and automate deployment of distributed systems for ingesting and transforming data from relational, event-based, and unstructured sources.
- Implement continuous monitoring and troubleshooting of data quality and data integrity issues.
- Implement data governance processes and methods for managing metadata, access, and retention for internal and external users.
- Develop reliable, efficient, scalable and quality data pipelines with monitoring and alert mechanisms using ETL/ELT tools or scripting languages.
- Develop physical data models and implement data storage architectures per design guidelines.
- Analyze complex data elements and systems to contribute to conceptual, physical and logical data models.
- Participate in testing and troubleshooting of data pipelines.
- Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, etc.).
- Use agile development practices including DevOps, Scrum, Kanban and continuous improvement cycles for data-driven applications.
Requirements
- College, university, or equivalent degree in relevant technical discipline, or relevant equivalent experience required.
- This position may require licensing for compliance with export controls or sanctions regulations.
- Relevant experience preferred, such as temporary student employment, internships, co-ops, or other extracurricular team activities.
- Exposure to Big Data open source technologies.
- Experience with or exposure to Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka (or equivalent college coursework).
- Proficiency with the SQL query language.
- Experience implementing clustered compute solutions in cloud environments.
- Familiarity with developing applications that require large-scale file movement in a cloud-based environment.
- Exposure to Agile software development (DevOps, Scrum, Kanban).
- Exposure to building analytical solutions.
- Exposure to IoT technology.
- Skills in ETL/ELT, data extraction, data quality, data governance, metadata management, retention, and access controls.
- Programming skills: creating, writing, and testing code, test scripts, and build scripts; version control; build and test automation.
- Ability to analyze complex data elements, data flow, dependencies, and relationships.
- Problem-solving skills and the ability to apply quality assurance metrics.