
Staff Data Engineer
Walmart
full-time
Location Type: Office
Location: Sunnyvale • California • United States
Salary
💰 $153,301 - $286,000 per year
About the role
- Design, build, and maintain scalable data pipelines and infrastructure for large-scale data processing and analytics using technologies such as Hadoop, Spark, distributed event stores and stream-processing platforms, in-memory databases, and other big data tools.
- Build large-scale distributed event streaming platforms such as Apache Kafka and Google Cloud Pub/Sub.
- Develop and deploy real-time data processing pipelines for near-real-time (NRT) streaming data.
- Design and implement data storage solutions using Data Lakes and NoSQL databases to support high-volume and high-velocity data processing.
- Develop and implement machine learning models to support predictive analytics and automation.
- Develop and deploy natural language processing (NLP) models using ChatGPT and other generative AI tools and platforms.
- Work with data scientists, analysts, and other stakeholders to understand data requirements and develop solutions that meet their needs.
- Develop and maintain data quality and governance processes to ensure data accuracy, completeness, and consistency across different systems and sources.
- Design and implement job scheduling and automation using scheduling tools.
- Optimize data processing workflows using managed services provided by cloud platforms.
- Identify and resolve performance bottlenecks, data quality issues, and other technical challenges that arise in large-scale data processing environments.
- Create and maintain documentation and best practices for data engineering processes and systems.
- Stay up-to-date with the latest trends and innovations in big data, cloud computing, and related technologies, and adapt these technologies to improve data processing and analytics capabilities.
- Build data models to support data visualization and analysis.
- Develop and maintain data pipelines to extract, transform, and load data from various sources into the data visualization tool.
- Build and maintain dashboards and reports to provide insights into business performance and trends.
- Develop and maintain data validation and testing procedures to ensure data accuracy.
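Several of the responsibilities above revolve around extract-transform-load (ETL) pipelines with data normalization and validation. As a minimal sketch of that pattern, the record fields, values, and aggregation below are illustrative assumptions, not details from this posting:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical raw records, as an ETL batch job might pull them from a source
# system. Field names and values are made up for illustration.
RAW_ROWS = [
    {"sku": " 1001 ", "store": "Sunnyvale", "units": "3", "price_usd": "19.99"},
    {"sku": "1002", "store": "sunnyvale", "units": "0", "price_usd": "5.49"},
    {"sku": "1001", "store": "SUNNYVALE", "units": "2", "price_usd": "19.99"},
]

@dataclass(frozen=True)
class Sale:
    sku: str
    store: str
    units: int
    revenue_cents: int

def extract(rows: Iterable[dict]) -> Iterator[dict]:
    """Extract: yield raw rows, dropping obviously invalid ones (validation)."""
    for row in rows:
        if row.get("sku"):
            yield row

def transform(rows: Iterable[dict]) -> Iterator[Sale]:
    """Transform: trim and normalize fields, and convert string types."""
    for row in rows:
        yield Sale(
            sku=row["sku"].strip(),
            store=row["store"].strip().title(),
            units=int(row["units"]),
            # Store money as integer cents to avoid float rounding drift.
            revenue_cents=round(float(row["price_usd"]) * 100) * int(row["units"]),
        )

def load(sales: Iterable[Sale]) -> dict[str, int]:
    """Load: aggregate revenue per SKU into an in-memory 'sink'."""
    sink: dict[str, int] = {}
    for sale in sales:
        sink[sale.sku] = sink.get(sale.sku, 0) + sale.revenue_cents
    return sink

totals = load(transform(extract(RAW_ROWS)))
print(totals)
```

In production such stages would read from and write to real systems (Kafka topics, a data lake, a warehouse table), but the extract/transform/load separation shown here is the same.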
Requirements
- Bachelor’s degree or equivalent in Computer Science, Engineering (Any), or related field and 4 years of experience in software engineering, data engineering, database engineering, business intelligence, business analytics or related field; OR Master's degree or equivalent in Computer Science, Engineering (Any), or related field and 2 years of experience in software engineering, data engineering, database engineering, business intelligence, business analytics or related field.
- Experience with software development in object-oriented languages such as Scala, Go, and Python.
- Experience designing REST API web services using Nginx, Scala, Python, and Go.
- Experience working with relational databases like PostgreSQL, MariaDB and NoSQL Databases like Elasticsearch and Redis.
- Experience with software design and architectural patterns such as microservices, client-server, model-view-controller, sharding, publish-subscribe (pub/sub), and event-driven architecture.
- Experience working with distributed queue systems such as Kafka and NATS.
- Experience with Microsoft Azure Storage and Google BigQuery for data storage and querying.
- Experience with data transformation techniques like ETL batch processing, stream ingestion, API integration and data normalization.
- Experience developing data processing pipelines using technology stacks such as Apache Kafka, NATS, Elasticsearch, and Akka.
- Experience deploying applications and pipelines using scheduling, CI/CD, and orchestration frameworks like Kubernetes, Jenkins, Ansible, and Docker.
- Experience developing data visualizations and dashboards using tools like Grafana, Kibana, and Power BI.
- Experience writing complex queries in SQL and Elasticsearch to analyze data, including the analysis of time-series data.
- Experience managing Linux-based systems and deploying applications on Linux.
- High-level understanding of network devices, terminology, protocols, and concepts such as routers, switches, bandwidth, BGP, and SNMP.
- Experience working with log management and analysis platforms such as Splunk, Graylog, and Elasticsearch.
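The SQL requirement above calls out time-series analysis specifically. A common pattern there is bucketing timestamps into intervals and aggregating per bucket. As a minimal, self-contained sketch using Python's built-in SQLite driver (the table, hosts, and latency values are invented for illustration):

```python
import sqlite3

# In-memory database; in practice this query would run against a warehouse
# such as BigQuery, but the bucketing pattern is the same.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics (ts TEXT NOT NULL, host TEXT NOT NULL, latency_ms REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [
        ("2024-01-01T10:00:00", "web-1", 120.0),
        ("2024-01-01T10:20:00", "web-1", 80.0),
        ("2024-01-01T11:05:00", "web-1", 200.0),
        ("2024-01-01T11:45:00", "web-2", 100.0),
    ],
)

# Bucket readings into hourly windows and compute the average latency per
# host per hour -- a typical time-series aggregation.
rows = conn.execute(
    """
    SELECT strftime('%Y-%m-%dT%H:00', ts) AS hour,
           host,
           AVG(latency_ms) AS avg_ms
    FROM metrics
    GROUP BY hour, host
    ORDER BY hour, host
    """
).fetchall()
for hour, host, avg_ms in rows:
    print(hour, host, avg_ms)
```

Warehouse dialects offer dedicated helpers for the same idea (e.g. truncating timestamps to an interval before grouping), but the group-by-bucket structure carries over directly.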
Benefits
- Health benefits include medical, vision and dental coverage.
- Financial benefits include 401(k), stock purchase and company-paid life insurance.
- Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty and voting.
- Other benefits include short-term and long-term disability, education assistance with 100% company-paid college degrees, company discounts, military service pay, adoption expense reimbursement, and more.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Hadoop, Spark, Apache Kafka, Google Cloud Pub/Sub, NoSQL databases, machine learning, natural language processing, ETL, SQL, data visualization
Soft Skills
collaboration, problem-solving, communication, documentation, adaptability, data governance, data quality management, analytical thinking, stakeholder engagement, process optimization