
Senior Data Engineer
Valtech
Full-time
Location Type: Remote
Location: Remote • 🇲🇽 Mexico
Job Level
Senior
Tech Stack
Airflow, Apache, Azure, BigQuery, Cassandra, Cloud, ETL, Google Cloud Platform, Java, Kafka, MongoDB, MySQL, NoSQL, Postgres, Python, Scala, Spark, SQL
About the role
- Design, build, and maintain highly scalable, reliable, and efficient ETL/ELT pipelines for batch and real-time data processing using various tools and technologies.
- Ingest data from a multitude of sources, including relational databases, APIs, streaming services, cloud storage, and third-party data providers.
- Transform raw data into clean, structured, and AI/ML-ready formats for consumption by data scientists and machine learning models.
- Monitor data pipelines for performance, integrity, and security, troubleshooting and resolving issues as they arise.
- Contribute to the design and evolution of the data architecture, including data lakes, data warehouses, and data marts, ensuring they meet the demands of AI workloads.
- Implement and optimize data storage solutions (e.g., S3, Azure Data Lake Storage, BigQuery, Snowflake) and processing frameworks (e.g., Apache Spark, Flink, Kafka).
- Ensure data security, privacy, and compliance with internal policies and external regulations (e.g., GDPR, CCPA).
- Evaluate and integrate new data technologies and tools to enhance our data platform capabilities.
- Work closely with data scientists, machine learning engineers, and business analysts to understand their data needs and provide tailored data solutions.
- Enable efficient data access for model training, validation, and deployment, including feature engineering and data versioning.
- Support the MLOps lifecycle by providing robust data foundations for automated model retraining and monitoring.
- Optimize data processing for machine learning algorithms, ensuring high performance and cost efficiency.
- Implement and enforce data quality checks and validation processes within pipelines to maintain high data integrity.
- Develop and maintain comprehensive data documentation, including data dictionaries, data lineage, and metadata management.
- Contribute to data governance initiatives, ensuring consistent data definitions and usage across the organization.
Requirements
- Bachelor's or Master's degree in Computer Science, Data Engineering, Software Engineering, or a related quantitative field.
- Minimum of 3-5 years of experience as a Data Engineer, with significant experience in building and managing data platforms.
- Proven experience working on projects that directly support Artificial Intelligence, Machine Learning, or Data Science initiatives.
- Strong proficiency in at least one modern programming language (e.g., Python, Scala, Java). Python is highly preferred.
- Extensive experience with big data processing frameworks and stream processing.
- Hands-on experience with GCP’s data services (e.g., BigQuery, Dataflow, Dataproc).
- Solid understanding of SQL and experience with various database systems (e.g., PostgreSQL, MySQL, SQL Server, NoSQL databases like MongoDB, Cassandra).
- Experience with data warehousing concepts and technologies (e.g., Google BigQuery).
- Experience with workflow orchestration tools (e.g., Airflow, Prefect, Luigi).
- Proficiency with Git and collaborative development workflows.
- Good understanding of data modeling, data warehousing principles, and data lake architectures.
- Familiarity with machine learning concepts, MLOps principles, and how data pipelines feed into AI models.
- Excellent analytical and problem-solving skills, with a keen eye for detail and data accuracy.
- Strong communication skills, able to articulate complex technical concepts to both technical and non-technical stakeholders.
- Ability to work effectively in an Agile team environment, collaborating with cross-functional teams.
- Advanced-level English communication skills.
Benefits
- Flexibility, with remote and hybrid work options (country-dependent)
- Career advancement, with international mobility and professional development programs
- Learning and development, with access to cutting-edge tools, training and industry experts
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
ETL, ELT, data processing, Python, SQL, data warehousing, data modeling, big data processing, machine learning, data governance
Soft skills
analytical skills, problem-solving skills, communication skills, collaboration, attention to detail, Agile methodology