
Data Engineer, Databricks
Solvd, Inc.
full-time
Location Type: Remote
Location: Argentina
About the role
- Build and maintain scalable data pipelines with Databricks, Spark, and PySpark.
- Manage data governance, security, and credentials using Unity Catalog and Secret Scopes.
- Develop and deploy ML models with MLflow; work with LLMs and embedding-based vector search.
- Apply ML/DL techniques (classification, regression, clustering, transformers) and evaluate the resulting models using industry-standard metrics.
- Design data models and warehouses leveraging dbt, Delta Lake, and Medallion architecture.
- Work with healthcare data standards and medical terminology mapping.
Requirements
- Hands-on experience with the Databricks platform, including:
  - Unity Catalog: Managing data governance, access control, and auditing across workspaces.
  - Secret Scopes: Secure handling of credentials and sensitive configurations.
  - Apache Spark / PySpark: Writing performant, scalable distributed data pipelines.
  - MLflow: Managing the ML lifecycle, including experiment tracking, model registry, and deployment.
  - Vector Search: Working with vector databases or search APIs to build embedding-based retrieval systems.
  - LLMs (Large Language Models): Familiarity with using or fine-tuning LLMs in Databricks or similar environments.
- Experience designing and maintaining robust data pipelines:
  - Data Modeling & Warehousing: Dimensional modeling, star/snowflake schemas, SCD (Slowly Changing Dimensions).
  - Familiarity with dbt, Delta Lake, and the Medallion architecture (Bronze, Silver, Gold layers).
Benefits
- Shape real-world AI-driven projects across key industries, working with clients from startup innovation to enterprise transformation.
- Be part of a global team with equal opportunities for collaboration across continents and cultures.
- Thrive in an inclusive environment that prioritizes continuous learning, innovation, and ethical AI standards.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Databricks, Spark, PySpark, MLflow, ML techniques, DL techniques, dbt, Delta Lake, data modeling, data warehousing