
Data Engineer
Adobe
full-time
Location Type: Office
Location: San Jose • California, Texas • 🇺🇸 United States
Salary
💰 $113,400 - $206,300 per year
Job Level
Mid-Level, Senior
Tech Stack
Apache, AWS, Azure, Cloud, Docker, ETL, Google Cloud Platform, Java, Kafka, Kubernetes, Neo4j, Python, Ray, Scala, Spark
About the role
- Build scalable ingestion pipelines for brand information and creative assets, ensuring freshness, reliability, and versioning.
- Integrate and leverage systems and models from our machine learning and data science partner teams.
- Design and maintain brand-aware data models, ontologies, and multi-modal graphs to support context linking and rich retrievals.
- Implement hybrid storage and retrieval strategies across vector databases, graph databases, and search engines, optimizing for precision and latency.
- Develop metadata enrichment pipelines to enhance semantic search, personalization, and optimization for RAG-based conversational systems.
- Ensure data quality and observability by monitoring metrics for accuracy, coverage, and timeliness; build monitoring systems to track ingestion and retrieval health.
- Collaborate with product, ML, and information retrieval teams to align data infrastructure with creative workflows.
- Optimize pipelines using distributed/streaming systems for scale and speed.
Requirements
- 4+ years of software development experience with 1+ year in data engineering, search relevance, or large-scale systems for conversational experiences.
- Expertise in building reliable and innovative ETL pipelines for heterogeneous data.
- Experience with distributed data frameworks (Spark, Flink) and streaming platforms (Kafka).
- Proficiency in Python (preferred) or Java/Scala, with strong CS fundamentals.
- Experience with cloud platforms (Azure, AWS, or GCP) and containerization/orchestration (Docker, Kubernetes).
- Familiarity with modern search systems (Elastic, Vespa) and graph databases (Neo4j, TigerGraph).
- Understanding of ML data pipelines and MLOps standards for monitoring and continuous improvement.
- Self-starter who thrives in zero-to-one environments and can make informed tradeoffs.
- Preferred: Background in information retrieval, NLP, or cognitive computing.
- Preferred: Experience developing, optimizing, and deploying data processing with Apache Spark/Dask/Ray.
- Preferred: Experience with ontologies, knowledge graphs, or semantic enrichment pipelines.
- Preferred: Degree in Computer Science, Information Systems, or related field.
Benefits
- Pay within range varies by work location and may depend on job-related knowledge, skills, and experience.
- Annual Incentive Plan (short-term incentives).
- Certain roles may be eligible for long-term incentives in the form of a new hire equity award.
- Equal Employment Opportunity and accommodations for disabilities (accessibility support)
- Employee experiences and meaningful benefits, as described on the Adobe Life blog.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
ETL pipelines, data engineering, search relevance, large-scale systems, distributed data frameworks, streaming platforms, Python, Java, Scala, MLOps
Soft skills
self-starter, collaboration, problem-solving, tradeoff analysis