Honeywell

Principal AI Data Engineer

full-time

Location Type: Hybrid

Location: Phoenix, Arizona, United States

About the role

  • Support end‑to‑end data needs for all AI modalities, including classic ML, GenAI/LLMs, and agentic AI systems
  • Build robust, scalable data pipelines for structured, semi‑structured, and unstructured data, including text, documents, images, audio, video, and logs
  • Develop feature engineering pipelines for classic ML, including feature extraction, transformation, and feature store management
  • Build and optimize GenAI and LLM data pipelines, including embedding generation, vectorization, chunking, metadata extraction, and document enrichment for RAG and context retrieval
  • Develop data ingestion and orchestration workflows that support agentic AI, including memory stores, event-driven pipelines, tool-use data flows, and real-time retrieval services
  • Design and implement advanced data solutions using AWS (S3, Glue, Lambda, EMR, Kinesis), Databricks (Spark, Delta Lake, Vector Search), and Dataiku to enable intelligent systems at scale
  • Implement data governance, quality, lineage, monitoring, and observability to support high-performance, trustworthy AI
  • Partner with data scientists, ML engineers, and AI product teams to deliver datasets for model development, fine‑tuning, evaluation, and production inference
  • Optimize pipelines for latency, cost, reliability, and throughput so that AI systems, from batch ML to real-time agents, have the data they need

Requirements

  • Bachelor’s degree in a technical field (CS, Engineering, Math, or related)
  • Experience supporting AI at scale across classic ML, GenAI/LLM, and agentic AI systems
  • Experience with vector databases and semantic search (Databricks Vector Search, Pinecone, FAISS, Milvus, OpenSearch)
  • Familiarity with LLM and GenAI data preparation, including text processing, tokenization, chunking strategies, and prompt/context formatting
  • Experience with unstructured data technologies (OCR, NLP pipelines, computer vision data processing)
  • Hands-on experience with Dataiku for automation, workflow orchestration, and AI project management
  • Knowledge of MLOps tooling: MLflow, Delta Lake, experiment tracking, CI/CD for ML
  • Understanding of agentic AI system patterns, such as memory architectures, tool APIs, event-driven workflows, and reasoning chain data requirements
  • Strong analytical mindset, attention to detail, and commitment to high data quality
  • Ability to thrive in a fast-paced, evolving AI environment and collaborate across cross-functional teams

Benefits

  • Employer-subsidized Medical, Dental, Vision, and Life Insurance
  • Short-Term and Long-Term Disability
  • 401(k) match
  • Flexible Spending Accounts
  • Health Savings Accounts
  • Employee Assistance Program (EAP)
  • Educational Assistance
  • Parental Leave
  • Paid Time Off (for vacation, personal business, sick time, and parental leave)
  • 12 Paid Holidays

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
data pipelines, feature engineering, embedding generation, vectorization, metadata extraction, data ingestion, orchestration workflows, data governance, unstructured data technologies, MLOps

Soft Skills
analytical mindset, attention to detail, collaboration, adaptability