BLACKBIRD.AI

Staff Data Engineer

full-time

Location Type: Remote

Location: New York, Texas, United States

Salary

💰 $160,000 - $190,000 per year

About the role

  • Design and implement scalable data platform architecture on Databricks, supporting both batch and streaming ingestion
  • Build robust, fault-tolerant data ingestion pipelines that integrate with multiple third-party APIs and data providers
  • Design and implement AI-powered enrichment stages within pipelines—applying ML clustering, generative AI summarization, classification, and entity extraction to transform raw data into actionable intelligence
  • Build analytical systems with full-text search capabilities using Elasticsearch for rapid querying and analysis of enriched data
  • Work with AI/ML researchers to implement, integrate, and scale AI processing
  • Expose data platform capabilities as APIs and other interfaces for downstream consumption by applications and services
  • Optimize data lake and lakehouse architecture for performance, cost-efficiency, and scalability
  • Design and implement data quality frameworks, monitoring, and alerting systems
  • Design efficient architectures for calling external AI APIs and managing rate limits, costs, and reliability
  • Architect solutions with cost-efficiency as a first-class concern, implementing monitoring and optimization strategies for compute and storage
  • Make critical build-vs-buy decisions and establish architectural standards for the data organization
  • Mentor engineers and elevate the team's technical capabilities through code reviews, design discussions, and knowledge sharing

Requirements

  • 8+ years of software engineering experience with 5+ years focused on data platforms or data engineering
  • Deep expertise with Databricks, Apache Spark, and data lakehouse architectures
  • Strong experience building and operating data pipelines at scale (handling TBs+ of data)
  • Experience integrating AI/ML capabilities into data pipelines (clustering, LLM APIs, classification, summarization)
  • Proficiency in Python, DBT, and SQL for data processing and pipeline development
  • Experience with both batch and streaming patterns for large-scale data processing
  • Strong understanding of cloud platforms (AWS, Azure)
  • Excellent communication skills and ability to mentor engineers

Preferred Qualifications

  • Experience designing both batch and streaming/near real-time data architectures
  • Proficiency with Elasticsearch for building analytical systems with full-text search capabilities
  • Hands-on experience with LLM APIs and understanding of rate limiting and cost optimization
  • Experience with Agentic AI, context engineering, and evaluation
  • Background in trust & safety, security, or content moderation domains
  • Experience with data observability tools and building comprehensive monitoring systems
  • Prior experience at a startup or fast-paced environment
  • Experience applying agentic coding tools in day-to-day development
  • Familiarity with Databricks' Lakeflow, Agent Bricks, and vector databases

Benefits

  • Competitive compensation package, 401(k), and equity - **everyone has a stake in our growth!**
  • Comprehensive health benefits for you and your loved ones, including wellness days and monthly wellness reimbursements - **an apple a day doesn't always keep the doctor away!**
  • Generous vacation policy, encouraging you to take the time you need - we trust you to strike the right work/life balance!
  • A flexible work environment with opportunities to collaborate with your team in person - **you can have it all!**
  • Inclusion and Impact - **soar to new heights!**
  • Professional development stipend - **never stop learning!**

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Databricks, Apache Spark, data lakehouse architecture, data pipelines, Python, DBT, SQL, Elasticsearch, AI/ML integration, data quality frameworks

Soft Skills
communication, mentoring, team collaboration, knowledge sharing, critical decision making