Distyl AI

AI Production Engineer

Full-time

Location Type: Hybrid

Location: San Francisco, California, New York, United States

Salary

💰 $130,000 - $250,000 per year

About the role

  • Own the performance and reliability characteristics of AI systems deployed in customer environments
  • Design, build, and operate low-latency AI services—including real-time voice and interaction pipelines—as well as large-scale batch processing workflows that execute complex AI workloads reliably
  • Serve as the escalation point for performance and reliability risk, with veto power over launches that violate production constraints
  • Take a deep, hands-on role in system design, implementation, and operation, investigating performance bottlenecks, failure modes, and scaling limits across AI pipelines, APIs, orchestration layers, and infrastructure
  • Design and evolve observability systems—metrics, logs, tracing, alerts—that make AI behavior understandable and actionable in production
  • Work directly with Forward Deployed AI Engineers, Product Engineers, and Architects to ensure that production constraints meaningfully shape system design
  • Step in on high-risk or high-impact issues, debug live systems, and harden AI services so they can operate continuously under real-world load
  • Help turn one-off production solutions into reusable patterns and platform capabilities, raising the overall production bar for Distyl’s AI systems over time

Requirements

  • 3+ years of software engineering experience
  • Deep Production Engineering Experience: You have built and operated high-scale systems (low-latency APIs, streaming pipelines, real-time services, or large batch processing systems) and can reason deeply about performance, throughput, and reliability. Experience with real-time voice systems is a strong plus
  • Strong Systems and Backend Fundamentals: Write high-quality production code and understand distributed systems concepts such as concurrency, fault tolerance, backpressure, and graceful degradation. You are comfortable optimizing systems under tight latency and throughput constraints
  • Operational Excellence Mindset: Treat observability, instrumentation, and incident response as first-class concerns. Logging, metrics, tracing, alerting, and on-call readiness are integral to how you design and operate systems
  • Ownership of AI Systems in Production: Take responsibility for AI systems end-to-end—design, deployment, monitoring, and ongoing health. When something breaks, you care about understanding why, fixing it properly, and preventing recurrence
  • AI-Native Working Style: Actively use AI tools to debug systems, analyze performance data, explore designs, and automate operational workflows

Benefits

  • 100% covered medical, dental, and vision for employees and dependents
  • 401(k) with additional perks (e.g., commuter benefits, in‑office lunch)
  • Access to state‑of‑the‑art models, generous usage of modern AI tools, and real‑world business problems
  • Ownership of high‑impact projects across top enterprises
  • A mission‑driven, fast‑moving culture that prizes curiosity, pragmatism, and excellence