About the role
- Architect and build a real-time conversation orchestration service: ASR → LLM inference → TTS streaming
- Write robust, asynchronous Python designed for high concurrency without deadlocks, race conditions, or memory leaks
- Design and maintain clean, well-structured APIs that scale and are easy to debug
- Manage interaction data using SQLAlchemy or an equivalent ORM, with efficient schema design and safe migrations
- Implement observability: structured logging, metrics, and tracing across the system
- Partner with ML and Product teams to iterate on conversation flow and user experience
- Enforce a strong testing culture: automated unit tests, E2E flows, and load testing
- Build resilient systems handling noisy audio, unreliable APIs, and flaky networks
- Continuously profile, optimize, and reduce latency and response times
- Monitor, debug, and improve the system as it runs in production
Requirements
- Deep Python expertise: 5+ years of Python in production systems, with a solid grasp of context managers, generators, event loops, the GIL, and effective use of asyncio
- Database fundamentals: data modeling, efficient queries, ORM best practices (e.g., SQLAlchemy)
- Networking & I/O: streaming, backpressure, resilient design for unreliable networks
- Testing discipline: automated unit tests, end-to-end flows, and load testing
- Observability mindset: structured logging, metrics, and tracing
- Production readiness: experience building and supporting systems running live at scale
- Experience collaborating with ML and Product teams on conversation flow and model behavior
- Ability to work US hours, at least until 6 PM ET
- Must have your own computer and work setup for remote work
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, SQLAlchemy, asynchronous programming, APIs, automated testing, E2E testing, load testing, observability, data modeling, streaming
Soft skills
collaboration, problem-solving, communication, testing discipline, scalability mindset