Tech Stack
Python, PyTorch, Ray, TensorFlow
About the role
- Design, implement, and optimize advanced AI systems balancing quality, performance, and cost.
- Architect inference and RAG pipelines; design manager/worker agent patterns with deterministic fallback mechanisms for reliability.
- Build robust evaluation harnesses and develop simulation-based training pipelines to improve model robustness.
- Tune AI systems across quality, latency, and cost dimensions for scalable production performance.
- Run A/B tests and continuous feedback loops to measure system performance and guide improvements.
- Collaborate with product and design teams to ensure AI outputs integrate into workflows with intuitive, high-quality UX.
Requirements
- Strong background in machine learning, AI systems, or applied research.
- Hands-on experience with inference optimization and retrieval-augmented generation (RAG).
- Familiarity with multi-agent or distributed system design.
- Experience building evaluation frameworks, test harnesses, or simulation-based training environments.
- Skill in performance tuning (latency, throughput, cost efficiency).
- Proficiency in Python and ML/AI frameworks (PyTorch, TensorFlow, LangChain, Ray, etc.).
- Bonus: Experience with A/B testing, feedback loops, and integrating AI with front-end UX.