Tech Stack
AWS, Azure, Cloud, Distributed Systems, Docker, Google Cloud Platform, JavaScript, Next.js, Node.js, Postgres, Python, React, Redis, TypeScript
About the role
- Prototype LLM + retrieval pipelines with safety and filtering (a minimal sketch follows this list)
- Operate knowledge graph/vector DBs (Pinecone, Weaviate) and manage embeddings
- Build FastAPI services for search, recommender systems, and memory
- Design resilient systems with caching, retries, and observability
- Run data pipelines for large-scale indexing and embeddings
- Capture personalization signals (search, chat, purchase)
- Optimize for low-latency APIs and high-throughput pipelines
- Collaborate across engineering, research, and product on evaluation and UX, turning cutting-edge AI into production-ready solutions
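A minimal sketch of the retrieval-and-filter step described above, assuming a hypothetical embed() helper and a toy BLOCKLIST in place of a real moderation model; a production pipeline would call an embedding model and a managed vector DB such as Pinecone or Weaviate:

```python
# Sketch only: retrieval by cosine similarity plus a simple safety filter.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Hypothetical stand-in for an embedding model: a seeded pseudo-embedding
    # used purely so this example runs without external services.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

BLOCKLIST = {"ssn", "password"}  # toy filter terms; real systems use moderation models

def is_safe(text: str) -> bool:
    # Keyword check for illustration; production filtering is classifier-based.
    return not any(term in text.lower() for term in BLOCKLIST)

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Rank documents by cosine similarity to the query embedding
    # (vectors are unit-normalized, so the dot product is the cosine),
    # then drop anything that fails the safety filter.
    q = embed(query)
    ranked = sorted(corpus, key=lambda d: float(q @ embed(d)), reverse=True)
    return [d for d in ranked if is_safe(d)][:k]

docs = ["Return policy for electronics", "Reset your password here", "Shipping times by region"]
print(retrieve("how do refunds work", docs))
```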
Requirements
- Strong Python (FastAPI, async/await, Redis, PostgreSQL); see the caching-and-retry sketch after this list
- 1-3 years of hands-on experience with LLM prompting, RAG, embeddings, vector search
- Comfort with APIs, distributed systems, caching, observability
- Familiarity with GCP/AWS/Azure or similar cloud services, Docker, Git, CI/CD
- Clear communicator with strong presentation, collaboration, and time-management skills; a self-driven team player
- Preferred: TypeScript/Node.js (NestJS), React/Next.js, Streamlit
- Recommender systems exposure
- Embedding model evaluation skills
- Self-starter with a customer-centric mindset; able to work in a multicultural, fast-paced environment
- Bachelor’s degree or higher in Computer Science, Artificial Intelligence, Machine Learning, Linguistics, Localization, or a related field
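A minimal sketch of the caching-and-retry pattern referenced above, in an async FastAPI service; it assumes a local Redis on the default port, and fetch_profile(), the route path, and the 60-second TTL are illustrative, not part of this posting:

```python
# Sketch only: Redis-cached lookup with bounded exponential-backoff retries.
import asyncio
import redis.asyncio as redis
from fastapi import FastAPI

app = FastAPI()
cache = redis.from_url("redis://localhost:6379")  # assumes a local Redis

async def fetch_profile(user_id: str) -> str:
    # Hypothetical slow backend call (e.g., Postgres or a downstream service).
    await asyncio.sleep(0.1)
    return f"profile:{user_id}"

@app.get("/profile/{user_id}")
async def get_profile(user_id: str) -> dict:
    key = f"profile:{user_id}"
    cached = await cache.get(key)  # cache hit: skip the backend entirely
    if cached is not None:
        return {"profile": cached.decode(), "cached": True}
    for attempt in range(3):  # bounded retries with exponential backoff
        try:
            profile = await fetch_profile(user_id)
            break
        except ConnectionError:
            await asyncio.sleep(2 ** attempt)
    else:
        return {"error": "backend unavailable"}
    await cache.set(key, profile, ex=60)  # 60s TTL keeps entries reasonably fresh
    return {"profile": profile, "cached": False}
```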
- Applicants are asked whether they are legally authorized to work in Canada and whether they require visa sponsorship