Tech Stack
Cloud, Distributed Systems, Docker, Go, GraphQL, gRPC, Java, Kubernetes, Python, Scala
About the role
- Design, build, and operate the backend services and platform APIs that power GenAI features in Lyric’s products.
- Implement orchestration layers to route, sequence, and manage LLM calls with context grounding and prompt engineering.
- Build scalable and cost-efficient GenAI serving infrastructure, including support for multiple model providers and fallback strategies.
- Ensure platform resilience, security, observability, and compliance when serving user-facing GenAI workloads.
- Provide abstractions, SDKs, and tooling to enable product and ML engineers to experiment and ship GenAI features faster.
- Collaborate with ML engineers, product managers, and designers to understand requirements and deliver performant, developer-friendly systems.
- Monitor performance, optimize latency and cost, and stay ahead of trends in GenAI and LLMOps.
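To illustrate the kind of orchestration work described above, here is a minimal sketch of routing an LLM call across multiple model providers with an ordered fallback chain. All names (`Provider`, `complete_with_fallback`, the stub callables) are hypothetical, not a real SDK; a production version would add timeouts, retries, and observability hooks.

```python
# Hypothetical sketch: try each LLM provider in priority order and
# fall back to the next one on failure.
from dataclasses import dataclass
from typing import Callable, List


class ProviderError(Exception):
    """Raised by a provider adapter when a call fails (e.g., rate limit)."""


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # prompt -> completion


def complete_with_fallback(prompt: str, providers: List[Provider]) -> str:
    """Route a completion request through an ordered fallback chain."""
    errors = []
    for provider in providers:
        try:
            return provider.call(prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider.
            errors.append(f"{provider.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub adapters standing in for real provider SDK calls.
def flaky(_: str) -> str:
    raise ProviderError("rate limited")


def stub(prompt: str) -> str:
    return f"echo: {prompt}"


result = complete_with_fallback(
    "hi", [Provider("primary", flaky), Provider("fallback", stub)]
)
print(result)  # echo: hi
```

The same chain structure extends naturally to cost-aware routing: order providers by price or latency instead of a fixed priority.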
Requirements
- 3–9 years of backend or platform engineering experience, building distributed systems or ML/AI platforms.
- Proficiency in at least one backend language (e.g., Python, Go, Java, Scala, or similar) and in designing APIs (REST, gRPC, GraphQL).
- Experience building and operating scalable, high-availability backend services in cloud-native environments.
- Familiarity with LLM serving and orchestration (e.g., OpenAI APIs, Anthropic, Hugging Face Inference Endpoints, or open-source LLM serving frameworks).
- Understanding of prompt engineering, context grounding, and retrieval-augmented generation (RAG) concepts is a plus.
- Knowledge of containerization and orchestration (Docker, Kubernetes) and observability best practices.
- Ability to thrive in ambiguous, fast-moving environments and work effectively across teams.
- Nice to have: Experience with vector databases (e.g., Pinecone, Weaviate, Milvus, FAISS) and embedding workflows.
- Nice to have: Familiarity with LLM fine-tuning, adapters (LoRA, PEFT), or hosting custom models.
- Nice to have: Knowledge of data privacy, security, and compliance concerns specific to GenAI workloads.
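As a toy illustration of the RAG and embedding concepts mentioned above, the sketch below ranks documents by cosine similarity and grounds the best match into a prompt. The word-count "embedding" is a stand-in for a real embedding model, and the function names are invented for this example; a real pipeline would use a vector database such as those listed.

```python
# Illustrative retrieval step for RAG: embed, rank by cosine
# similarity, then ground the top document into the prompt.
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy embedding: word-count vector (real systems use a model).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, docs: list) -> str:
    """Return the document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))


docs = [
    "Kubernetes schedules containers across a cluster",
    "GraphQL lets clients request exactly the fields they need",
]
context = retrieve("how does kubernetes schedule containers", docs)
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```

Swapping the toy `embed` for a model-backed one and `retrieve` for a vector-database query is the essence of the embedding workflows this role touches.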