Tech Stack
Cloud, Distributed Systems, Go, Kubernetes, Microservices, Node.js, Python, Ray, Rust, Terraform, Web3
About the role
- Develop, fine-tune, and deploy LLMs for large-scale applications.
- Implement RAG (Retrieval-Augmented Generation) pipelines for real-world use cases.
- Work with vLLM for high-performance inference and deployment.
- Build distributed training & inference pipelines using Ray and KubeRay.
- Integrate and scale vector databases for AI-native applications.
- Leverage KubeAI, Kubernetes, and Terraform to design and manage AI infrastructure.
- Architect, deploy, and manage MCP servers and intelligent AI agents.
- Contribute to model training, optimization, and evaluation workflows.
- Collaborate with DevOps to ensure scalable and reliable infrastructure.
- Apply blockchain knowledge (Cosmos SDK, smart contracts, decentralized infra) for AI+Web3 integrations.
Requirements
- Proven experience working on LLMs (training, fine-tuning, inference, and optimization).
- Strong hands-on expertise with vLLM, RAG pipelines, Ray, and KubeRay.
- Proficiency in managing AI infrastructure with Kubernetes, KubeAI, and Terraform.
- Experience with vector databases.
- Strong background in model training workflows and scalable AI infra design.
- Knowledge of MCP servers, agents, and orchestration frameworks.
- Familiarity with blockchain platforms, especially Cosmos SDK.
- Solid programming skills in Python, Go, or Rust.
- Prior experience contributing to open-source AI/blockchain projects (Nice to Have).
- Knowledge of GPU drivers, distributed systems, and cloud-native infra (Nice to Have).
- Familiarity with security, quality assurance, and monitoring for AI infra (Nice to Have).