Salary
💰 $182,750 - $232,000 per year
Tech Stack
Apollo, Cloud, Distributed Systems
About the role
- Help power the future of agentic AI workflows by turning MCP Server into an enterprise-grade service.
- Architect MCP Gateway—a new layer that will route requests across tools, enforce policies, and provide the runtime foundation for scalable multi-agent systems.
- Tackle challenges in scalability, performance, and developer experience to ensure our platform feels seamless and enterprise-ready.
- The Graph DX AI Runtime Team builds MCP Server and Gateway—the backbone of agent-to-tool communication and the routing layer that keeps everything flowing.
Requirements
- Expertise in agent-to-tool orchestration, routing, and coordination in scalable, fault-tolerant systems.
- Strong background in distributed systems, server architecture, and high-performance backend development.
- Proven experience with protocol design, message routing, and building server-side orchestration frameworks that enable scalable, reliable multi-tool agent workflows.
- Experience building and maintaining robust runtime infrastructure that supports AI-driven workflows and enables reliable agent-to-tool interactions.
- Hands-on experience with observability, monitoring, and debugging frameworks for complex systems.
- Passion for clean, maintainable code, high system reliability, and scalable architecture.
- Experience in strategic system design, making architectural trade-offs, and planning for long-term scalability and maintainability.
- Strong technical leadership and mentorship, including guiding junior engineers and driving engineering best practices across teams.
- Ability to influence cross-team architecture decisions and align engineering efforts with product and business objectives.
- Production ownership experience: leading incident response, debugging, and performance optimization in high-impact backend systems.
Bonus Points
- Exposure to AI/ML-enabled developer tooling or autonomous system orchestration.
- Familiarity with cloud-native architectures, containerization, or orchestration frameworks.
- Experience with performance optimization and cost-efficient scaling of high-throughput distributed systems.