Tech Stack
Cloud, Docker, ElasticSearch, Grafana, Kubernetes, Prometheus, Terraform
About the role
- Own deployment and operations of agentic AI systems at enterprise scale; accelerate the shift to AI‑native infrastructure.
- Architect, design, implement, and maintain multi‑agent orchestration frameworks (e.g., FastAgent, FastMCP, LangGraph, AutoGen, CrewAI).
- Develop and operationalize RAG workflows, vector stores, and knowledge‑graph connectors.
- Implement end‑to‑end observability with Prometheus, Grafana, and OpenTelemetry, and create incident playbooks.
- Enforce data governance, PII handling, and prompt‑audit trails across agent interactions to ensure security and compliance.
- Partner with data engineering, MLOps, and DevOps teams to align CI/CD for AI workloads, and mentor those teams.
- Benchmark agent performance, optimize token usage, and research emerging agentic AI frameworks for continuous improvement.
Requirements
- 5+ years in production ML/LLM operations, with 2+ years in autonomous agent systems.
- Hands‑on experience with FastAgent, FastMCP, LangGraph, AutoGen, CrewAI, or similar.
- Expertise in Kubernetes, Docker, Terraform (or Pulumi), and GitOps workflows.
- Proven track record implementing RAG pipelines with Pinecone, Elasticsearch, or similar.
- Proficient in observability tools (Prometheus, Grafana, Jaeger/OpenTelemetry).
- Open‑source contributions or conference talks on agentic AI or LLMOps (preferred).
- Experience optimizing inference with vLLM, TensorRT‑LLM, or similar libraries (preferred).
- Certifications in cloud architecture or AI security (preferred).
- Familiarity with multi‑modal agent workflows or edge‑deployed agents and AI‑first governance (preferred).
Simpplr’s Hub-Hybrid-Remote model (role-based flexible work model)
- Hub - 100% work from Simpplr office. Role requires Simpplifier to be in the office full-time.
- Hybrid - Hybrid work from home and office. Role dictates the ability to work from home, plus benefit from in-person collaboration on a regular basis.
- Remote - 100% remote. Role can be done anywhere within your country of hire, as long as the requirements of the role are met.
ATS Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
ML operations, LLM operations, autonomous agent systems, RAG workflows, vector stores, knowledge-graph connectors, Kubernetes, Docker, Terraform, GitOps workflows
Soft skills
mentoring, collaboration
Certifications
cloud architecture, AI security