Senior / Staff AI Software Engineer

Aegis Ventures

Full-time

Origin: 🇺🇸 United States

Salary

💰 $185,000 - $280,000 per year

Job Level

Senior

Tech Stack

Apache, Apollo, AWS, Cloud, Docker, GraphQL, Kafka, Kubernetes, Microservices, Open Source, Postgres, Python, React, Terraform, TypeScript

About the role

  • Design and develop robust, scalable, event-driven services using Python, FastAPI, Apache Kafka, and GraphQL (see the brief sketch after this list).
  • Build foundational LLM agents and integrate them into our product.
  • Work with DevOps on deployments, monitoring, and reliability improvements.
  • Maintain and optimize PostgreSQL databases and data models.
  • Collaborate across product and engineering teams to define requirements and architect features.
  • Drive engineering best practices through code reviews and mentorship.
  • Engage with current and prospective clients to drive understanding of the Caregentic AI architecture and capabilities.
  • Design, build, and operate LLM services, including RAG systems (LangChain), agentic workflows, and evaluation pipelines (LangSmith, deepeval, A/B testing).
  • Own vector search & embeddings pipelines from schema and metadata design to model benchmarking, cost/latency optimization, and Databricks Vector Search integration.
  • Lead conversational AI development, enhancing NLU policies, safety guardrails, and custom action servers, and integrating assistants with microservices.
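
For candidates unfamiliar with the stack above, here is a minimal, illustrative sketch of the kind of event-driven FastAPI service the first bullet describes. It is not Aegis Ventures code: the topic name, broker address, and `CareEvent` schema are hypothetical, and aiokafka is assumed only as one possible async Kafka client.

```python
import asyncio
import json
from contextlib import asynccontextmanager

from aiokafka import AIOKafkaConsumer  # assumed async Kafka client; any equivalent works
from fastapi import FastAPI
from pydantic import BaseModel

KAFKA_TOPIC = "care.events"          # hypothetical topic name
KAFKA_BOOTSTRAP = "localhost:9092"   # hypothetical broker address


class CareEvent(BaseModel):
    """Illustrative event schema; real payloads are not described in the posting."""
    event_id: str
    kind: str
    payload: dict


async def consume_events(consumer: AIOKafkaConsumer) -> None:
    """Read events off the bus and hand them to downstream handlers."""
    async for msg in consumer:
        event = CareEvent(**json.loads(msg.value))
        # A real service would dispatch to handlers, write to Postgres, emit metrics, etc.
        print(f"handled {event.kind} event {event.event_id}")


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Start the consumer alongside the HTTP API and shut both down together.
    consumer = AIOKafkaConsumer(KAFKA_TOPIC, bootstrap_servers=KAFKA_BOOTSTRAP)
    await consumer.start()
    task = asyncio.create_task(consume_events(consumer))
    try:
        yield
    finally:
        task.cancel()
        await consumer.stop()


app = FastAPI(lifespan=lifespan)


@app.get("/healthz")
async def healthz() -> dict:
    """Liveness probe for Kubernetes-style deployments."""
    return {"status": "ok"}
```

In a production service of this kind, the consumer task would dispatch events to handlers, persist state to PostgreSQL, and expose a GraphQL API rather than a bare health check.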

Requirements

  • 7+ years of backend development experience in production environments, with strong Python skills including async programming and type hints.
  • Experience building and monitoring production-quality ML and AI systems.
  • Hands-on expertise with RAG frameworks and agentic workflows.
  • Solid understanding of PostgreSQL database design and optimization.
  • Familiarity with Docker and containerization.
  • Strong testing practices using pytest.
  • Experience with microservice architectures is preferred.
  • Experience with GraphQL APIs.
  • Experience with event-driven systems and message queues.
  • Experience with major cloud providers (e.g., AWS).
  • Shipped production LLM systems: ideally RAG architectures, agent/tool use, and prompt/system design, with LangChain (tracing/evals via LangSmith), embeddings, and vector databases (Databricks Vector Search preferred). Deep expertise in retrieval quality, including chunking, metadata, hybrid search, reranking, and grounding (a short framework-agnostic sketch follows this list).
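
As a loose illustration of the retrieval-quality and pytest points above, the sketch below is framework-agnostic (no LangChain, LangSmith, or Databricks APIs are used); the chunking parameters, scoring, function names, and tests are hypothetical examples, not part of the posting.

```python
import math


def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Naive fixed-size chunking with overlap (illustrative only)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Rank chunk ids by similarity to the query embedding."""
    ranked = sorted(chunk_vecs, key=lambda cid: cosine(query_vec, chunk_vecs[cid]), reverse=True)
    return ranked[:k]


# pytest collects these automatically (run with `pytest this_file.py`).
def test_chunks_respect_size_and_cover_text():
    chunks = chunk("a" * 500, size=200, overlap=40)
    assert all(len(c) <= 200 for c in chunks)
    assert sum(len(c) for c in chunks) >= 500  # overlap means total length >= input length


def test_top_k_prefers_more_similar_vectors():
    vecs = {"relevant": [1.0, 0.0], "irrelevant": [0.0, 1.0]}
    assert top_k([0.9, 0.1], vecs, k=1) == ["relevant"]
```

A production pipeline would replace the toy cosine ranking with a vector database plus hybrid search and reranking, which is the retrieval-quality work the last requirement describes.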