
Architect, Applied Science – AgentForce
Salesforce
Full-time
Location Type: Hybrid
Location: Palo Alto • California • Washington • United States
Salary
💰 $218,400 - $365,200 per year
About the role
- Define the end-to-end architecture for AgentForce’s model serving, inference orchestration, and agentic reasoning loops.
- Make high-stakes technical decisions regarding "build vs. buy," model sizing, context window management, and retrieval-augmented generation (RAG) strategies.
- Architect scalable pipelines for continuous learning (RLHF/RLAIF) that integrate seamlessly with production traffic without compromising latency or stability.
- Design systems for multi-turn agent state management, memory persistence, and tool invocation (function calling).
- Own the end-to-end architectural design of AgentForce AI capabilities from product requirements through model design, system implementation, and production rollout.
- Translate product use cases (e.g., agent experiences, workflows, UI features) into concrete system architectures, including APIs, service contracts, and model interaction patterns.
- Define reference architectures for AI-powered applications (web, backend services, agent runtimes) that standardize how products integrate with AgentForce models.
- Translate abstract research concepts into concrete engineering specifications.
- Collaborate with scientists to optimize models for deployment (quantization, distillation, pruning) without sacrificing reasoning capabilities.
- Mentor Principal Scientists and Staff Engineers on system design principles and architectural patterns.
Requirements
- PhD or Master’s in Computer Science, AI, Machine Learning, or Distributed Systems
- 10+ years of technical experience, with a specific focus on deploying ML models at scale
- Proven experience acting as an Architect or Principal-level technical lead for large-scale AI or data platforms
- Experience designing and building production-grade AI-powered applications or platforms
- Experience defining public/internal APIs, SDKs, and service interfaces for ML/AI capabilities consumed by product teams
- Familiarity with frontend–backend–model interaction patterns for low-latency user-facing AI experiences
- Profound understanding of Transformer architectures, attention mechanisms, and the math behind LLMs (not just API usage)
- Experience with high-performance inference serving (e.g., vLLM, TensorRT-LLM, TGI, Triton) and optimization techniques (quantization, LoRA adapters, paged attention)
- Strong background in designing distributed systems, microservices, and event-driven architectures (Kafka, gRPC, Kubernetes)
- Advanced proficiency in Python; familiarity with C++ or CUDA is a strong plus
Benefits
- time off programs
- medical
- dental
- vision
- mental health support
- paid parental leave
- life and disability insurance
- 401(k)
- employee stock purchase program
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
model serving, inference orchestration, agentic reasoning, continuous learning, multi-turn agent state management, memory persistence, function calling, Transformer architectures, high-performance inference serving, distributed systems
Soft Skills
technical decision making, mentoring, collaboration, system design principles, architectural patterns
Certifications
PhD in Computer Science, Master’s in Computer Science