
Staff Infrastructure Software Engineer – AI Platform
Addepar
full-time
Location Type: Remote
Location: Remote • 🇨🇦 Canada
Job Level
Lead
Tech Stack
AWS, Cloud, Go, Grafana, Java, Jenkins, Kubernetes, Microservices, Prometheus, Python, Terraform, Unity
About the role
- Design and build the production runtime for LLM-based agents and products, creating the services and infrastructure that serve autonomous agents.
- Develop deep application-level knowledge to proactively inform and influence requirements, constraints and best practices for implementing composable, complex AI systems.
- Lead the design, implementation, and automation of production infrastructure across a variety of cloud environments (Kubernetes/Databricks) to enable us to ship and scale AI features instantly.
- Evangelize disciplined engineering best practices to build strong production hygiene and culture.
- Initiate and lead collaborations with cross-functional teams to identify and resolve complex application or infrastructure issues, serving as a technical subject matter expert.
- Architect, build, and maintain advanced, automated CI/CD pipelines (e.g., using Jenkins, ArgoCD, AWS CodeBuild/CodePipeline, GitHub Actions, or similar), establishing best practices for deployment strategies (e.g., blue/green, canary).
- Develop systems and best practices for monitoring, alerting on, and troubleshooting our probabilistic, AI-driven systems and broader software stack.
Requirements
- Extensive experience as a Software/Backend Engineer, with a track record of taking on increasing responsibility.
- Experience across the full product lifecycle: designing, implementing, shipping, scaling, operationalizing, and maintaining technology/SaaS products.
- Exceptional programming skills and fundamentals in Python/Go/Java, with a proven track record of building large-scale production systems.
- Proficiency with diverse compute environments, including microservices (K8s), Databricks, and serverless architectures (e.g., AWS Lambda).
- Demonstrable experience leading initiatives with infrastructure-as-code tools such as Terraform in complex, multi-account environments.
- Proficiency with comprehensive monitoring and alerting stacks (e.g., Prometheus/Grafana/Sentry/cloud-native tools), with a focus on observability strategy.
- Excellent interpersonal and communication skills to collaborate effectively with cross-functional teams, articulate complex technical concepts, and influence outcomes.
- Bonus points/nice-to-haves: extensive experience with Databricks (Unity Catalog, Model Serving, Jobs).
- Demonstrable experience writing and contributing to significant systems automation tooling or open-source projects is a strong plus.
- Specific experience with LLMs, agentic systems, and associated technologies such as LangChain, vector databases, or MLflow.
- Exposure to industry practices in financial services or other highly regulated data environments is a plus.
Benefits
- In addition to our core values, Addepar is proud to be an equal opportunity employer. We seek to bring together diverse ideas, experiences, skill sets, perspectives, backgrounds and identities to drive innovative solutions. We commit to promoting a welcoming environment where inclusion and belonging are held as a shared responsibility.
- We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, Go, Java, Kubernetes, Databricks, AWS Lambda, Terraform, CI/CD, Jenkins, ArgoCD
Soft skills
interpersonal skills, communication skills, collaboration, influence, leadership