HHAeXchange

Site Reliability Architect

full-time

Location Type: Remote

Location: United States

Salary

💰 $170,000 - $185,000 per year

About the role

  • Architect self-healing, fault-tolerant systems with resiliency by design, focusing on proactive readiness rather than reactive correction.
  • Operate within a secure, high-volume, high-volatility application environment, using advanced networking and compute structures in cloud-hosted environments (AWS/GCP).
  • Move the organization from "firefighting" to a proactive culture through habits and systems supporting feature flagging, production readiness reviews, architectural decision records, and chaos engineering.
  • Support the incident management practice, mentoring SREs and software engineers alike in using our monitoring and observability toolsets for effective troubleshooting.
  • Define SLIs, SLOs, and error budgets that balance feature velocity with platform stability, supporting a shift to service ownership.
  • Promote an automation-first approach, using Terraform, CDK, and other cloud-formation infrastructure-as-code toolsets to ensure repeatable, audit-ready environments.

Requirements

  • Bachelor's or Master's degree in Computer Science, Information Systems, or related field and applicable experience.
  • 10+ years in SRE/DevOps, with 4 of those years in an enterprise SaaS environment.
  • 4+ years in software development contributing to a SaaS-based, cloud-hosted product line.
  • Proven track record in a distributed SaaS environment managing multi-cloud or multi-region workloads.
  • Proficiency in modern cloud networking, including DNS, TCP/IP, Load Balancing, and Zero Trust security models.
  • Strong coding skills in Go, Python, Java, C#, or others, to build internal reliability tools and automate complex operational workflows.
  • Expert-level knowledge of Kubernetes (EKS/GKE) architecture, including multi-cluster management and stateful workloads.
  • Ability to optimize cloud spend while maintaining high performance and reliability.
  • Experience operating in a DevSecOps context with compliance guardrails (e.g., GDPR, HIPAA, HITRUST) across varied infrastructures.
  • Willingness to explore and adopt AI tools responsibly to enhance productivity and innovation in your role.

Benefits

  • Competitive health plans
  • Paid time off
  • Company-paid holidays
  • 401(k) retirement program with a company-elected match
  • Other company-sponsored programs