Climb Channel Solutions NA

Senior Site Reliability Engineer – FedRAMP

full-time

Location Type: Remote

Location: United States

About the role

  • Serve as the primary point of contact for several critical production SaaS applications hosted in Azure, ensuring their availability, performance, and reliability.
  • Maintain and support infrastructure within a FedRAMP High authorized environment, ensuring continuous compliance with NIST 800-53 controls and participating in audit readiness activities.
  • Configure, monitor, troubleshoot, and resolve complex cloud infrastructure and application issues across multiple environments.
  • Ensure critical SLAs are met, which includes participating in an on-call rotation covering weekends and emergencies.
  • Develop and maintain automation solutions for monitoring, alert mitigation, telemetry, log analysis, and incident response.
  • Contribute to security documentation, including system security plans, standard operating procedures, and runbooks.
  • Apply observability best practices to proactively detect and mitigate issues using logging, metrics, tracing, and alerting tools.
  • Partner with engineering, security, and product teams to drive reliability improvements and ensure services are built with SRE principles from the ground up.
  • Lead and contribute to post-incident reviews, identifying root causes and implementing preventive actions.

Requirements

  • 8+ years of relevant experience in Site Reliability Engineering, DevOps, or Cloud Administration.
  • Strong background in integrating, upgrading, securing, and supporting software systems across heterogeneous environments.
  • Proven hands-on experience as a Cloud Administrator with Azure, including microservices on AKS (Azure Kubernetes Service), cloud concepts, and cloud security.
  • Scripting and programming experience: PowerShell, Python, and data and markup formats such as XML, JSON, and YAML.
  • Infrastructure-as-code expertise with Terraform and Azure DevOps pipelines.
  • Knowledge of redundancy, backup, and disaster recovery strategies in cloud environments.
  • Hands-on expertise with monitoring and observability tools such as Datadog, Azure Application Insights, and Log Analytics.
  • Strong understanding of networking fundamentals, including firewalls, VLANs, NAT, NACLs, load balancing, VPN tunnels, DNS, DHCP, and packet filtering.
  • Direct experience operating in FedRAMP environments, with working knowledge of NIST 800-53 controls, continuous monitoring (ConMon) requirements, and boundary protection.

Benefits

  • comprehensive life insurance
  • healthcare insurance
  • pension/retirement matching
  • time off plans
  • paid company holidays
  • meaningful bonus program

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Site Reliability Engineering, DevOps, Cloud Administration, Cloud security, Scripting, PowerShell, Python, Infrastructure-as-code, Terraform, Azure DevOps
Soft Skills
communication, problem-solving, collaboration, leadership, incident response, root cause analysis, audit readiness, proactive detection, preventive actions, reliability improvements