Solace

Cloud Site Reliability Engineer

Full-time

Location Type: Hybrid

Location: Ottawa, Canada

Salary

💰 CA$100,000 - CA$120,000 per year

About the role

  • Ensure that Solace Cloud services are healthy and reliable, and that SLAs are met
  • Assist in implementing infrastructure tooling, observability, and automation
  • Contribute to making production operations more efficient and less error-prone
  • Handle production incidents in enterprise-grade multi-cloud environments according to an industry-standard incident management process
  • Handle service requests and provisioning requests from customers
  • Manage customer escalations and drive resolution in mission-critical, high-impact production environments
  • Work directly with customers to identify, troubleshoot, and resolve operational issues
  • Apply expert debugging knowledge of Linux and Kubernetes to detect operational issues
  • Participate in the on-call rotation and provide 24x7 off-hours support

Requirements

  • Proven expertise with the services and features of public cloud providers (AWS, Azure, GCP)
  • Proven expertise with managed Kubernetes platforms such as Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE)
  • Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
  • Hands-on experience with infrastructure automation using Terraform and CloudFormation
  • Hands-on expertise in debugging production alerts
  • Strong understanding of Linux operating systems
  • Programming experience in languages such as Groovy, Python, and Go
  • Hands-on experience with AI tools and a strong interest in advancing AI capabilities

Benefits

  • Balance matters – We believe work should fit into your life, not the other way around.
  • Hybrid-first – Flexibility is built into how we work, so everyone feels included and empowered.
  • Values-driven – We live and breathe our core values: craftsmanship, trust, courage, freedom, momentum, humility, and human experience.
  • Growth mindset – Our training programs are designed to help you level up, fast.
  • Customer-obsessed – We’re proud of our world-class customer lineup.