
Cloud Site Reliability Engineer
Solace
full-time
Location Type: Hybrid
Location: Ottawa • Canada
Salary
💰 CA$100,000 - CA$120,000 per year
About the role
- Ensure that Solace Cloud services are healthy and reliable, and that SLAs are being met
- Assist in implementing infrastructure tooling, observability, and automation
- Contribute to making production operations more efficient and less error-prone
- Handle production incidents in enterprise-grade multi-cloud environments, following an industry-standard incident management process
- Process service requests and provisioning requests from customers
- Manage customer escalations and drive resolution in mission-critical, high-impact production environments
- Work directly with customers to identify, troubleshoot, and resolve operational issues
- Apply expert Linux and Kubernetes debugging knowledge to detect operational issues
- Participate in the on-call rotation and provide 24x7 off-hours support
Requirements
- Proven expertise with public cloud providers (AWS, Azure, GCP) services & features
- Proven expertise with managed Kubernetes platforms such as Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE)
- Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
- Hands-on experience with infrastructure automation using Terraform and AWS CloudFormation
- Hands-on expertise in debugging production alerts
- Strong understanding of Linux Operating Systems
- Programming experience in languages such as Groovy, Python, and Go
- Hands-on experience with AI tools and a strong interest in advancing AI capabilities
Benefits
- Balance matters – We believe work should fit into your life, not the other way around.
- Hybrid-first – Flexibility is built into how we work, so everyone feels included and empowered.
- Values-driven – We live and breathe our core values: craftsmanship, trust, courage, freedom, momentum, humility, and human experience.
- Growth mindset – Our training programs are designed to help you level up, fast.
- Customer Obsessed – We’re proud of our world-class customer lineup.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Kubernetes, Linux, Terraform, CloudFormation, Groovy, Python, Go, Monitoring tools, AI tools, Public cloud services
Soft Skills
Incident management, Customer escalation management, Troubleshooting, Operational issue resolution, Production operations efficiency