
Senior Cloud Site Reliability Engineer
Solace
full-time
Location Type: Hybrid
Location: Ottawa • Canada
Salary
💰 CA$120,000 - CA$150,000 per year
About the role
- Ensure that Solace Cloud Services are healthy and reliable, and that SLAs are being met
- Design and implement our infrastructure tooling, observability, and automation
- Contribute to making production operations more efficient and less error-prone
- Handle production incidents in production-grade multi-cloud environments, applying expert-level knowledge of industry-standard incident management processes
- Handle service and provisioning requests from customers
- Proven ability to manage customer escalations and drive resolution in mission-critical, high-impact production environments
- Work directly with customers to identify, troubleshoot, and resolve operational issues.
- Apply expert Linux and Kubernetes debugging knowledge to detect operational issues
- Participate in the on-call rotation and provide 24x7 off-hours support
Requirements
- Proven expertise with the services and features of public cloud providers (AWS, Azure, GCP)
- Proven expertise with managed Kubernetes platforms such as Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE)
- Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
- Hands-on experience with infrastructure automation using Terraform and CloudFormation
- Hands-on expertise in debugging production alerts
- Expert-level understanding of Linux Operating Systems
- Programming experience in languages such as Groovy, Python, and Go
- Certified Kubernetes Administrator
- Certified Cloud Administrator (AWS, Azure, or GCP)
Benefits
- Work with brilliance – Our team is packed with some of the sharpest minds in the industry.
- Balance matters – We believe work should fit into your life, not the other way around.
- Hybrid-first – Flexibility is built into how we work, so everyone feels included and empowered.
- Values-driven – We live and breathe our core values: craftsmanship, trust, courage, freedom, momentum, humility, and human experience.
- Growth mindset – Our training programs are designed to help you level up, fast.
- Customer Obsessed – We’re proud of our world-class customer lineup (we’re not shy about it).
- Keep it fun – We’re social, we keep things simple, and we know how to have a good time.
- Creative culture – We’ve got a great sense of humour and we make cool videos on topics like MITT (check them out!).
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Kubernetes, Linux, Terraform, CloudFormation, Groovy, Python, Go, Monitoring tools, Debugging, Incident management
Soft Skills
Customer escalation management, Troubleshooting, Operational issue resolution, Efficiency improvement, Error reduction
Certifications
Certified Kubernetes Administrator, Certified Cloud Administrator