Tech Stack
AWS, Azure, Cloud, Go, Google Cloud Platform, Groovy, Kubernetes, Linux, Prometheus, Python, Terraform
About the role
- Responsible for daily operations of Solace Cloud, the market-leading SaaS offering, across AWS, Azure, GCP, Kubernetes, etc.
- Ensure Solace Cloud Services are healthy and reliable and SLAs are being met
- Design and implement infrastructure tooling, observability, and automation
- Improve production operations to be more efficient and less error-prone
- Handle production incidents according to industry-standard incident management processes
- Process customer service requests and provisioning requests
- Manage customer escalations and drive resolution in mission-critical, high-impact production environments
- Work directly with customers to identify, troubleshoot, and resolve operational issues
- Debug Linux and Kubernetes at a system level to detect and resolve operational issues
- Participate in the on-call rotation and provide 24x7 off-hours support
Requirements
- Proven expertise with the services and features of public cloud providers (AWS, Azure, GCP)
- Proven expertise with cloud Kubernetes infrastructure platforms (EKS, AKS, GKE)
- Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
- Hands-on experience with infrastructure automation using Terraform or CloudFormation
- Hands-on expertise in debugging production alerts
- Expert-level understanding of Linux operating systems
- Programming experience in languages such as Groovy, Python, and Go
- Certified Kubernetes Administrator
- Certified Cloud Administrator (AWS, Azure, or GCP)
- Expert-level knowledge of cloud networking solutions
- Expert-level knowledge in handling production incidents in multi-cloud environments
- Proven ability to manage customer escalations and drive resolution in mission-critical production environments
- Experience in SaaS operations and customer-facing technical support
- Willingness to participate in the on-call rotation and provide 24x7 off-hours support
- Strong communicator able to articulate complex technical issues and communicate with customers
- Ideally 7+ years of work experience in a technical role
- Must be able to work in or commute to the Ottawa area; the application asks about eligibility to work in Canada