Solace

Senior Cloud Site Reliability Engineer

Solace

full-time

Origin: 🇨🇦 Canada

Job Level

Senior

Tech Stack

AWS, Azure, Cloud, Go, Google Cloud Platform, Groovy, Kubernetes, Linux, Prometheus, Python, Terraform

About the role

  • Responsible for daily operations of Solace Cloud, the market-leading SaaS offering, across AWS, Azure, GCP, Kubernetes, etc.
  • Ensure Solace Cloud Services are healthy and reliable and SLAs are being met
  • Design and implement infrastructure tooling, observability, and automation
  • Improve production operations to be more efficient and less error-prone
  • Handle production incidents according to industry-standard incident management processes
  • Process customer service requests and provisioning requests
  • Manage customer escalations and drive resolution in mission-critical, high-impact production environments
  • Work directly with customers to identify, troubleshoot, and resolve operational issues
  • Debug Linux and Kubernetes at a system level to detect and resolve operational issues
  • Participate in the on-call rotation and provide 24x7 off-hours support

Requirements

  • Proven expertise with the services and features of public cloud providers (AWS, Azure, GCP)
  • Proven expertise with cloud Kubernetes infrastructure platforms (EKS, AKS, GKE)
  • Hands-on experience with monitoring tools such as Datadog, Kibana, and Prometheus
  • Hands-on experience with infrastructure automation using Terraform and CloudFormation
  • Hands-on expertise in debugging production alerts
  • Expert-level understanding of Linux operating systems
  • Programming experience in languages such as Groovy, Python, and Go
  • Certified Kubernetes Administrator
  • Certified Cloud Administrator (AWS, Azure, or GCP)
  • Expert-level knowledge of cloud networking solutions
  • Expert-level knowledge of handling production incidents in multi-cloud environments
  • Proven ability to manage customer escalations and drive resolution in mission-critical production environments
  • Experience in SaaS operations and customer-facing technical support
  • Participate in the on-call rotation and provide 24x7 off-hours support
  • Strong communicator able to articulate complex technical issues and communicate with customers
  • Ideally 7+ years of work experience in a technical role
  • Must be able to work in or commute to the Ottawa area; the application asks about eligibility to work in Canada

Lucidworks

Senior Managed Services Engineer

Lucidworks
Senior · full-time · $120k–$165k / year · 🇺🇸 United States
Posted: 1 day ago · Source: jobs.lever.co
AWS, Azure, Cloud, Google Cloud Platform, Java, JavaScript, Kubernetes, Python, Spark

Equinix

Head of Engineering – Cloud Platform

Equinix
Lead · full-time · $202k–$304k / year · 🇺🇸 United States
Posted: 14 days ago · Source: equinix.wd1.myworkdayjobs.com
AWS, Azure, Cloud, Google Cloud Platform, Kubernetes, Terraform

Writer

Cloud Platform Engineer

Writer
Senior · Lead · full-time · New York · 🇺🇸 United States
Posted: 12 days ago · Source: jobs.ashbyhq.com
AWS, Azure, Cloud, Distributed Systems, Docker, Go, Google Cloud Platform, Kubernetes, Python, Terraform

Seqera

Technical Solutions Architect

Seqera
Mid · Senior · full-time · California, New York · 🇺🇸 United States
Posted: 10 days ago · Source: jobs.ashbyhq.com
AWS, Azure, Cloud, Docker, Google Cloud Platform, Kubernetes, Python, Terraform

MetroStar

Sr. DevOps Engineer II (6061)

MetroStar
Senior · full-time · $141k–$172k / year · 🇺🇸 United States
Posted: 36 days ago · Source: boards.greenhouse.io
AWS, Azure, Cloud, Drupal, Google Cloud Platform, Jenkins, Kubernetes, Linux, OpenShift, VMware