Pendo.io

Site Reliability Engineer

full-time

Location Type: Office

Location: Raleigh • North Carolina • 🇺🇸 United States

Salary

💰 $105,000 - $120,000 per year

Job Level

Mid-Level, Senior

Tech Stack

Ansible, BigQuery, Cloud, Distributed Systems, Go, Google Cloud Platform, Kubernetes, Python, Terraform

About the role

  • Provision and maintain cloud infrastructure from development through production for product initiatives
  • Write high-quality infrastructure-as-code that automates provisioning, deployment, scaling, and monitoring
  • Write maintainable code for product functionality with emphasis on operations, scale, resiliency, and monitoring
  • Provide developers with stable and performant CI and release pipelines and development environments
  • Work with engineers to ensure new services are well-designed, properly monitored, and have well-defined SLIs and achievable SLOs
  • Debug production issues, mitigate quickly, and implement preventative measures
  • Maintain runbooks for manual tasks and replace them with automation whenever possible
  • Proactively track capacity, quotas, and performance limits to plan for growth
  • Participate in a 24x7 on-call rotation to handle product availability issues and urgent customer support escalations
  • Support a high-throughput platform processing more than 15 billion events per day
  • Collaborate with Information Security to ensure that cloud infrastructure security and controls meet compliance goals such as SOC 2

Requirements

  • Experience working with cloud infrastructure using tools such as Ansible or Terraform
  • Programming skills in a language such as Go or Python, and a willingness to learn new languages as needed
  • Ability to think and talk about systems in terms of possible failure modes, bottlenecks, etc.
  • Ability to write clear and concise English-language documentation, such as incident runbooks and release processes
  • Good number sense for discussing performance analysis, cost analysis, and operational metrics
  • Experience designing, analyzing, and troubleshooting distributed systems (preferred)
  • Experience maintaining Kubernetes clusters in a production environment (preferred)
  • Previous experience as a Site Reliability Engineer, DevOps Engineer, or similar role (preferred)
  • Familiarity with Google Cloud Platform technologies: Google Kubernetes Engine (GKE), Memorystore, Cloud Datastore, Pub/Sub, Cloud Functions, BigQuery, Vertex AI
  • Familiarity with third-party services such as Amazon SES

Benefits

  • Our salary ranges are based on paying competitively for our size and industry, and are one part of many compensation, benefits and other reward opportunities we provide.
  • Come join one of the fastest-growing startups, supported by best-in-class institutions like Battery Ventures, Salesforce Ventures, Spark Capital and Meritech.
  • You will gain experience in a diverse and exciting set of technologies and clients and have a real impact on Pendo's future.
  • Our culture is passionate, dynamic, and fun.
  • Pendo is committed to working with, and providing access and reasonable accommodation to, applicants with mental and/or physical disabilities. If you require an accommodation, contact accommodation@pendo.io.
