Provision and maintain cloud infrastructure from development through production for product initiatives
Write high-quality infrastructure-as-code that automates provisioning, deployment, scaling, and monitoring
Write maintainable code for product functionality with emphasis on operations, scale, resiliency, and monitoring
Provide developers with stable and performant CI and release pipelines and development environments
Work with engineers to ensure new services are well designed and properly monitored, with well-defined SLIs and achievable SLOs
Debug production issues, mitigate quickly, and implement preventative measures
Maintain runbooks for manual tasks and replace them with automation whenever possible
Proactively track capacity, quotas, and performance limits to plan for growth
Participate in a 24x7 on-call rotation to handle product availability issues and urgent customer support escalations
Support a high-throughput platform processing more than 15 billion events per day
Collaborate with Information Security to ensure that cloud infrastructure security and controls meet compliance goals such as SOC 2
Requirements
Experience working with cloud infrastructure using tools such as Ansible or Terraform
Programming skills in a language such as Go or Python, and a willingness to learn new languages as needed
Ability to think and talk about systems in terms of possible failure modes, bottlenecks, etc.
Ability to write clear and concise English-language documentation of processes for incident runbooks and release processes
Good number sense for discussing performance analysis, cost analysis, and operational metrics
Experience designing, analyzing, and troubleshooting distributed systems (preferred)
Experience maintaining Kubernetes clusters in a production environment (preferred)
Previous experience as a Site Reliability Engineer, DevOps Engineer, or similar role (preferred)
Familiarity with Google Cloud Platform technologies: Google Kubernetes Engine (GKE), Memorystore, Cloud Datastore, Pub/Sub, Cloud Functions, BigQuery, Vertex AI
Familiarity with third-party services such as Amazon SES
Benefits
Our salary ranges are set to be competitive for our size and industry, and are one part of the compensation, benefits, and other rewards we provide.
Come join one of the fastest-growing startups, supported by best-in-class institutions like Battery Ventures, Salesforce Ventures, Spark Capital and Meritech.
You will gain experience in a diverse and exciting set of technologies and clients and have a real impact on Pendo's future.
Our culture is passionate, dynamic, and fun.
Pendo is committed to working with, and providing access and reasonable accommodation to, applicants with mental and/or physical disabilities. If you require an accommodation, contact accommodation@pendo.io.
ATS Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
infrastructure-as-code, cloud infrastructure, programming in Go, programming in Python, Kubernetes, performance analysis, cost analysis, operational metrics, distributed systems, automation