Splunk

Manager, SRE, FedRAMP

full-time

Origin: 🇺🇸 United States • Illinois

Salary

💰 $139,840 - $192,280 per year

Job Level

Senior, Lead

Tech Stack

Apache, AWS, Cassandra, Cloud, Distributed Systems, Go, Google Cloud Platform, Grafana, Jenkins, Kafka, Kubernetes, Microservices, MongoDB, Packer, Prometheus, Python, Redis, Spinnaker, Splunk, Terraform, ZooKeeper

About the role

  • Lead a team of engineers for Splunk Cloud Observability in FedRAMP environments.
  • Manage across the organization to deliver quality products.
  • Mentor and grow engineering teams building cloud-based environments for massive-scale data processing.
  • Partner with Talent Acquisition to recruit and hire SRE FedRAMP team members.
  • Manage teams to exceed goals and drive success.
  • Lead reliability projects: HA, BCP, disaster recovery, backup/restore, RTO, RPO, chaos engineering, uptime and performance.
  • Own capacity management and planning, SLIs, SLOs, error budgets, and monitoring dashboards.
  • Deploy and operate large-scale distributed data stores and streaming services.
  • Establish design patterns for monitoring and benchmarking.
  • Document production runbooks and developer guidelines.
  • Implement tooling, toil reduction, runbooks & automation for production.
  • Run incident management and improve MTTD/MTTR.
  • Optimize cloud costs.

Requirements

  • Must-Have:
  • 8+ years of experience operating large-scale cloud-native microservices platforms.
  • 2+ years of hands-on management experience leading teams that deploy, operate, and monitor large-scale Kubernetes clusters in the public cloud, specifically AWS or GCP.
  • Experience with, and leading a team in, infrastructure automation and scripting using Python and/or Golang.
  • Experience managing remote teams.
  • Strong hands-on experience with monitoring tools such as Splunk, Prometheus, Grafana, and the ELK stack to build observability for large-scale microservices deployments.
  • Experience with deployment, operations, and performance management of one or more large-scale clusters such as Cassandra, Kafka, Elasticsearch, MongoDB, ZooKeeper, or Redis.
  • Excellent problem-solving, triaging, and debugging skills in large-scale distributed systems.
  • Preferred:
  • Familiarity working with and/or managing in compliance environments such as HIPAA, GovCloud, State Government, Federal Government, SOC 2, or FedRAMP.
  • AWS Solutions Architect certification preferred.
  • Confluent Certified Administrator for Apache Kafka and/or Apache Cassandra Administrator Associate certifications are preferred.
  • Experience with Infrastructure-as-Code using Terraform, CloudFormation, Google Deployment Manager, Pulumi, Packer, ARM, etc.
  • Experience with CI/CD frameworks and Pipeline-as-Code such as Jenkins, Spinnaker, GitLab, Argo, Artifactory, etc.
  • Proven skills to effectively work across teams and functions to influence the design, operations, and deployment of highly available software.
  • Bachelor's/Master's in Computer Science, Computer Engineering, or related technical field, or equivalent practical experience.