Recorded Future

Senior Site Reliability Engineer

full-time

Location Type: Office

Location: Boston • Massachusetts • 🇺🇸 United States

Job Level

Senior

Tech Stack

Apache, AWS, Chef, Distributed Systems, Elasticsearch, Go, Grafana, Kafka, Kubernetes, Linux, Logstash, Microservices, MongoDB, NoSQL, Prometheus, Python, RabbitMQ, Terraform

About the role

  • Ensure the reliability, scalability, and performance of critical systems and infrastructure.
  • Build and maintain robust infrastructure on AWS, implementing automation and Infrastructure as Code.
  • Design, implement, and maintain scalable and reliable infrastructure on AWS.
  • Develop and manage observability solutions using Grafana, ELK (Elasticsearch, Logstash, Kibana), and Prometheus to monitor system health and performance.
  • Automate infrastructure provisioning and configuration using Terraform and Chef.
  • Participate in a 24/7 on-call rotation to respond to and resolve production incidents.
  • Collaborate with engineering teams to ensure applications are designed for high availability and resilience.
  • Proactively identify and address performance bottlenecks and potential issues.
  • Drive continuous improvement through automation, process optimization, and post-incident reviews.
  • Work closely with development teams to build and maintain robust infrastructure and foster a culture of operational excellence.

Requirements

  • 2+ years of experience in a Site Reliability Engineer, DevOps Engineer, or similar role.
  • Extensive hands-on experience with Amazon Web Services (AWS), including a deep understanding of networking concepts within AWS.
  • Ability to grasp complex architectures and perform multi-step troubleshooting.
  • Advanced Linux skills (engineering fundamentals, networking, storage, operating systems).
  • Development experience with Go or Python.
  • Exposure to managing and optimizing observability suites (e.g., Grafana, ELK Stack).
  • Strong proficiency in Terraform and Chef.
  • A strong preference for automating tasks and implementing solutions via Infrastructure as Code rather than manual changes.
  • Spectacular collaborator and communicator.
  • A team player, but self-motivated.
  • Knowledge of and experience with Kubernetes (preferred).
  • Familiarity with message brokers such as RabbitMQ and Apache Kafka (preferred).
  • Experience with NoSQL databases, particularly MongoDB and Elasticsearch (preferred).
  • Familiarity with OpenTelemetry (preferred).
  • Experience with large distributed systems and microservices architecture (preferred).
  • Experience with CI/CD pipelines (preferred).

ATS Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
AWS, Terraform, Chef, Go, Python, Linux, Grafana, ELK, Prometheus, Kubernetes
Soft skills
collaboration, communication, self-motivation, team player, problem-solving, process optimization, continuous improvement, troubleshooting, operational excellence, resilience