NBCUniversal

Site Reliability Engineer 2

full-time

Origin: 🇺🇸 United States • Illinois, Virginia

Salary

💰 $84,407 - $126,611 per year

Job Level

Junior

Tech Stack

Ansible, AWS, Azure, Cassandra, Cloud, Docker, Go, Google Cloud Platform, Grafana, Hadoop, HDFS, Java, Kafka, Kubernetes, MySQL, NoSQL, Postgres, Prometheus, Python, Scala, Spark, Terraform

About the role

  • Ensure the reliability, scalability, and performance of FreeWheel systems and data platforms.
  • Manage infrastructure, optimize system reliability, automate daily operations, and resolve technical issues impacting upstream/downstream platforms.
  • Design and implement monitoring and alerting systems; join on-call shifts to respond to and resolve issues.
  • Develop and maintain automation tools and scripts for deployment, monitoring, backup and disaster recovery.
  • Analyze and optimize performance of data storage, query performance, and data flows to reduce latency and improve processing speed.
  • Respond quickly to platform failures, perform troubleshooting, and coordinate cross-team efforts to ensure high availability.
  • Work with engineering teams on capacity planning and scaling to handle traffic growth.
  • Support FreeWheel-powered live events.
  • Maintain cloud access management and governance, and enforce compliance practices across cloud environments.
  • Document architecture, configurations, and operational procedures; provide training and knowledge sharing.
  • Ensure platforms meet security standards and compliance requirements.
  • Collaborate with engineering, product, and project management teams to support product design and implementation.

Requirements

  • 1-3 years of experience as an SRE, DevOps Engineer, or Operations Engineer.
  • Experience with cloud platforms (e.g. AWS, OCI, GCP, Azure).
  • Hands-on experience with Terraform and infrastructure as code (IaC) principles.
  • Proficiency in automation tools and frameworks (e.g. Ansible, Terraform, Kubernetes, Docker) for automating system deployment and maintenance.
  • Familiarity with modern data architectures and technologies, including big data platforms (e.g. Kafka, Hadoop, Spark) and distributed storage (e.g. Cassandra, HDFS, AWS S3).
  • Extensive experience in database management (e.g. NoSQL databases, MySQL, PostgreSQL).
  • Programming Skills: Proficient in at least one programming language, such as Python, Go, Java, or Scala, with the ability to write efficient scripts and automation tools.
  • System Monitoring and Log Management: Familiar with monitoring and log management tools such as Prometheus, Grafana, the ELK Stack, or similar.
  • Troubleshooting and Debugging: Strong debugging and troubleshooting skills, with the ability to quickly identify and resolve production issues.
  • Team Collaboration and Communication: Excellent communication skills with the ability to convey technical information clearly and concisely to both technical and non-technical stakeholders.
  • Proactive learner eager to grow in operations and governance.
  • Education: Bachelor’s degree or higher in Computer Science, Software Engineering, or a related field.
  • Relevant Work Experience: 2-5 years (the posting references both 1-3 years and 2-5 years).