NVIDIA

Network Site Reliability Engineer

full-time

Origin: 🇬🇧 United Kingdom

Job Level

Senior, Lead

Tech Stack

Ansible, Firewalls, Go, Grafana, Linux, Prometheus, Python, SaltStack, ServiceNow, Switching

About the role

  • Owning the operational aspect of the network infrastructure, ensuring its high availability and reliability.
  • Partnering with architecture and deployment teams to guarantee that new implementations are supportable and align with production standards.
  • Advocating for and implementing automation to reduce toil and enhance operational efficiency.
  • Monitoring network performance, identifying areas for improvement, and coordinating with relevant teams to execute enhancements.
  • Collaborating with SMEs to resolve production issues swiftly and effectively, maintaining customer satisfaction.
  • Identifying opportunities for operational improvements and partnering with teams to develop solutions that drive excellence and sustainability in network operations.
  • Minimizing manual labor, achieving Service Level Objectives (SLOs), documenting KB articles for bots, following through on RCAs, and conducting blameless postmortems.
  • Performing hands-on troubleshooting, network automation, observability, and documentation work in pursuit of operational excellence.
  • Mentoring and fostering professional development and growth within the team.

Requirements

  • BS degree in Computer Science, Electrical Engineering, or a related technical field, or equivalent experience.
  • Minimum of 8 years of industry experience in network site reliability engineering, network automation, network operations, or related areas.
  • Experience with both campus and data center networks.
  • Familiarity with network management tools such as Prometheus, Grafana, Alertmanager, Nautobot/NetBox, and BigPanda.
  • Expertise in automating networks using frameworks such as Salt, Ansible, or similar.
  • In-depth experience in Python, Go, or both.
  • Knowledge of network technologies such as TCP/UDP, IPv4/IPv6, wireless, BGP, VPN, L2 switching, firewalls, load balancers, EVPN, VXLAN, and Segment Routing.
  • Proven track record in network operations.
  • Proficiency with ServiceNow and Jira.
  • Knowledge of Linux system fundamentals is a plus.
  • Systematic problem-solving approach, coupled with excellent communication skills and a sense of ownership and drive.
  • Ways to stand out: experience collecting operational signals via SNMP, syslog, and streaming telemetry; debugging and optimizing code; automating routine tasks; experience with Mellanox/Cumulus Linux, Palo Alto firewalls, and NetScaler and F5 load balancers; previous SRE experience.

Fluent Trade Technologies

Senior Network Engineer

Senior · full-time · 🇮🇱 Israel
Posted: 15 days ago · Source: www.comeet.com
Ansible, Chef, ElasticSearch, Firewalls, Grafana, Linux, Logstash, Perl, Puppet, Python, SaltStack, TCP/IP

Long View Systems

Senior Network Consultant

Senior · full-time · $80k–$106k / year · 🇨🇦 Canada
Posted: 9 days ago · Source: jobs.lever.co
Ansible, Cloud, Firewalls, Python, Switching

Blue Mantis

Senior Network Engineer - Toronto Canada

Senior · full-time · 🇺🇸 United States
Posted: 30 days ago · Source: bluemantis.pinpointhq.com
Ansible, Cloud, Firewalls, Python, Switching

Sauce Labs

Junior Network Engineer

Junior · full-time · $60k–$75k / year · North Carolina · 🇺🇸 United States
Posted: 1 day ago · Source: boards.greenhouse.io
Ansible, AWS, Cloud, Firewalls, Google Cloud Platform, iOS, Python, SDLC, TCP/IP

ALTEN

Senior Network Engineer, CCIE, Juniper

Senior · full-time · 🇲🇦 Morocco
Posted: 16 days ago · Source: jobs.smartrecruiters.com
Ansible, Firewalls, Python, Switching, Terraform