Lead our world-class Software-Defined Networking team.
Lead both internal and customer-facing projects to expand the capabilities of our multi-tenant, high-performance software-defined network.
Work closely with peer managers in the Networking, Control Plane, and HPC Architecture teams to set our future vision for software-defined networking.
Ensure our customers have the best possible experience that meets their performance, feature, and reliability requirements.
Manage both operational and development workloads, upholding rigorous SLAs while also investing in automation and platform improvements to accelerate future growth.
Own and assist with product-focused projects and strategies that keep Lambda at the cutting edge of GPU hosting, making us the best place to run any GPU, ML, or AI workload.
Hire, grow, and retain top-tier engineers, focusing on both systems reliability engineering and software engineering.
Shape a culture of sustainable, empathetic, and high-velocity engineering, with a deep focus on cross-team collaboration, documentation, and data-driven decision-making.
Requirements
6+ years in full-time engineering management roles at a hyperscaler or cloud provider, a networking solutions provider, a technology company dependent on on-prem data-center networking, or a networking software company, where you led networking or networking-adjacent teams.
10+ years of industry experience in software engineering, with a focus on networking deployment, distributed systems engineering, and/or software-defined networking.
Proficiency in the design and implementation of networking control planes and data planes; the development and tuning of traffic engineering, routing protocols (e.g., BGP, OSPF), VPNs, load balancers, and distributed firewalls; and low-level Linux networking, network namespaces, iptables, eBPF, and DPDK.
Proven record of leading and building engineering teams that work on mission-critical, high-performance networking infrastructure and distributed-systems orchestration.
Demonstrated operational excellence in running production-grade networking infrastructure with 99.99%+ availability SLAs, including defining SLIs and SLOs, managing incidents under high-pressure scenarios, conducting postmortems and root cause analysis, and implementing observability pipelines (Prometheus, Grafana, ELK, etc.).
Experience deploying and operating next-generation networking technologies in: high-performance computing (HPC) and AI datacenters; edge environments with strict latency and jitter constraints; and private cloud stacks such as OpenStack Neutron, Open vSwitch (OVS), and Open Virtual Network (OVN), or other software-defined networking stacks such as Nutanix or VMware NSX.
Exceptional leadership skills that encompass leading by trust, building empathy with your reports and other teams, and maintaining a sustainable but rapid velocity.
Strong customer-facing skills, including pre-sales, general support, and incident management.
Expertise with Kubernetes and container networking stacks, including CNI plugins (Calico, Cilium, Flannel), service mesh implementations (Istio, Linkerd), ingress controllers, multi-tenant network policies, and network security enforcement.
Demonstrated expertise in managing long-term projects alongside urgent, short-term priorities and incident resolution.
Extensive experience collaborating with product, sales, and other engineering teams to build cohesive products with a focus on user experience and reliability.
Benefits
Health, dental, and vision coverage for you and your dependents
Wellness and commuter stipends for select roles
401(k) plan with 2% company match (USA employees)
Flexible Paid Time Off Plan that we all actually use