
Manager, Network DevOps
Vultr
full-time
Location Type: Remote
Location: Remote • 🇺🇸 United States
Salary
💰 $140,000 - $150,000 per year
Job Level
Mid-Level, Senior
Tech Stack
Ansible, Cloud, Distributed Systems, Go, Grafana, Kafka, Linux, Prometheus, Python, Rust, Switching
About the role
- Own the NetDevOps roadmap — spanning automation, observability, configuration validation, telemetry ingestion, and operational tooling for the global network.
- Manage and grow a high-performing team of NetDevOps Engineers, providing technical guidance, career development, and hands-on mentorship.
- Drive automation for complex environments, including EVPN-VXLAN data center fabrics, RoCEv2 lossless Ethernet, and global WAN/edge infrastructure.
- Build and evolve operator tooling for Network Operations (Tier 1/2), including event correlation, intent validation, playbooks, and automated remediation workflows.
- Ensure operational excellence across fleet-wide updates, config management, CI/CD pipelines, and reliability metrics for automation systems.
- Partner closely with Cloud Networking (who own front-end networking, VPC automation, dataplane behavior) to unify automation interfaces and ensure clean separation of responsibilities.
- Collaborate with Architecture, Platform, and GPU/AI Engineering on next-generation fabric design, automation hooks, observability, and provisioning flows.
- Standardize telemetry ingestion and correlation pipelines (gNMI, Kafka, Prometheus, custom collectors) to generate actionable, real-time insights into network behavior.
- Lead complex investigations across routing, switching, RDMA transport behavior, congestion, ECMP, and overlay/underlay interactions, especially where tooling or automation must evolve.
- Define engineering standards, SLIs/SLOs for automation services, and operational maturity goals (testing, documentation, failure modes).
Requirements
- Strong experience building and leading high-performing engineering teams (NetDevOps, SRE, automation, or network engineering groups).
- Deep understanding of modern data center networking: EVPN-VXLAN, BGP, QoS, telemetry, and config automation.
- Familiarity with RoCEv2/RDMA fabrics, PFC/ECN tuning, congestion management, or GPU/AI fabric operations.
- Hands-on experience with automation ecosystems: Ansible, Python, Go, Rust, CI/CD pipelines, config linting, and intent validation frameworks.
- Experience integrating automation with a Source-of-Truth (NetBox, Nautobot, OpsMill, homegrown systems).
- Strong understanding of telemetry and monitoring stacks (Prometheus/Grafana, Kafka, OpenTelemetry, custom collectors).
- Ability to dive deep into Linux networking internals, namespaces, netlink, and distributed systems behavior.
- Proven experience delivering reliable automation services at scale, with strong fundamentals in testing, versioning, rollback, and change management.
Benefits
- 100% company-paid insurance premiums for employee medical, dental and vision plans.
- 401(k) plan that matches 100% up to 4%, with immediate vesting
- Professional Development Reimbursement of $2,500 each year
- 11 Holidays + Paid Time Off Accrual + Rollover Plan
- Commitment matters to Vultr! Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
- $500 stipend for remote office setup in first year + $400 each following year
- Internet reimbursement up to $75 per month
- Gym membership reimbursement up to $50 per month
- Company-paid Wellable subscription
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
NetDevOps, automation, observability, configuration validation, telemetry ingestion, EVPN-VXLAN, RoCEv2, CI/CD, Ansible, Python
Soft skills
team leadership, technical guidance, career development, mentorship, collaboration, investigation, operational excellence, standardization, communication, problem-solving