
Network Operations Centre Engineer
Maxtec
full-time
Location Type: Hybrid
Location: Cape Town • South Africa
About the role
- Monitoring & Observability
  - Using monitoring tools such as Grafana, Datadog, SolarWinds, and Nagios to interpret dashboards, review alerts, and identify abnormal performance patterns or traffic deviations.
  - Correlating real‑time metrics, logs, and telemetry to detect system health concerns and escalate appropriately.
- Networking & Platform Operations
  - Applying a solid understanding of TCP/IP, DNS, HTTP, TLS, load balancing, and CDNs to support troubleshooting of platform issues.
  - Using working knowledge of distributed systems, caching, and messaging components to assist with fault isolation and impact assessment during incidents.
- Incident Management Tooling
  - Using Jira for structured incident tracking, escalation, and resolution workflows.
  - Operating on‑call platforms such as PagerDuty and maintaining knowledge base and runbook documentation for consistent incident response.
- Diagnostics & Troubleshooting
  - Performing first‑pass triage on server health, application performance, API latency, and database connectivity (e.g., SQL reachability, connection pooling).
  - Analysing logs, metrics, and system indicators to narrow down the likely root cause during high‑pressure incidents.
- Scripting & Automation
  - Using basic Bash, Python, or PowerShell scripts for log extraction, parsing, or system checks.
  - Assisting with automating recurring operational tasks to reduce manual effort and improve consistency.
- Cloud & Container Technologies
  - Understanding cloud fundamentals in AWS, Azure, or GCP to support cloud‑based troubleshooting or triage.
  - Drawing on basic exposure to Docker, Kubernetes, or log stacks such as ELK/OpenSearch, Splunk, or Loki/Promtail to help diagnose distributed workloads.
Requirements
- A relevant IT qualification or industry certification, such as CompTIA A+ / Network+ or Cisco CCNA, or equivalent intermediate‑level technical certification.
- 3 to 5 years’ experience in an Operations, NOC, or Incident Management environment with a focus on real‑time monitoring, incident detection, and structured escalation.
- 3 to 5 years’ hands‑on experience using at least one major monitoring platform (e.g., Nagios, SolarWinds, Datadog, Grafana, Zabbix), including alert interpretation and basic correlation.
- 3 to 5 years’ experience using enterprise ITSM tools such as Jira, ServiceNow, or Freshservice.
- Practical exposure to ITIL Incident Management, demonstrated either by ITIL v4 Foundation certification or by documented participation in a structured incident-response workflow in a previous role.
- 3 to 5 years’ experience collaborating with engineering, support, platform, or SRE teams in a 24/7 operational or incident‑driven environment.
- Prior experience in environments with rapid changes, live-system dependencies, or peak‑traffic events (e.g., sports, e‑commerce, gaming, streaming, financial trading).
Benefits
- Supergrowth is real here.
  - Our learning and development programmes give you the tools, training and opportunities to level up fast.
- Your progress matters.
  - Our Performance tool ensures you get meaningful feedback to support your development and superdrive your career.
- Support that has your back.
  - Our Employee Assistance Programme offers resources for you and your family.
- Group Life Cover
- Funeral Fund Benefit
- Income Continuation Benefit
- Medical Aid Subsidy
- Retirement Annuity Subsidy
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
TCP/IP, DNS, HTTP, TLS, load balancing, distributed systems, Bash, Python, PowerShell, cloud fundamentals
Soft Skills
incident detection, structured escalation, collaboration, troubleshooting, diagnostics, problem-solving, communication, adaptability, critical thinking, time management
Certifications
CompTIA A+, CompTIA Network+, Cisco CCNA, ITIL v4 Foundation