Sophos

Cloud Operations Engineer

full-time

Location Type: Remote

Location: Remote • 🇮🇳 India

Job Level

Junior, Mid-Level

Tech Stack

AWS, Azure, Cloud, Distributed Systems, DNS, Firewalls, Grafana, ITSM, Jenkins, Linux, Python, Shell Scripting, TCP/IP

About the role

  • Ensure the continuous availability, performance, and reliability of cloud-hosted and on-premises applications and infrastructure through 24x7 support operations
  • Proactively monitor critical systems, swiftly identify and resolve incidents, and escalate issues to appropriate teams
  • Participate in a rotational on-call schedule to provide continuous operational support and rapid incident response for cloud-hosted applications and infrastructure
  • Perform real-time monitoring of infrastructure, platforms, and applications to identify anomalies, performance degradation, or service disruptions
  • Serve as the first line of defense for incident management by promptly acknowledging alerts, triaging issues, and executing documented runbooks
  • Escalate unresolved or critical issues to appropriate support or engineering teams
  • Act as the central point of contact for incident updates, ensuring clear, timely, and accurate communication with stakeholders
  • Work closely with application support, DevOps, infrastructure, and network teams to troubleshoot, resolve, and prevent operational issues
  • Participate in Root Cause Analysis (RCA) processes following major incidents and contribute to developing preventive measures and service improvement plans
  • Follow and maintain standard operating procedures (SOPs), change management policies, and compliance requirements
  • Identify and proactively report potential risks, configuration issues, or performance bottlenecks
  • Maintain accurate documentation of systems, procedures, and incident logs and contribute to knowledge base articles

Requirements

  • Proficiency in managing and troubleshooting services across at least one major cloud provider like AWS or Microsoft Azure
  • Familiarity with core cloud services (Compute, Storage, Networking, IAM, Monitoring, Auto Scaling, etc.)
  • Hands-on experience with enterprise-grade monitoring tools such as Grafana and CloudWatch
  • Ability to configure alerts, dashboards, and automated health checks
  • Strong knowledge of ITIL principles and experience with ITSM tools such as PagerDuty and Jira
  • Understanding of incident triage, escalation procedures, service restoration, and Root Cause Analysis (RCA)
  • Working knowledge of Linux and Windows operating systems in a cloud or hybrid environment
  • Familiarity with system administration tasks, shell scripting, and log analysis
  • Ability to create and maintain basic scripts using Bash, Python, or PowerShell to automate operational tasks and monitoring functions
  • Understanding of CI/CD pipelines, deployment processes, and integration with cloud environments
  • Exposure to tools like Git and Jenkins CI/CD is a plus
  • Basic understanding of TCP/IP, DNS, VPN, firewalls, load balancers, and cloud networking concepts (VPCs, NSGs, Subnets)
  • Familiarity with identity and access management (IAM) and security best practices in a cloud environment
  • Experience working with centralized logging solutions (e.g., AWS CloudWatch or Azure Log Analytics)
  • Ability to trace incidents and correlate logs across distributed systems
  • Strong habit of maintaining accurate operational documentation and runbooks

Good to have

  • Solid understanding of cloud-native monitoring and alerting platforms, backed by a sound foundation in cloud technology
  • 1 to 2 years of hands-on experience with cloud computing, networking, storage, and database systems, preferably on AWS
  • Fundamental grasp of scripting tools such as Python, Bash, and PowerShell, with the ability to automate tasks for efficiency
  • Certifications such as RHCSA/RHCE, AWS Certified Solutions Architect – Associate, or Six Sigma would be an advantage

Benefits

  • Sophos operates a remote-first working model
  • Our people: we innovate and create, always with a great sense of fun and team spirit
  • Employee-led diversity and inclusion networks that build community and provide education and advocacy
  • Annual charity and fundraising initiatives and volunteer days for employees to support local communities
  • Global employee sustainability initiatives to reduce our environmental footprint
  • Global fitness and trivia competitions to keep our bodies and minds sharp
  • Global wellbeing days for employees to relax and recharge
  • Monthly wellbeing webinars and training to support employee health and wellbeing

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
cloud computing, incident management, monitoring, scripting, Linux, Windows, CI/CD, networking, log analysis, Root Cause Analysis
Soft skills
communication, problem-solving, collaboration, documentation, incident triage, escalation procedures, risk identification, service restoration, proactive monitoring, stakeholder engagement
Certifications
RHCSA, RHCE, AWS Certified Solutions Architect, Six Sigma