Motive

Cloud Ops Engineer

Full-time

Location Type: Hybrid

Location: Islamabad, Pakistan

About the role

  • Own and refine the incident management lifecycle: act as incident commander, run communication and triage, and lead post-incident analysis and follow-ups to drive continuous service improvement.
  • Manage the central on-call solution, used by over 100 teams, and its integrations with monitoring and other platforms, leveraging automation and self-serve tools such as Terraform.
  • Analyze operational statistics (MTTR, incident frequency, service-level data) to identify trends and prioritize reliability initiatives and teams’ focus.
  • Improve change management processes and automation to reduce both risk and friction.
  • Collaborate with engineering teams across the organization to standardize operational practices and develop automated workflows.
  • Leverage AI for incident analysis, alert/issue solutioning, and automation.

Requirements

  • Experience managing and participating in a 24/7 on-call rotation and incident response process.
  • Experience with on-call systems such as Rootly, PagerDuty, or Opsgenie.
  • Experience with monitoring and observability tools (e.g., Datadog, New Relic, Grafana).
  • Ability to communicate clearly and to manage incidents, communications, and action items with stakeholders ranging from engineers to directors, including public-facing messaging.
  • Experience with IT Service Management tools (Jira/JSM) for ticket and change management.
  • 3+ years of experience in an incident response role.

Benefits

  • Creating a diverse and inclusive workplace is one of Motive's core values.
  • We are an equal opportunity employer and welcome people of different backgrounds, experiences, abilities and perspectives.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
incident management, automation, change management, incident response, operational statistics, MTTR, service-level data, AI for incident analysis, automated workflows

Soft Skills
communication, collaboration, stakeholder management