Lead Business Analyst – Alert Management, Observability Standards
Astreya
Business Analyst IV role responsible for alert management and observability standards in IT operations. Rationalizes system alerts and ensures alignment with operational coverage models and service reliability goals.
Posted 5/26/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior • 💰 $98,040 - $154,800 per year
Tech Stack
Tools & technologies: Azure, Grafana, Prometheus, ServiceNow, Splunk
About the role
Key responsibilities & impact
- Provide solutions that help attain business outcomes.
- Rationalize and govern all system alerts so they align with department priorities, operational coverage models, and service reliability goals.
- Define alerting standards, review and approve alerts before they are routed to the 24x7 Eyes-on-Glass Operations team, and establish a scalable approach to cataloging alert response instructions (runbooks/playbooks).
- Operate at the intersection of the IT Operations Command Center (OCC), engineering/application teams, platform/monitoring tool owners, and service owners, ensuring alerts are actionable, prioritized, and paired with clear response guidance.
- Establish and maintain a department-wide alert rationalization framework that evaluates alerts for: business/service criticality and operational priority, actionability, signal-to-noise, and ownership.
- Perform regular alert reviews to ensure alert quality, correct routing, and alignment with operational coverage.
- Lead continuous improvement efforts to reduce alert fatigue while preserving detection of true incidents and high-impact degradation.
- Define and enforce alerting standards.
Requirements
What you’ll need
- 5+ years in IT Operations, SRE, Observability, Monitoring Engineering, or Incident Management
- Demonstrated success reducing noise and improving actionability across enterprise alerting ecosystems
- Experience with common monitoring/observability tools (e.g., Splunk, AppDynamics, Dynatrace, Datadog, Prometheus/Grafana, Azure Monitor, CloudWatch, ServiceNow Event Mgmt or similar)
- Strong understanding of:
  - Incident response workflows and operational coverage models (24x7 vs. business hours)
  - CMDB/service ownership concepts and dependency mapping
  - Standard operating procedures/runbooks and knowledge management
- Excellent stakeholder management and ability to drive standards across teams.
Benefits
Comp & perks
- Medical provided through UHC (PPO, HSA, Surest options) / Medical provided through Kaiser (HMO option only) for California employees only
- Dental provided through UHC Nationwide
- Vision provided by UHC
- Flexible Spending Account for Health & Dependent Care
- Pre-Tax Account for Commuter Benefit/Parking & Transit (location-specific)
- Continuing Education and Professional Development via various integrated platforms, e.g. Udemy and Coursera
- Corporate Wellness Program provided by Goomi Group
- Employee Assistance Program
- Wellness Days
- 401k Plan
- Basic and Supplemental Life Insurance
- Short Term & Long Term Disability
- Critical Illness, Critical Hospital, and Voluntary Accident Insurance
- Tuition Reimbursement (available 6 months after start date, capped)
- Paid Time Off (accrued and prorated, maximum of 120 hours annually)
- Paid Holidays
- Any other statutory leaves, paid time, or other ancillary benefits required under state and federal law
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
IT Operations, SRE, Observability, Monitoring Engineering, Incident Management, alert rationalization, alert standards, incident response workflows, dependency mapping, knowledge management
Soft Skills
stakeholder management, continuous improvement, leadership, communication, organizational skills