Identifies and establishes ways of stabilizing environments and sites while assessing opportunities to drive engineering stability through analytics and metrics
Responsible for site design consulting, platform management, and capacity planning
Defines and creates robust dashboards and monitoring systems, and ensures they are implemented
Identifies, coordinates, and implements SLAs and SLOs, backed by robust monitoring, for applications, service sites, and platforms
Leads the analysis of metrics from operating sites and applications to assist in performance tuning and fault finding
Troubleshoots priority incidents and participates in blameless post-mortems
Engages in testing strategy and review of test results, complex incident response, and root cause analysis; resolves underlying issues, drives continuous improvement in the incident management process, and reduces mean time to resolution
Mentors and trains junior team members on best practices for infrastructure management and disaster recovery
Requirements
3+ years of relevant / direct industry experience
Bachelor's degree in Computer Science or related field
Application Development
Business Management
Customer Solutions
Design
Group Problem Solving
Process Improvements
Release Management
Software Solutions
User Experience (UX) Design
Benefits
medical/prescription drug coverage (with a Health Savings Account feature)
dental and vision options
employee and spouse/child life insurance
short and long-term disability protection
401(k) with PNC match
pension and stock purchase plans
dependent care reimbursement account
back-up child/elder care
adoption, surrogacy, and doula reimbursement
educational assistance, including select programs fully paid
a robust wellness program with financial incentives
maternity and/or parental leave
up to 11 paid holidays each year
8 occasional absence days each year, unless otherwise required by law
between 15 and 25 vacation days each year, depending on career level and years of service
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
site design consulting
platform management
capacity planning
dashboard creation
monitoring systems
SLAs
SLOs
performance tuning
incident response
disaster recovery