PANTHERx Rare Pharmacy

Site Reliability Engineer

full-time

Location Type: Hybrid

Location: Pittsburgh • Pennsylvania • 🇺🇸 United States

Job Level

Mid-Level, Senior

Tech Stack

Azure, Cloud, Prometheus, Python, Splunk

About the role

  • The Site Reliability Engineer (SRE) will lead the implementation and management of observability, monitoring, and reliability practices across our hybrid infrastructure.
  • This role requires hands-on expertise with Datadog or similar observability platforms, strong Azure administration skills, and a deep understanding of incident response and system performance.
  • The SRE will work closely with Infrastructure, Support, and Application teams to ensure high availability and operational excellence across on-prem and cloud environments.
  • Designs, implements, and manages observability solutions using Datadog or equivalent platforms.
  • Develops and maintains monitoring dashboards, alerts, and telemetry pipelines for critical systems.
  • Leads incident response efforts, including root cause analysis and postmortem documentation.
  • Collaborates with Infrastructure and Application teams to improve system reliability and performance.
  • Supports Azure administration tasks including resource monitoring, performance tuning, and cost optimization.
  • Defines and enforces best practices for system health, uptime, and scalability.
  • Contributes to automation of operational tasks and reliability improvements.
  • Documents observability standards, incident workflows, and operational runbooks.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, or equivalent.
  • Minimum of five (5) years of experience in Site Reliability Engineering, Infrastructure Monitoring, or DevOps.
  • Proficiency with Datadog or similar observability platforms (e.g., Prometheus, New Relic, Splunk).
  • Strong Azure administration experience including monitoring, resource management, and automation.
  • Solid understanding of on-prem infrastructure and hybrid cloud environments.
  • Experience with incident response, RCA, and operational documentation.
  • Strong scripting skills (e.g., PowerShell, Python) for automation and integration.
  • Excellent communication and collaboration skills across technical teams.

Benefits

  • Hybrid, remote and flexible on-site work schedules are available, based on the position.
  • Excellent benefits package, including but not limited to medical, dental, vision, health savings and flexible spending accounts
  • 401(k) with employer matching
  • Employer-paid life insurance and short/long term disability coverage
  • Employee Assistance Program
  • Generous paid time off for all full-time employees
  • Limited paid time off for part-time employees
  • Paid holidays
