
Manager, Observability
Intact
full-time
Location Type: Hybrid
Location: Montréal • 🇨🇦 Canada
Job Level
Mid-Level, Senior
Tech Stack
AWS, Azure, Cloud, Google Cloud Platform, Grafana, SDLC
About the role
- Own platform operations and roadmap (Elastic, Dynatrace, Micro Focus, Grafana)
- Manage capacity, cost, performance, and security
- Govern logging, telemetry, tracing, topology, and data lifecycle/quality
- Publish standards and guardrails; ensure compliance via gating and maturity checks
- Align governance with enterprise architecture
- Manage vendor relationships in collaboration with the Director
- Build partnerships across IT, application owners, infrastructure, and SDLC stakeholders
- Coach on instrumentation, alert hygiene, dashboards, tracing, and topology
- Lead the observability community and deliver shared trainings and templates
- Communicate platform health, adoption, coverage, and outcomes
- Enrich signals with app, infra, and network data; apply anomaly detection and AI to reduce noise
- Provide reusable dashboards, alert policies, runbooks, and instrumentation patterns
- Strengthen incident response and major-incident support, and contribute to post-mortems
- Implement enhancements that lower detection time and MTTR
- Automate provisioning, config-as-code, data onboarding, alerting, and visualization
- Embed observability in CI/CD and pre-release checks; promote “observability by default”
- Support SRE goals by enabling SLIs/SLOs/SLAs and improving reporting
- Manage two Agile product teams and a 24/7 Operations Center
- Develop talent and a culture of automation, reliability, and customer service
- Maintain backlog and roadmap; prioritize features and cost; drive continuous improvement and report outcomes
Requirements
- 6+ years in observability/platform engineering, with 2+ years leading ops/platform teams
- Hands-on expertise with observability tooling (Elastic Stack, Dynatrace, Grafana, Micro Focus) and pipelines for logging, metrics, tracing, and topology
- Experience building automation and self-service for observability (IaC, CI/CD, config-as-code) and integrating observability into the SDLC
- Familiarity with multi-cloud (AWS, Azure, GCP) and on-prem environments, including visibility into hybrid infrastructure across Canada
- Background in 24/7 operations and service management
- Strong communication, stakeholder partnership, coaching, and vendor management skills
- Experience with AI/ML anomaly detection and analytics in observability contexts
- Familiarity with SAFe/Agile product management and platform roadmaps
- Exposure to performance engineering and enterprise architecture standards
- Relevant certifications (Cloud: AWS/Azure/GCP; ITIL/Service Management; Observability/SRE/DevOps)
- Bilingual (French and English); the role requires regular interaction with colleagues across the country
Benefits
- Support, opportunities and performance-led financial rewards at a workplace where you can shape the future
- Policies to ensure equal access and participation for people with disabilities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
observability, platform engineering, automation, self-service, IaC, CI/CD, config-as-code, anomaly detection, performance engineering, data lifecycle
Soft skills
communication, stakeholder partnership, coaching, vendor management, continuous improvement, customer service, team leadership, collaboration, training, incident response
Certifications
AWS, Azure, GCP, ITIL, Service Management, Observability, SRE, DevOps