Principal Observability & Reliability Architect
Thinkahead Consultant Psychologist Pty Ltd
Principal Observability & Reliability Architect at AHEAD, improving operational visibility and service reliability. The role supports digital business transformation and calls for strong client-facing leadership in observability initiatives.
Tech Stack
Tools & technologies: Cloud, ITSM
About the role
Key responsibilities & impact
- Lead client discovery, architecture workshops, and solution design across observability, telemetry, reliability, and operational intelligence initiatives.
- Design enterprise observability architectures spanning monitoring, logging, metrics, tracing, telemetry pipelines, alerting, event correlation, service visibility, and platform integrations.
- Define scalable standards for telemetry onboarding, naming, tagging, RBAC, service ownership, dashboards, alert governance, runbooks, and operational handoff.
- Advise on telemetry governance, including data quality, retention, access control, sampling, cardinality, and cost optimization.
- Lead modernization initiatives including tool rationalization, dashboard and alert rationalization, telemetry strategy, and migration from legacy monitoring platforms.
- Guide SRE practices including SLIs, SLOs, error budgets, production readiness, and incident response maturity.
- Design integration patterns across ITSM, CMDB, event management, and automation platforms.
- Support pursuits by shaping solution strategy, validating scope, informing estimates, and building client-facing technical narratives.
- Serve as a senior escalation point and provide architecture governance during delivery.
- Build reusable reference architectures, playbooks, and accelerators while mentoring architects, consultants, and offshore teams.
Requirements
What you’ll need
- 10+ years in observability, monitoring, APM, platform operations, SRE, or related enterprise technology domains, including 5+ years leading architecture and delivery strategy for enterprise observability or reliability initiatives.
- Deep, hands-on experience designing and implementing solutions across monitoring, logging, metrics, tracing, telemetry collection, and pipeline patterns in hybrid and multi-cloud environments.
- Strong knowledge of telemetry governance, including routing, transformation, normalization, enrichment, retention, access control, and cost management.
- Experience defining enterprise standards for dashboards, alerts, tagging, naming, service ownership, RBAC, and operating model adoption.
- Strong command of incident response, event correlation, alert strategy, service health, and business-service visibility, plus applied SRE concepts including SLIs, SLOs, error budgets, and production readiness.
- Ability to lead executive and technical workshops and translate business needs into actionable architecture and delivery plans.
- Consulting or professional services experience with strong client-facing communication, estimation, risk management, and cross-functional leadership.
Benefits
Comp & perks
- Medical, Dental, and Vision Insurance
- 401(k)
- Paid company holidays
- Paid time off
- Paid parental and caregiver leave
- Plus more! See https://www.aheadbenefits.com/ for additional details.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
observability, monitoring, APM, platform operations, SRE, telemetry collection, logging, metrics, tracing, pipeline patterns
Soft Skills
client-facing communication, cross-functional leadership, risk management, estimation, executive workshop facilitation, technical workshop facilitation, mentoring, solution strategy, architecture governance, delivery strategy