Site Reliability Engineer

Compass Education

Full-time

Posted on: 2/27/2026

Location Type: Hybrid

Location: Hawthorn • Australia

Job Level

Mid-Level, Senior

Tech Stack

AWS Cloud

About the role

**What you'll do:**
**Infrastructure & Automation**
- Operate and improve our cloud infrastructure to ensure systems remain stable, scalable and secure as usage grows.
- Strengthen environment consistency and deployment safety through improved configuration and automation.
- Reduce operational toil by automating repetitive processes and improving tooling.
**Observability & Monitoring**
- Build and refine monitoring, alerting and logging to detect issues early and reduce customer impact.
- Improve dashboards and production visibility for Engineering squads.
- Raise the bar for observability before services reach production.
**Production & Incident Management**
- Participate in on-call and respond to incidents in a structured, calm manner.
- Lead lower-complexity incidents end-to-end and support higher-impact events.
- Contribute to post-incident reviews and implement systemic improvements.
**Reliability, Resilience & Risk**
- Contribute to improving service reliability targets and reducing repeat incidents.
- Support capacity planning, performance optimisation and disaster recovery readiness.
- Identify operational and security risks and contribute to preventative controls.

Requirements

**About You**
You’re a pragmatic, systems-minded engineer who stays calm under pressure and takes ownership of keeping production environments stable, secure and continuously improving.
You bring:
- 3-4+ years’ experience in Site Reliability, Platform Engineering, DevOps or similar roles, with a strong focus on production systems and operational excellence.
- Experience supporting live production environments, including participation in on-call rotations and incident response. You understand what it means to own systems that customers rely on daily.
- Confidence debugging and resolving issues under pressure, using structured problem-solving to diagnose root causes and restore service quickly.
- Experience working with cloud infrastructure (e.g. AWS or similar), including managing environments that support scalable, customer-facing applications.
- Familiarity with containerised environments and orchestration tools, and how they impact deployment, scaling and service reliability.
- Experience contributing to infrastructure management and automation, helping create consistent, repeatable environments.
- Familiarity with monitoring and alerting platforms, and an understanding of how strong observability improves reliability outcomes.
- Scripting or automation capability, with the ability to reduce manual processes and improve operational efficiency.

Benefits

**What’s in it for you?**
You’ll join a purpose-driven company at a genuinely exciting stage of growth, with the opportunity to make a real impact on education at scale.
What we offer:
- A hybrid working environment, with teams spending three days a week in our Melbourne office.
- Learning and development opportunities, including a dedicated PD budget.
- 24/7 access to our Employee Assistance Program (EAP), including face-to-face, phone and live chat support.
- A parental leave program for both primary and secondary carers.
- A supportive, inclusive culture where your voice is valued and heard.
- The chance to grow alongside a fast-moving, ambitious organisation.
