Deepwatch

Manager, Site Reliability Engineering

The Manager of Site Reliability Engineering at Deepwatch leads a high-caliber SRE team, oversees cloud architecture, and promotes DevOps excellence in a cybersecurity context.

Posted 4/29/2026 • Full-time • Tampa, Florida, United States • Senior, Lead • $178,000 - $213,000 per year

Tech Stack

Tools & technologies
AWS, Cloud, Cyber Security, Docker, Google Cloud Platform, Kubernetes, Python, Terraform

About the role

Key responsibilities & impact
  • Lead and grow the SRE team, setting direction, mentoring and managing engineers, and fostering excellence.
  • Design and manage cloud and containerized infrastructure with IaC (Terraform).
  • Implement robust CI/CD pipelines integrating security and compliance.
  • Build scalable observability systems, leading the definition of SLIs / SLOs and dashboards.
  • Manage incident response, root cause analysis, and postmortems; automate recovery via playbooks/runbooks.
  • Drive capacity planning, performance tuning, and cost efficiency.
  • Collaborate with InfoSec, DevSecOps, and Compliance teams to ensure alignment with frameworks like FedRAMP, NIST, and RMF.
  • Support program-level initiatives, communicating effectively with stakeholders.
  • Promote a culture of reliability, security, and developer efficiency.
  • Maintain an active 'player' role, dedicating approximately 75% of your time to hands-on engineering (design, coding, and architecture) and 25% to leadership, mentorship, and management.

Requirements

What you’ll need
  • 8+ years in SRE, DevOps, or Platform Engineering, with technical leadership experience and readiness to step into management as a player/coach.
  • Proven cloud experience (AWS, GCP) and container orchestration (Kubernetes, Docker).
  • Strong coding/scripting (Python, Go) and proficiency in IaC and GitOps.
  • Deep knowledge of observability tools and defining reliability metrics.
  • Experience in incident handling (PagerDuty, Datadog) and post-incident evaluations.
  • Demonstrated success in mentoring and developing junior/mid-level SRE talent, moving beyond delegation to hands-on technical coaching.
  • Familiarity with regulatory or cybersecurity frameworks (FedRAMP, NIST, STIGs, RMF).
  • Excellent cross-functional communication and stakeholder management.
  • Preferred: certifications such as AWS, CKA, or cyber security credentials (e.g., OSCP).

Benefits

Comp & perks
  • Medical, dental, vision, and disability insurance
  • Flexible Time Off (FTO), 12 company holidays, sick leave, and 8 weeks of paid parental leave
  • Unique professional development benefits, including annual “development dollars” to support our people's growth and development
  • Wellness contests and monthly educational programs
  • 401(K) retirement program

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
cloud infrastructure, container orchestration, IaC, CI/CD pipelines, observability tools, coding, scripting, incident handling, reliability metrics, performance tuning
Soft Skills
leadership, mentoring, communication, stakeholder management, collaboration, coaching, team management, incident response, problem-solving, capacity planning
Certifications
AWS, CKA, OSCP