Senior Site Reliability Engineer – Government, Sovereign Cloud
Veeam Software
Senior Site Reliability Engineer for Veeam's Government & Sovereign Cloud environments. Building a global SRE function with an emphasis on high availability and operational excellence.
Posted 4/27/2026 • full-time • Remote • California • 🇺🇸 United States • Senior • 💰 $138,900 - $231,400 per year
Tech Stack
Tools & technologies
AWS • Azure • Cloud • Dagger • Distributed Systems • Go • Grafana • Java • JavaScript • Kubernetes • Prometheus • Terraform • TypeScript
About the role
Key responsibilities & impact
- Get up to speed on the full platform: all VDC workloads, dependencies, and risk areas. Much of this will happen through code, docs, and conversations rather than direct environment access.
- Work with SMEs across the org to fill knowledge gaps and build onboarding material for the team.
- Write and maintain runbooks, architecture docs, and operational guides.
- Design infrastructure for high availability and fault tolerance on Azure (including Azure Government).
- Define SLIs, SLOs, and error budgets where none exist today.
- Run incident response and blameless postmortems. Turn incidents into improvements.
- Identify reliability risks across modern and legacy workloads and build practical remediation plans that work within compliance constraints.
- Close observability gaps — define instrumentation requirements and drive implementation.
- Set alerting, telemetry, and monitoring standards with partner teams.
- Build automation to reduce toil and support fleet management.
- Participate in on-call rotations.
- Work with IaC, CI/CD, deployment automation, and config management — including in air-gapped or compliance-restricted environments.
- Build and maintain testing, canary deployment, and release validation pipelines.
- Integrate chaos engineering and monitoring tools, adapting choices to meet regulatory requirements.
- Work across product, platform, security, legal, compliance, and operations teams.
- Own problems end-to-end — identify gaps, drive solutions, don't wait for direction.
- Mentor other engineers and help spread SRE practices across the org.
Requirements
What you’ll need
- 7+ years in Software Engineering, with 3+ years in SRE, Platform Engineering, or similar (across multi-service platforms, not just single-service environments).
- Experience with Government or Sovereign Cloud (e.g., Azure Government, AWS GovCloud).
- Experience in regulated compliance environments — government (FedRAMP, CMMC, IL2/IL4/IL5), financial (PCI-DSS, SOX), or healthcare (HIPAA, HITRUST). You understand how compliance shapes architecture and operations.
- Strong experience building and running production services on cloud infrastructure (Azure preferred, including Azure Government).
- Able to learn large, complex platforms quickly with limited guidance — comfortable building understanding from code, docs, and architecture artifacts when direct environment access is restricted.
- Can investigate systems independently and produce clear docs, risk assessments, and improvement plans.
- Comfortable working across teams — engineering, product, security, compliance, operations.
- Programming skills in one or more of: TypeScript/JS, Go, Java, C#, or similar.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, OpenTelemetry, ELK stack).
- Experience with IaC (Terraform, Terragrunt, Pulumi) and container orchestration (Kubernetes).
- Experience with CI/CD and GitOps tooling — GitHub Actions, Azure DevOps, GitLab CI, ArgoCD, FluxCD, or Dagger.
- Solid grasp of distributed systems, networking, and cloud-native architecture.
- Clear written and verbal communication skills.
Benefits
Comp & perks
- Unlimited paid time off, 12 paid holidays, plus 4 extra global VeeaMe Days for self-care and 24 paid volunteer hours annually through Veeam Cares
- Paid parental leave: 8 weeks for all parents, 16 weeks for birthing parents
- Medical, dental, and vision coverage starting on your first day
- Mental health support, therapy sessions, and digital wellness tools via our Employee Assistance Program
- 401(k) retirement plan with company matching contributions
- Fertility, adoption, and surrogacy support through Maven, plus paid volunteer time
- AirVet: 24/7 virtual veterinary care at no cost
- Legal services, identity protection, and supplemental health insurance options
- Tax-advantaged spending accounts for healthcare, dependent care, and commuting
- Opportunities to learn and grow through on-demand libraries (LinkedIn Learning, O’Reilly), mentoring, workshops, and learning events like our annual Global Day of Learning
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Software Engineering • Site Reliability Engineering (SRE) • Platform Engineering • Cloud Infrastructure • Programming (TypeScript, JavaScript, Go, Java, C#) • Infrastructure as Code (IaC) • Continuous Integration/Continuous Deployment (CI/CD) • Distributed Systems • Monitoring and Observability • Chaos Engineering
Soft Skills
Clear Communication • Mentoring • Problem Solving • Collaboration • Independent Investigation • Documentation • Risk Assessment • Adaptability • Learning Agility • Teamwork
Certifications
FedRAMP • CMMC • PCI-DSS • SOX • HIPAA • HITRUST