Navy Federal Credit Union

Platform Engineer – Site Reliability Engineering, SRE

The Platform Engineer builds and manages scalable platforms for application development at Navy Federal Credit Union. The role requires expertise in CI/CD, container technologies, and cloud infrastructure.

  • Posted: 5/20/2026
  • Employment type: Full-time
  • Location: Vienna • Florida, Virginia • 🇺🇸 United States
  • Level: Mid-Level, Senior
  • Salary: 💰 $78,400 - $123,200 per year

Tech Stack

Tools & technologies
Ansible, AWS, Azure, Cloud, Docker, Flux, Go, Google Cloud Platform, Grafana, Kubernetes, Node.js, OpenShift, Prometheus, Python, Splunk, Terraform, Vault

About the role

Key responsibilities & impact
  • Manage and automate OCP (OpenShift Container Platform) and ARO (Azure Red Hat OpenShift) cluster upgrades, ensuring alignment with upstream releases and zero-downtime rolling updates
  • Configure tools like OADP (OpenShift API for Data Protection) to handle backup, restore, and failover across on-prem OCP and ARO regions
  • Monitor compute, storage, and networking capacity to prevent bottlenecks in hybrid Kubernetes environments
  • Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) specifically for cluster control planes and underlying cloud resources
  • Use and tune monitoring tools (e.g., Prometheus, Grafana) to surface actionable alerts and track core signals like latency, traffic, errors, and saturation
  • Participate in on-call rotations to triage cluster-level issues (e.g., node evictions, etcd latency, certificate expirations)
  • Manage continuous deployment pipelines using ArgoCD or Flux to ensure declarative, reproducible configurations for all cluster workloads
  • Abstract underlying Kubernetes complexities through self-service portals, standardized namespaces, and automated CI/CD guardrails
  • Consult with development teams to resolve deployment failures, routing, and pod-crash issues
  • Enforce Role-Based Access Control (RBAC) across environments, integrating with Active Directory or Azure Entra ID
  • Implement platform-level security guardrails using OpenShift policies and Open Policy Agent (OPA) to automatically detect configuration drift or vulnerabilities
  • Integrate enterprise vaults or Azure Key Vault seamlessly with native OpenShift secrets
  • Treat the platform as software by managing ARO clusters and underlying Azure resources using Terraform or Bicep
  • Write automation scripts (Python, Go, or Ansible) to streamline common Day 2 configuration tasks like deploying custom operators, storage classes (e.g., ODF), and monitoring add-ons
  • Collaborate with development, security, and operations teams to improve platform reliability and developer experience using iterative Agile processes and practices
  • Create and maintain technical documentation, standards, and operational procedures using Docs as Code
  • Support continuous improvement initiatives focused on scalability, resiliency, and automation of both on-prem and cloud environments

Requirements

What you’ll need
  • Bachelor’s degree in Computer Science, Information Technology, Engineering, or equivalent experience
  • 5+ years of experience in platform engineering, DevOps, Site Reliability Engineering (SRE), or related roles
  • Hands-on experience administering and supporting OpenShift or Kubernetes environments
  • Experience implementing GitOps workflows using Argo CD or Flux
  • Strong experience building CI/CD pipelines with Tekton and GitHub Actions
  • Experience with container technologies such as Docker and Kubernetes
  • Proficiency with Infrastructure as Code tools such as Terraform, Ansible, or similar
  • Strong scripting skills in Bash, Python, or similar languages
  • Experience with cloud platforms such as Amazon Web Services (AWS), Azure, or Google Cloud Platform
  • Familiarity with monitoring and observability tools such as Prometheus, Grafana, ELK, or Splunk
  • Strong problem-solving, communication, and collaboration skills

Benefits

Comp & perks
  • FORTUNE 100 Best Companies to Work For® 2025
  • Yello and WayUp Top 100 Internship Programs
  • Computerworld® Best Places to Work in IT
  • Newsweek Most Loved Workplaces
  • 2025 PEOPLE® Companies That Care
  • Newsweek Most Trustworthy Companies in America
  • Military Times 2025 Best for Vets Employers
  • Best Companies for Latinos to Work for 2025
  • Forbes® 2025 America’s Best Large Employers
  • Forbes® 2025 America’s Best Employers for New Grads
  • Forbes® 2025 America’s Best Employers for Tech Workers
  • 2025 RippleMatch Campus Forward Award Winner for Overall Excellence
  • Military.com Top Military Spouse Employers 2025
  • 2025 Handshake Early Talent Award

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
OpenShift, Kubernetes, Terraform, Ansible, Python, Go, Bash, GitOps, CI/CD, Docker
Soft Skills
problem-solving, communication, collaboration