General Dynamics Information Technology

Cloud DevOps Engineer – Senior Reliability Engineer

full-time

Location Type: Remote

Location: United States

Salary

💰 $123,250 - $166,750 per year

About the role

  • Ensure operational stability, availability, performance, and scalability of cloud-hosted systems across production and development environments supporting multiple agile teams
  • Provide real-time monitoring, alerting, incident response, and health checks for infrastructure and applications across all cloud layers (OS, app, DB)
  • Implement and maintain dashboards, visualizations, and reports for system health, event management, and cost optimization using native CSP tools
  • Manage cloud resource thresholds and automate capacity planning, forecasting, and resource optimization strategies
  • Perform incident and event management (SIEM) operations, and support issue diagnosis, resolution, and reporting including RCA documentation
  • Track, document, and report monthly issues, including system performance, stability, ticket volumes, and time-to-resolution metrics
  • Monitor resource utilization (CPU, memory, disk space) across all deployed VMs, containers, and PaaS components
  • Contribute to the implementation of the Enterprise FinOps framework, including forecasting, budget control, and right-sizing analysis
  • Support deployment automation and ensure systems are resilient, repeatable, and scalable via Infrastructure as Code (IaC)
  • Integrate operations with DevSecOps, MLOps, and CI/CD pipelines for seamless deployment and management
  • Execute system health checks daily or at an agreed frequency, and maintain operational runbooks and SOPs

Requirements

  • 5+ years of experience in IT systems engineering, systems development, and systems coding and programming
  • Deep expertise with AWS services, including monitoring, logging, compute, storage, and networking
  • Proficiency in Infrastructure as Code (IaC) tools such as Terraform and AWS CloudFormation
  • Hands-on experience with monitoring and APM tools such as CloudWatch, Datadog, Prometheus, Grafana, and New Relic
  • Solid understanding of incident response, change management, and ITIL-based operational support
  • Familiarity with CI/CD toolchains and automation platforms (Jenkins, GitHub Actions, GitLab, ArgoCD)
  • Strong scripting skills (Python, PowerShell, Bash) for automation and orchestration
  • Advanced experience implementing DevSecOps using GitOps or similar tools
  • Experience developing, testing, and maintaining containerized applications
  • Expert knowledge of source version control, build/release tools and methodologies, CI/CD pipelines, and the software build process for large enterprises that consist of a large number of complex applications
  • Experience with FinOps practices, cost modeling, forecasting, and optimization tools within cloud platforms
  • Understanding of federal compliance and security frameworks (e.g., FedRAMP, NIST, JISF Rev 5)