Salary
💰 $118,485 - $144,000 per year
Tech Stack
Ansible, AWS, Cloud, Cyber Security, Linux, Microservices, Python, Terraform
About the role
- Design, implement, and maintain systems with high availability, fault tolerance, and disaster recovery capabilities.
- Define and monitor Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets to balance reliability with innovation.
- Develop observability solutions (logging, monitoring, tracing, alerting) to proactively detect anomalies and mitigate risks.
- Lead incident response efforts, perform root cause analysis, and conduct blameless postmortems to drive continuous improvement.
- Automate system deployments, configuration management, and operational tasks to reduce manual intervention and human error.
- Build self-healing and auto-scaling solutions that adapt to mission demands while maintaining compliance with DoD cybersecurity requirements.
- Implement, validate, and maintain cybersecurity controls aligned with DoD 8140/8570, RMF, and NIST 800-53 standards.
- Perform vulnerability assessments, patch management, and system hardening to safeguard mission systems against evolving threats.
- Partner with software engineering, DevSecOps, and infrastructure teams to integrate reliability and cybersecurity into the development lifecycle.
- Support subcontractor and vendor evaluations, ensuring compliance with reliability, security, and DoD standards.
- Analyze system failure data, usage patterns, and mission performance metrics to identify trends and recommend improvements.
- Contribute to process optimization initiatives, quality improvements, and the adoption of new reliability and security technologies.
- Ensure all contractual deliverables are met or exceeded to customer satisfaction.
- Complete personal PDP and attend Staff Meeting and Storytime (with camera on).
- Build productive and positive professional relationships with clients within the program.
- Execute all contract requirements in accordance with contract-specific LCAT and requirements.
- Perform other related duties as assigned.
Requirements
- Clearance: Secret Clearance
- Education and Years of Experience: Bachelor's degree (or equivalent) with 8-10 years of experience, or a Master's degree with 6-8 years of experience.
- Demonstrated experience in site reliability engineering, systems engineering, or DevSecOps in secure or defense environments.
- Strong knowledge of system observability, monitoring, and incident response practices.
- Familiarity with cloud environments (AWS, DoD IL environments) and container orchestration platforms (AWS ECS).
- Proficiency in automation tools (Ansible, Terraform, CI/CD pipelines) and scripting languages (Python, Bash, PowerShell).
- Understanding of RMF, NIST SP 800-53, DISA STIGs, and related DoD cybersecurity frameworks.
- Security+ certification or equivalent.