Tech Stack
Ansible, Cloud, Grafana, Prometheus, ServiceNow, Splunk, Terraform
About the role
- Translate business and application needs into infrastructure solutions across cloud, on-prem, or hybrid environments
- Contribute to and evolve Permit to Design / Build / Operate processes across critical infrastructure engagements
- Partner with delivery and product teams to embed NFRs (e.g., observability, resiliency, scalability, supportability) into infrastructure patterns
- Build and extend Infrastructure-as-Code (e.g., Terraform, Ansible) and participate in CI/CD pipelines
- Work with SRE teams to define SLOs/SLIs, logging/monitoring requirements, and automated DR validation
- Engage in architecture reviews, provide feedback on operational readiness, and help drive automation-based solutions
- Support major initiatives including cloud migration, network modernization, and resiliency uplift
- Perform problem tracking, diagnosis and root-cause analysis, replication, troubleshooting, and resolution for highly complex issues
- Oversee others who perform programming and debugging activities and mentor less experienced teammates
- Respond to incidents or service tickets in a timely manner and provide technical consultation on extremely challenging situations
- May lead large, complex projects, engage and manage external vendors, and have people management responsibilities
Requirements
- Bachelor's degree and eight years of experience in development or production support, or an equivalent combination of education and work experience
- Deep specialized and/or broad functional knowledge
- Sound understanding of business and organizational strategies and processes
- Ability to interpret internal and external business challenges and recommend best practices
- Ability to lead complex projects
- Sophisticated analytical skills and the ability to solve complex technical and business problems
- Ability to influence others at senior levels to adopt a new perspective
- Preferred: 8+ years of experience in infrastructure engineering, DevOps, or SRE roles
- Strong knowledge of infrastructure domains: compute, storage, networking, virtualization, containerization
- Experience with automation tools such as Terraform, Ansible, GitLab CI/CD, or similar
- Familiarity with non-functional requirements (NFRs) and their impact on system design and operations
- Strong understanding of release management, infrastructure readiness, and run-time reliability
- Experience contributing to operational readiness, including monitoring, support models, and DR
- Proven ability to collaborate across disciplines, lead solution discussions, and mentor junior engineers
- Excellent communication skills and a bias toward action and ownership
- Exposure to frameworks like Permit to Operate, Design Reviews, or Run Readiness models
- Experience working in a regulated industry (e.g., financial services, healthcare, utilities)
- Hands-on experience with Grafana, Prometheus, AppDynamics, Dynatrace, ThousandEyes, Splunk, ServiceNow, or related tools
- Language Fluency: English (Required)
- Work authorization: Truist will not sponsor applicants for work visa status or employment authorization
Benefits
- All regular teammates (not temporary or contingent workers) working 20 hours or more per week are eligible for benefits
- Medical, dental, vision
- Life insurance
- Disability insurance
- Accidental death and dismemberment
- Tax-preferred savings accounts
- 401k plan
- No less than 10 days of vacation (prorated during first year)
- 10 sick days (prorated)
- Paid holidays
- Depending on position/division: eligibility for defined benefit pension plan, restricted stock units, and/or a deferred compensation plan
ATS Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Infrastructure-as-Code, Terraform, Ansible, CI/CD, SLOs, SLIs, automation, troubleshooting, network modernization, cloud migration
Soft skills
analytical skills, problem-solving, leadership, influence, collaboration, communication, mentoring, ownership, action-oriented, feedback