Wells Fargo

Lead Systems Operations Engineer, Commercial Corporate & Investment Bank Technology

full-time

Location Type: Office

Location: Charlotte, New Jersey, North Carolina, United States

Salary

💰 $119,000 - $224,000 per year

About the role

  • Embed SRE and production engineering principles into Payments Modernization from design through early life support
  • Define and validate non-functional requirements (NFRs) covering resilience, scalability, observability, recovery, and operability
  • Drive replay, retry, and exception-handling validation for event-driven payment flows
  • Lead capacity and performance testing, including volume growth and peak event scenarios (e.g. FedNow, CHIPS, SWIFT)
  • Own Permit-to-Operate readiness across environments (NFR Testing)
  • Define cutover, shadow support, and early life support models
  • Ensure runbooks, support procedures, on-call readiness, and escalation paths are production-grade before go-live
  • Partner with Change Assurance to apply risk-based release controls, canary/blue-green strategies, and rollback automation
  • Implement end-to-end observability across Kafka, MongoDB, API layers, and downstream payment components
  • Define and monitor SLOs, error budgets, and golden signals
  • Reduce alert noise through signal design, correlation, and automation
  • Analyze early defects and exception patterns (ACK/NACKs, business errors) to drive stabilization
  • Design and execute controlled failure testing (chaos engineering) to validate recovery patterns and blast radius
  • Lead blameless RCAs, ensuring corrective actions are owned and recurrence is prevented
  • Drive continuous service improvement (CSI) initiatives, including automation, resilience uplift, and technical debt reduction

Requirements

  • 5+ years of Systems Engineering or Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 2+ years of application support experience
  • 2+ years of experience in Application Frameworks such as Spring Boot, Spring WebFlux
  • 2+ years of Data Stores & Caching experience with MongoDB, Redis
  • 2+ years of Platform experience with Kubernetes / container orchestration
  • 2+ years of CI/CD & Automation experience, including progressive delivery, automated rollback, and reliability-as-code concepts
  • 2+ years of Resilience experience: Resilience4J, retry/replay patterns
  • 2+ years of Testing & Resilience Validation experience: BlazeMeter, Chaos Monkey
  • 2+ years of Observability experience: distributed tracing, metrics, logging, SLO tooling
  • Strong experience in SRE, Production Engineering, Platform Engineering, or Service Transition within a complex technology or financial services environment
  • Demonstrated ability to productionize new platforms, not just support them
  • Solid understanding of high-value payment systems (Wires, RTP, SWIFT, CHIPS, FedNow) and their operational risk profile
  • Experience working with event-driven, distributed architectures
  • Proven ability to partner with engineering teams while representing the production and operational lens
  • Comfortable operating in early-stage, ambiguous transformation environments
  • Strong communication skills, with the ability to explain technical risk to senior stakeholders

Benefits

  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance, critical illness insurance, and accident insurance
  • Parental leave
  • Critical caregiving leave
  • Discounts and savings
  • Commuter benefits
  • Tuition reimbursement
  • Scholarships for dependent children
  • Adoption reimbursement
