Lead Systems Operations Engineer
Wells Fargo. Lead complex, broad-impact initiatives, including providing high-level systems consultation for the technology teams.
Tech Stack
Tools & technologies: Grafana, Linux, OpenShift, Prometheus, Python, Redis, Splunk
About the role
Key responsibilities & impact
- Lead complex, broad impact initiatives including provision of high-level systems consultation for the technology teams
- Lead day‑to‑day platform operations (Redis, OpenShift), including cluster maintenance, upgrades, performance monitoring, and troubleshooting
- Improve operations practices to meet the new incident SLA, and improve practices during incident and problem management
- Serve as an operational lead during incidents, driving rapid diagnosis, resolution, root‑cause analysis, and long‑term corrective actions
- Develop or enhance automation (Python, Bash, GitOps workflows, or AI‑assisted tools), including building AI agents, MCP servers and tools, and MCP skills, to eliminate manual effort and streamline run processes
- Lead Platform lifecycle activities, including new cluster builds, configuration, onboarding, upgrades, and cluster decommissioning, ensuring consistency, reliability, and compliance across environments
- Partner with engineering, SRE, security, and development teams to implement repeatable operational patterns, guardrails, and platform readiness standards
- Ensure platform operations follow organizational policies, security standards, audit controls, and regulatory requirements
- Identify operational gaps, recurring issues, or inefficiencies and lead initiatives to enhance reliability, resiliency, and operational maturity.
Requirements
What you’ll need
- 5+ years of Systems Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 5+ years of hands-on experience in Python for platform operations automation
- 5+ years of designing and building complex observability solutions leveraging industry-standard toolsets and/or custom-built solutions
- Strong proficiency in writing production-quality Python code by using Python libraries and client integrations
- Ability to develop automation solutions, including remediation procedures & workflows and operational tools using Python
- 3+ years of experience managing complex, enterprise-scale applications in production environments
- Extensive experience with configuration and monitoring tools such as Grafana, Splunk, and Prometheus
- Deep platform expertise, including cluster build-outs, CI/CD pipeline integration, troubleshooting, debugging, remediation, patching, upgrades, and root cause analysis (RCA)
- 2+ years of hands-on Linux system administration experience
Benefits
Comp & perks
- Participation in on-call rotations
ATS (Applicant Tracking System) Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Bash, GitOps, AI-assisted tools, MCP server, observability solutions, production-quality code, Linux system administration, troubleshooting, root cause analysis
Soft Skills
leadership, problem management, communication, collaboration, operational maturity