Lead Kafka Platform Engineer
Wells Fargo
Lead Platform Engineer at Wells Fargo, specializing in scalable Kafka platforms on OpenShift. Oversees the design, deployment, and optimization of event streaming solutions.
Posted 6/17/2026 • full-time • Irving • New Jersey, Texas • 🇺🇸 United States • Senior • 💰 $119,000 - $224,000 per year
Tech Stack
Tools & technologies: Apache, Distributed Systems, DNS, Kafka, Kubernetes, Node.js, OpenShift, Zookeeper
About the role
Key responsibilities & impact
- Lead complex initiatives to design and deliver Confluent Kafka platforms on OpenShift (OCP), enabling scalable, resilient, and secure event streaming solutions for enterprise applications
- Design, build, deploy, and maintain Kafka infrastructure on OCP using operator-based frameworks, supporting components such as brokers, KRaft controllers, Schema Registry, Kafka Connect, Flink, and Control Center
- Drive continuous improvement and modernization efforts, including platform upgrades, automation, and performance optimization across Kafka and OCP environments
- Evaluate and integrate Kafka ecosystem tools and OCP-native capabilities, ensuring alignment with enterprise architecture standards and target-state platform design
- Develop and maintain automation frameworks using CI/CD pipelines and Infrastructure-as-Code to standardize Kafka cluster provisioning, configuration, and lifecycle management
- Architect and implement high availability and disaster recovery solutions, including cross-data center deployments, replication strategies, and cluster linking for multi-region resilience
- Define and enforce platform governance standards, including naming conventions, topic management, schema governance, security policies, and data isolation strategies
- Analyze and resolve high-impact incidents, performing root cause analysis and implementing corrective actions to improve reliability and prevent recurrence
- Make key technical decisions on Kafka architecture and OCP deployment models, including cluster topology, storage integration, networking, and workload placement
- Establish and manage operational risk and control processes, ensuring compliance with enterprise security, regulatory, and audit requirements
- Optimize platform performance and cost through capacity planning, resource utilization tuning, and workload distribution strategies
- Collaborate with application teams to support onboarding, migration, and adoption of Kafka, enabling self-service capabilities and best practices
- Partner with internal platform teams (OCP, networking, security) and external vendors to drive platform delivery, resolve issues, and influence roadmap priorities
Requirements
What you’ll need
- 5+ years of experience in distributed systems, platform engineering, or infrastructure engineering, with a strong focus on event streaming platforms
- Hands-on experience with Apache Kafka / Confluent Platform, including brokers, KRaft/Zookeeper, Schema Registry, Kafka Connect, and related ecosystem components
- Strong experience working with container platforms such as OpenShift (OCP) or Kubernetes, including operators, namespaces, networking, and storage integration
- Proven experience designing and operating highly available, multi-node or multi-data-center distributed systems, including replication, fault tolerance, and disaster recovery strategies
- Experience with automation and Infrastructure-as-Code, including CI/CD pipelines, configuration management tools, and declarative deployment models
- Solid understanding of networking, DNS, load balancing, and secure service exposure in containerized environments
- Strong knowledge of security practices, including TLS/mTLS, authentication, authorization (RBAC), and certificate lifecycle management in enterprise environments
- Experience with observability and monitoring tools, including metrics, logging, alerting, and performance tuning of distributed platforms
- Proven ability to troubleshoot complex platform issues, perform root cause analysis, and implement long-term solutions
- Strong understanding of capacity planning, performance optimization, and resource utilization for large-scale platforms
- Experience working in regulated enterprise environments, with knowledge of risk management, controls, and compliance requirements
- Excellent collaboration and communication skills, with the ability to work across engineering teams, infrastructure teams, and external vendors
Benefits
Comp & perks
- Health benefits
- 401(k) Plan
- Paid time off
- Disability benefits
- Life insurance, critical illness insurance, and accident insurance
- Parental leave
- Critical caregiving leave
- Discounts and savings
- Commuter benefits
- Tuition reimbursement
- Scholarships for dependent children
- Adoption reimbursement
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Apache Kafka, Confluent Platform, OpenShift, Kubernetes, Infrastructure-as-Code, CI/CD pipelines, disaster recovery, performance optimization, networking, security practices
Soft Skills
collaboration, communication, troubleshooting, root cause analysis, capacity planning, performance tuning, risk management, controls, compliance, decision making