Kohl's

Senior Reliability Engineer

The Senior Reliability Engineer ensures the resilience and availability of Kohl's systems, collaborating with development teams and implementing robust monitoring and failover mechanisms.

Posted 6/4/2026 • Full-time • Remote • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies
AWS, Azure, Cloud, Go, Google Cloud Platform, Grafana, Java, JavaScript, Node.js, Prometheus, Python

About the role

Key responsibilities & impact
  • Ensure the resilience and availability of Kohl’s systems and applications
  • Collaborate closely with development teams
  • Contribute to architectural designs
  • Conduct risk assessments and design for failure
  • Implement robust monitoring and failover mechanisms
  • Drive error budget and Service Level Objective (SLO) adoption across products
  • Drive incident response efforts, perform root cause analysis and implement preventative measures to enhance system reliability
  • Establish consistent practices that elevate Kohl’s operational excellence through automation and process improvements
  • Follow the software lifecycle and drive reliability, observability, and efficiency across product teams within an assigned domain
  • Identify repeated toil and find opportunities for automation and risk reduction
  • Participate in an on-call rotation to respond to production incidents, and conduct blameless retrospectives and root-cause analyses (RCAs) to drive a culture of continuous improvement
  • Proactively identify failures before they cause outages, using chaos engineering and related techniques such as edge-case analysis, failure-mode analysis, and design reviews
  • Advise on capacity planning and provide continuous assessments of system behavior and resource consumption
  • Work with product managers to identify and prioritize work on reliability best practices (e.g., leveraging SLIs, SLOs, and error budgets)
  • Mentor and assist engineers on the team

Requirements

What you’ll need
  • Bachelor's degree or equivalent in MIS, Computer Science, or a related field
  • 4+ years of experience in software development
  • Strong programming skills in one or more languages (Java, Python, Go or Node.js)
  • In-depth knowledge of systems architecture, operating system internals and network fundamentals
  • In-depth knowledge of application design patterns, event-driven architecture, database schemas, and testing strategies
  • Experience with multi-region application troubleshooting and performance tuning
  • Working experience with at least one cloud platform (GCP, AWS, or Azure)
  • Working experience with monitoring techniques and tools (e.g., CloudWatch, Grafana, Prometheus, OpenTelemetry, distributed tracing)

Benefits

Comp & perks
  • Health insurance
  • Retirement plans
  • Paid time off
  • Flexible work arrangements
  • Professional development

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Java, Python, Go, Node.js, systems architecture, operating system internals, network fundamentals, application design patterns, event-driven architecture, database schemas
Soft Skills
collaboration, mentoring, incident response, root cause analysis, process improvement, automation, capacity planning, continuous improvement, communication, leadership
Certifications
Bachelor's Degree in MIS, Bachelor's Degree in Computer Science