Tech Stack
Java, JavaScript, Node.js, Python, Ruby, Ruby on Rails, Spring, SQL
About the role
- Lead and mentor a team of Site Reliability Engineers, fostering a culture of continuous improvement and innovation.
- Collaborate with cross-functional teams to ensure alignment on reliability and performance goals.
- Conduct reliability reviews to identify areas for improvement and implement solutions to enhance system reliability.
- Implement and promote performance engineering practices to ensure optimal system performance.
- Develop and execute strategies for destructive testing to identify potential points of failure and improve system resilience.
- Oversee production engineering efforts to ensure systems are designed for operational excellence and reliability.
- Provide leadership in incident management and root cause analysis to resolve production issues and prevent recurrence.
- Establish and maintain operational support practices, including monitoring, alerting, and incident response.
- Drive continuous improvement initiatives in reliability, performance, and operational support.
- Stay current with industry trends and best practices to ensure our systems and processes remain cutting-edge.
Requirements
- Must be eighteen years of age or older.
- Must be legally permitted to work in the United States.
- Mastery of an object-oriented programming language (preferably Java)
- Proven experience in reliability reviews, performance engineering, and destructive testing.
- Strong understanding of production engineering and operational support practices.
- Experience with supply chain systems and retail environments is a plus.
- Excellent leadership and team management skills.
- Strong problem-solving and analytical abilities.
- Excellent communication and collaboration skills.
- 6-8 years of relevant work experience
- Mastery of a modern scripting language (preferably Python)
- Mastery of a modern web application framework such as Ruby on Rails, Spring MVC, or Node.js
- Mastery of writing SQL queries against a relational database
- Mastery of modern product development processes and pipelines
- Proficient in effective troubleshooting and issue resolution techniques
- Proficient in effective system monitoring and log analysis techniques
- Capable of understanding complicated systems quickly
- Proficient in guiding more junior team members through software engineering fundamentals in a professional setting
- Proficient in managing and growing team members in a professional setting
- Proficient in balancing workloads across teams
- Experience managing vendor relationships
- Experience translating high-level strategy into tactical execution
Benefits
- Health care benefits
- 401(k)
- ESPP
- Paid time off
- Success sharing bonus
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Java, Python, Ruby on Rails, Spring MVC, Node.js, SQL, performance engineering, destructive testing, production engineering, system monitoring
Soft skills
leadership, team management, problem-solving, analytical abilities, communication, collaboration, mentoring, workload balancing, incident management, root cause analysis