
Engineering Manager, Infrastructure Engineering
Whatnot
full-time
Location Type: Remote
Location: Poland
About the role
- Lead and mentor a team of highly skilled software engineers, supporting their technical growth, execution, and long-term career development.
- Set technical direction and quality standards for the team while empowering senior ICs to own design and architecture decisions. Roll up your sleeves when needed and course-correct when technical decisions are heading in the wrong direction.
- Develop and execute the strategic roadmap for reliability engineering at Whatnot.
- Build and operationalize best practices that empower product and platform teams to design and run reliable systems, incorporating SLOs, monitoring standards, and incident response patterns into their development workflows.
- Own the strategic roadmap for reliability tooling, including incident response systems, SLO measurement platforms, and developer-facing reliability libraries, while partnering with senior engineers on architecture and design.
- Lead the team in designing and building traffic control systems (distributed semaphores, rate limiting, circuit breaking) as reusable platform components consumed across Whatnot's service fleet.
- Lead the design and execution of load testing at scale, validating resilience against sustained and bursty growth scenarios, and providing tooling that enables teams to contribute new load test scenarios.
- Drive continuous improvement in incident detection and mitigation, including early warning systems and foundational observability instrumentation.
- Collaborate with cross-functional teams to influence product and architectural decisions that improve overall reliability and customer impact.
- Partner with Infrastructure and Engineering leadership to shape reliability strategy and investment priorities across the organization.
- Build a culture of learning and continuous improvement through blameless incident analysis, proactive reliability investment, and systematic reduction of repeated failure patterns.
- Scale the team through hiring, mentorship, leadership development, and thoughtful organizational design.
Requirements
- 10+ years of experience in infrastructure or platform engineering, including 5+ years managing engineering teams.
- You see reliability engineering as a software engineering discipline, not an operations function. You're energized by building tools and systems that scale impact across an engineering organization.
- Proven track record building and operating large-scale distributed systems with strong reliability, observability, and incident response practices.
- Deep technical grounding in one or more of: SLO design, monitoring/alerting, incident tooling, traffic control mechanisms, load and chaos testing, or platform engineering.
- Experience leading teams that ship developer-facing platforms, frameworks, or internal tools, not just operate infrastructure.
- Strong software engineering fundamentals with a passion for improving engineering practices across an organization.
- Demonstrated ability to guide teams through complex system challenges, large-scale migrations, and longer-term reliability initiatives.
- Exceptional communication and leadership skills, with the ability to influence technical direction across teams and organizations.
- A passion for enabling teams to build fast while building safely through well-designed tooling and proactive detection mechanisms.
- Experience leading multiple teams, managing managers, or serving as a site lead is a plus.
Benefits
- Whatnot is proud to be an Equal Opportunity Employer (EOE). We value diversity, and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, parental status, disability status, or any other status protected by local law.