Tech Stack
Tools & technologies
AWS, Cloud, Distributed Systems, Docker, Go, Google Cloud Platform, Java, Kubernetes, Python
About the role
Key responsibilities & impact
- Design, implement, and maintain the tools and systems that support service reliability, monitoring, and alerting.
- Collaborate with other engineering teams to ensure services are designed with reliability in mind, and provide guidance on the appropriate use of tooling and automation.
- Identify opportunities to improve the reliability, scalability, and efficiency of our services and drive their implementation.
- Work with infrastructure engineers to understand the challenges they face in operating our services and develop tools and systems to help them manage these challenges.
- Participate in incident response and post-mortems to identify and address systemic issues.
- Continuously evaluate new technologies and industry best practices to improve our SRE tooling and incident response procedures.
- Gain and maintain an intimate understanding of how the critical parts of the site work (services, infrastructure, product, tools, and processes).
- Lead high-urgency incidents and mentor less-experienced engineers in effectively handling incidents.
Requirements
What you’ll need
- Bachelor's degree in Computer Science or related field.
- 5+ years of experience in software engineering or SRE roles, with a focus on large-scale distributed systems.
- Strong coding skills in at least one programming language, such as Java, Python, or Go.
- Experience with distributed systems and service-oriented architectures.
- Experience with cloud computing platforms such as AWS or Google Cloud Platform.
- Strong conviction in software development best practices, including version control, automated testing, and continuous integration and delivery.
- Experience with containerization technologies such as Docker and Kubernetes.
- Excellent problem-solving and analytical skills, with strong attention to detail.
- Ability to work effectively in a fast-paced and dynamic environment.
- Fluent in English (Professional Level).
Benefits
Comp & perks
- Health insurance
- Paid time off
- Flexible working hours
- Professional development opportunities
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
software engineering, site reliability engineering, distributed systems, service-oriented architectures, Java, Python, Go, cloud computing, AWS, Google Cloud Platform
Soft Skills
problem-solving, analytical skills, attention to detail, mentoring, collaboration, leadership, communication, adaptability, fast-paced environment, incident response
Certifications
Bachelor's degree in Computer Science
