Senior Site Reliability Engineer
AuthZed
Site Reliability Engineer responsible for maintaining systems reliability and performance at AuthZed. Collaborate globally while developing scalable infrastructure solutions for a cutting-edge authorization platform.
Tech Stack
Tools & technologies
Cloud, Docker, Go, Grafana, Java, Kubernetes, Node.js, Prometheus, Python, Ruby, SQL, Terraform
About the role
Key responsibilities & impact
- Design, implement, and maintain highly available and scalable infrastructure solutions for our projects, products, and customers.
- Monitor and analyze system performance, identifying and resolving bottlenecks and issues to ensure optimal performance and reliability.
- Automate infrastructure deployment and configuration management processes.
- Continuously improve system reliability, security, and efficiency through proactive monitoring, capacity planning, and performance tuning.
- Troubleshoot and resolve complex infrastructure and application issues in production and test environments.
- Collaborate with software engineering teams to design and implement systems that are resilient, scalable, and secure.
- Participate in on-call rotation and respond to production incidents in a timely manner.
- Document system configurations, troubleshooting procedures, and operational guidelines.
Requirements
What you’ll need
- Proven experience as a Site Reliability Engineer or in a similar role.
- Strong understanding of networking, operating systems, and cloud infrastructure.
- Experience with Site Reliability Engineering, System Design, and Distributed Computing.
- Experience in various programming languages — we currently have SDKs for NodeJS, Java, Python, Ruby, and Go.
- Experience with containerization technologies such as Docker and Kubernetes.
- Knowledge of infrastructure-as-code tools like Terraform and Pulumi.
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
- Experience with lower-level implementation details of relational databases (bonus if you have experience with distributed SQL databases like Google Cloud Spanner or CockroachDB).
- Experience working with Git and GitHub.
- Experience with continuous integration and deployment systems.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and collaboration abilities.
Benefits
Comp & perks
- Competitive salary based on experience
- Stock options at an early-stage startup
- Comprehensive benefits including healthcare (US-based) and other insurance
- A fully remote, flexible schedule to accommodate different time zones
- Twice-yearly travel for team offsites focused on team bonding, collaboration, and having fun!
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Site Reliability Engineering, System Design, Distributed Computing, NodeJS, Java, Python, Ruby, Go, Docker, Kubernetes
Soft Skills
problem-solving, troubleshooting, communication, collaboration