SRE/DevOps Engineer
Versana
SRE/DevOps Engineer at Versana, improving cloud observability and efficiency in loan market technology. The role collaborates with other teams to enhance system reliability and monitoring practices.
Tech Stack
Tools & technologies: AWS, Azure, Cloud, Docker, Elasticsearch, Google Cloud Platform, Grafana, Jenkins, Kafka, Kubernetes, Linux, Terraform
About the role
Key responsibilities & impact
- Design, implement, and enhance system observability and monitoring tools.
- Monitor system performance, create incident response plans, and implement observability practices to gain insights into system behavior.
- Implement and monitor service-level objectives (SLOs) and indicators.
- Improve system reliability and resiliency.
- Conduct post-incident reviews and implement necessary changes to prevent system failures.
- Assist teams in implementing observability tools and leveraging available telemetry data to troubleshoot and resolve incidents and problems.
- Leverage observability and event management to improve key incident management metrics, such as mean time to detect and mean time to restore services.
- Continually optimize systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability.
- Collaborate with developers to ensure applications are designed with DevOps best practices in mind.
- Participate in a rotating on-call schedule for weekend releases and be available to respond to production issues outside regular working hours, including weekends and holidays.
Requirements
What you’ll need
- 5+ years of experience as a Site Reliability Engineer or in a similar role.
- 3+ years of work experience with public cloud (Azure, AWS or GCP).
- 3+ years of direct experience with observability tools such as Datadog, Elasticsearch, or Grafana Labs.
- 3+ years of experience with containerization and orchestration technologies like Docker and Kubernetes.
- 2+ years of experience developing and managing CI/CD pipelines (e.g., Azure DevOps, GitLab CI/CD, GitHub Actions, Jenkins).
- 2+ years of experience with infrastructure-as-code tools such as Terraform, Azure Bicep, or CloudFormation.
- 1+ year of experience with site reliability tools such as Gremlin, Chaos Mesh, or similar.
- Proven track record leveraging core observability concepts, end-user monitoring, and infrastructure monitoring with SaaS solutions.
- Experience with messaging services like Kafka or Azure Event Hubs.
- Good understanding of the Linux operating system.
Benefits
Comp & perks
- Equal Opportunity Employer
- Health insurance
- Professional development opportunities
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
system observability, monitoring tools, service-level objectives, system reliability, incident response plans, CI/CD pipelines, infrastructure-as-code, containerization, orchestration technologies, end-user monitoring
Soft Skills
collaboration, troubleshooting, incident management, optimization, communication