Site Reliability Engineer

PayNearMe

full-time

Posted on: 11/27/2025

Location Type: Remote

Location: Remote • California • 🇺🇸 United States

💰 $175,000 - $195,000 per year

Mid-Level • Senior

AWS • Azure • Cloud • Docker • EC2 • Go • Google Cloud Platform • Grafana • Kubernetes • Prometheus • Python • Ruby • Ruby on Rails • Splunk • Terraform

About the role

Design, implement, and maintain scalable and resilient infrastructure using Terraform for infrastructure as code
Deploy, manage, and optimize Kubernetes clusters and containerized applications using Docker
Develop and maintain comprehensive monitoring and observability solutions using Datadog
Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs)
Respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence
Ensure the reliability and stability of our production environments
Develop automation scripts and tools to reduce manual intervention and improve system reliability using Python, Bash, or Go
Enhance and maintain continuous integration and continuous deployment pipelines using GitLab CI
Assist in capacity planning and ensure that systems are scalable to meet future demands
Implement security best practices and ensure compliance with industry standards
Work closely with development teams to ensure reliability and scalability of new features and services
Participate in an on-call rotation to address production issues and collaborate in incident response efforts

Requirements

3+ years of experience in SRE, DevOps, or a related role
Proficient with cloud platforms such as AWS, GCP, or Azure
Experience with EC2, RDS, VPCs, and security groups is essential
Strong experience with Kubernetes and Docker, including deployment, scaling, and management of containerized applications
Expert in using Terraform for infrastructure as code
Extensive experience with monitoring and observability tools like Datadog, Prometheus, Grafana, ELK stack, or Splunk
Proven ability to define, monitor, and maintain SLOs and SLAs to ensure reliable service delivery
Strong skills in scripting languages like Python, Bash, or Go
Familiarity with GitLab CI or a similar tool for continuous integration and deployment
Experience supporting production environments running Go or Ruby/Rails applications
Deep understanding of DevOps principles, practices, and tools to drive continuous improvement in the software development lifecycle
Excellent analytical and problem-solving skills to diagnose and resolve complex system issues quickly and effectively
Strong organizational skills, attention to detail, and the ability to work collaboratively in a team environment
Excellent documentation skills to ensure accurate and detailed records

Benefits

100% Remote (must be in US)
Fast-paced and professional work culture
Stock options with standard startup vesting: 1-year cliff, 4 years total
$50 monthly communication expense stipend to go towards your phone/internet bill
$250 stipend to enhance your WFH setup
Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
Premium medical benefits including vision and dental (100% coverage for employees)
Company-sponsored life and disability insurance
Paid parental bonding leave
Paid sick leave, jury duty, bereavement
401k plan
Flexible Time Off (our team members typically take off ~3-4 weeks per year)
Volunteer Time Off
13 scheduled holidays
2x / year in-person team meet-ups (2-3 days, company paid)
