PayNearMe

Site Reliability Engineer

full-time

Location Type: Remote

Location: Remote • California • 🇺🇸 United States

Salary

💰 $175,000 - $195,000 per year

Job Level

Mid-Level, Senior

Tech Stack

AWS, Azure, Cloud, Docker, EC2, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, Ruby, Ruby on Rails, Splunk, Terraform

About the role

  • Design, implement, and maintain scalable and resilient infrastructure using Terraform for infrastructure as code
  • Deploy, manage, and optimize Kubernetes clusters and containerized applications using Docker
  • Develop and maintain comprehensive monitoring and observability solutions using Datadog
  • Define, monitor, and maintain Service Level Objectives (SLOs) and Service Level Agreements (SLAs)
  • Respond to incidents, perform root cause analysis, and implement solutions to prevent recurrence
  • Ensure the reliability and stability of our production environments
  • Develop automation scripts and tools to reduce manual intervention and improve system reliability using Python, Bash, or Go
  • Enhance and maintain continuous integration and continuous deployment pipelines using GitLab CI
  • Assist in capacity planning and ensure that systems are scalable to meet future demands
  • Implement security best practices and ensure compliance with industry standards
  • Work closely with development teams to ensure reliability and scalability of new features and services
  • Participate in an on-call rotation to address production issues and collaborate in incident response efforts

Requirements

  • 3+ years of experience in SRE, DevOps, or a related role
  • Proficient with cloud platforms such as AWS, GCP, or Azure
  • Experience with EC2, RDS, VPCs, and security groups is essential
  • Strong experience with Kubernetes and Docker, including deployment, scaling, and management of containerized applications
  • Expert in using Terraform for infrastructure as code
  • Extensive experience with monitoring and observability tools like Datadog, Prometheus, Grafana, ELK stack, or Splunk
  • Proven ability to define, monitor, and maintain SLOs and SLAs to ensure reliable service delivery
  • Strong skills in scripting languages like Python, Bash, or Go
  • Familiarity with GitLab CI or a similar tool for continuous integration and deployment
  • Experience supporting production environments running Go or Ruby/Rails applications
  • Deep understanding of DevOps principles, practices, and tools to drive continuous improvement in the software development lifecycle
  • Excellent analytical and problem-solving skills to diagnose and resolve complex system issues quickly and effectively
  • Strong organizational skills, attention to detail, and the ability to work collaboratively in a team environment
  • Excellent documentation skills to ensure accurate and detailed records

Benefits

  • 100% Remote (must be in US)
  • Fast-paced and professional work culture
  • Stock options with standard startup vesting: 1-year cliff, 4 years total
  • $50 monthly communication expense stipend to go towards your phone/internet bill
  • $250 stipend to enhance your WFH setup
  • Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
  • Premium medical benefits including vision and dental (100% coverage for employees)
  • Company-sponsored life and disability insurance
  • Paid parental bonding leave
  • Paid sick leave, jury duty, bereavement
  • 401k plan
  • Flexible Time Off (our team members typically take off ~3-4 weeks per year)
  • Volunteer Time Off
  • 13 scheduled holidays
  • Twice-yearly in-person team meet-ups (2-3 days, company-paid)

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
Terraform, Kubernetes, Docker, Python, Bash, Go, GitLab CI, Datadog, Prometheus, Grafana
Soft skills
analytical skills, problem-solving skills, organizational skills, attention to detail, collaboration, documentation skills