Senior Site Reliability Engineer

Tango

Senior Site Reliability Engineer at Tango Analytics focusing on cloud platform reliability and scalability in a fully remote role. Collaborate with engineering teams to implement observability and incident management practices.

Posted 4/29/2026 • Full-time • Remote • California • 🇺🇸 United States • Senior • 💰 $150,000 - $180,000 per year

Tech Stack

Tools & technologies

Ansible • AWS • Azure • Cloud • DNS • Docker • Go • Google Cloud Platform • Grafana • Java • Jenkins • Kubernetes • Linux • Prometheus • Python • Splunk • TCP/IP • Terraform

About the role

Key responsibilities & impact

Own reliability outcomes for Tango’s cloud platform (availability, latency, performance, and scalability) across production and non-production environments
Design, implement, and operate SLOs/SLIs, error budgets, and reliability reporting; drive prioritization of reliability work with Engineering and Product
Build and maintain observability foundations: metrics, logging, tracing, dashboards, and alerting that are actionable and reduce noise
Lead incident response and post-incident reviews (blameless RCAs); implement remediation and prevention work to measurably reduce repeat incidents
Engineer and evolve CI/CD and release safety practices (progressive delivery, canary/blue-green, automated rollbacks, change controls)
Improve infrastructure-as-code and environment consistency; standardize and harden platform components
Partner with Security and Compliance to support secure operations, vulnerability remediation, audits, and customer trust requirements
Optimize cloud cost and capacity through right-sizing, autoscaling, and performance tuning; track and report on cost drivers
Enable engineering teams with reliable internal tooling, runbooks, and self-service operational capabilities
Mentor engineers on reliability best practices, operational excellence, and automation

Requirements

What you’ll need

8+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering supporting distributed SaaS applications
Strong background in Linux systems engineering, networking fundamentals (TCP/IP, DNS, load balancing), and troubleshooting in production
Proficiency with at least one programming language used for automation (e.g., Python, Go, or Java) and strong scripting skills
Hands-on experience with cloud infrastructure (AWS, Azure, or GCP)
Deep experience with infrastructure-as-code and configuration management (e.g., Terraform, CloudFormation, Ansible)
Expertise in containerization and orchestration (Docker, Kubernetes) and operating cloud-native services
Strong observability practice with tools such as Prometheus/Grafana, Datadog, New Relic, OpenTelemetry, ELK/Splunk, or equivalent
Demonstrated incident management leadership, root cause analysis, and a continuous-improvement mindset
Experience designing and operating CI/CD pipelines and release management practices (e.g., GitHub Actions, Jenkins, GitLab CI, ArgoCD)
Ability to work cross-functionally with Engineering, Product, Support, and Security; clear written and verbal communication
Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
Relevant certifications are a plus (e.g., AWS/Azure/GCP, Kubernetes CKA/CKAD, ITIL, or security-focused certifications)

Benefits

Comp & perks

Competitive Compensation
Comprehensive benefits, including health, dental, and vision insurance
401(k) plan with company match
Generous paid time off
Flexible Work Environment
Inclusive & Collaborative Culture

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Site Reliability Engineering • DevOps • Production Engineering • Linux systems engineering • Networking fundamentals • Automation programming (Python, Go, Java) • Infrastructure-as-code • Containerization • Orchestration (Docker, Kubernetes) • CI/CD pipelines

Soft Skills

Incident management leadership • Root cause analysis • Continuous improvement mindset • Cross-functional collaboration • Clear communication

Certifications

AWS certification • Azure certification • GCP certification • Kubernetes CKA • Kubernetes CKAD • ITIL certification • Security-focused certifications