AlphaSense

Staff Site Reliability Engineer

AlphaSense is hiring a Staff Site Reliability Engineer to enhance the reliability, performance, and scalability of its systems, lead SRE practices, and mentor engineers on a global team.

Posted 5/13/2026 • Full-time • Remote • 🇮🇳 India • Lead

Tech Stack

Tools & technologies
AWS, Azure, Cloud, DNS, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, TCP/IP

About the role

Key responsibilities & impact
  • Architect Reliability Paved Paths: Build frameworks and self-service tooling that let teams own the reliability of their services in a 'You Build It, You Run It' culture
  • Lead AI-Driven Reliability: Drive our AIOps strategy — automating diagnostics, remediation, and proactive failure prevention
  • Champion Reliability Culture: Embed SRE practices across engineering via design reviews, production readiness, and operational standards
  • Incident Leadership: Act as Incident Commander during critical events, modeling operational excellence and ensuring that blameless postmortems lead to lasting improvements
  • Advance Observability: Deliver end-to-end monitoring, tracing, and profiling (Prometheus, Grafana, OTEL, Continuous Profiling) to optimize performance proactively
  • Mentor & Multiply: Elevate engineers across SRE and product teams through mentorship, technical guidance, and knowledge sharing

Requirements

What you’ll need
  • 8+ years of experience in Site Reliability Engineering, DevOps, or a similar role, with at least 3 years in a Senior+ SRE position
  • Strong background in running production SaaS systems at scale
  • Proficiency in at least one programming/scripting language (Python, Go, or similar)
  • Hands-on expertise with cloud platforms (AWS, GCP, or Azure) and Kubernetes
  • Deep understanding of networking fundamentals (TCP/IP, DNS, HTTP/S, load balancing)
  • Experience with monitoring & alerting (Prometheus, Grafana, Datadog, ELK)
  • Familiarity with advanced observability (OTEL, continuous profiling)
  • Proven incident management experience including leading high-severity incidents and postmortems
  • Strong troubleshooting skills across the full stack
  • Excellent communication and collaboration skills

Benefits

Comp & perks
  • Equal-opportunity employer
  • Work environment that supports and respects all individuals
  • Reasonable accommodation for employees with protected disabilities
  • Anti-fraud and security measures against job scams

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Site Reliability Engineering, DevOps, Python, Go, AWS, GCP, Azure, Kubernetes, TCP/IP, monitoring
Soft Skills
mentorship, communication, collaboration, troubleshooting, incident management