S&P Global

Senior Site Reliability Engineer – Infrastructure

full-time

Location Type: Office

Location: Hyderabad, India

About the role

  • Own and operate production services supporting critical financial applications with a strong focus on availability, performance, and reliability
  • Design, build, and manage AWS infrastructure, including EKS-based clusters, across lower and production environments
  • Provision and manage infrastructure using Terraform (Infrastructure as Code) with a strong automation-first mindset
  • Deploy, scale, and troubleshoot applications running on Kubernetes, including cluster creation, upgrades, and lifecycle management
  • Build and maintain Python-based automation frameworks and tooling to reduce operational toil and prevent recurring incidents
  • Monitor system health using metrics, logs, and alerts; continuously tune alerts, dashboards, and runbooks
  • Troubleshoot complex issues spanning clusters, networking, certificates, deployments, and application behavior
  • Manage certificate lifecycle and expiration, ensuring secure and uninterrupted service operation
  • Collaborate with InfoSec, Vulnerability Management, and Network Security teams (e.g., Zscaler) to maintain a strong security posture
  • Collaborate with L1/L2 teams, helping them understand infrastructure and operational best practices
  • Participate in on-call rotations, lead incident response, drive root cause analysis, and ensure effective post-incident remediation and learnings
  • Identify architectural anti-patterns and drive improvements by reviewing new services for production readiness, resiliency, and secure design prior to release
  • Establish and enforce production readiness standards, including deployment strategies, rollback plans, and observability requirements
  • Optimize infrastructure cost and resource utilization without compromising reliability and performance.

Requirements

  • 6+ years of experience in SRE, DevOps, Platform, or Infrastructure Engineering roles
  • Strong software engineering background, with hands-on Python development used for automation, tooling, and system reliability
  • Experience building or supporting scalable, distributed systems in production
  • Deep experience with AWS cloud environments, including IAM, networking, and access controls
  • Strong hands-on expertise with Kubernetes (EKS preferred): cluster creation, deployments, scaling, and troubleshooting
  • Solid understanding of networking fundamentals (VPCs, routing, DNS, load balancing, security groups)
  • Experience with CI/CD pipelines, deployment tools, and infrastructure automation
  • Working knowledge of databases and query optimization, and understanding how applications behave under load
  • Familiarity with Kafka or other messaging systems
  • Comfortable conducting code reviews and participating in coding-focused interviews
  • Strong operational mindset with experience in incident management and on‑call rotations
  • Clear communicator and collaborative teammate who values documentation and knowledge sharing.

Benefits

  • Health & Wellness: Health care coverage designed for the mind and body.
  • Flexible Downtime: Generous time off helps keep you energized for your time on.
  • Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
  • Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
  • Family Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
  • Beyond the Basics: From retail discounts to referral incentive awards—small perks can make a big difference.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
AWS, Terraform, Kubernetes, Python, CI/CD, networking, databases, Kafka, infrastructure automation, incident management
Soft Skills
collaboration, communication, documentation, knowledge sharing, operational mindset, problem-solving, root cause analysis, leadership, teamwork, mentoring