Senior Site Reliability Engineer, C#, .NET
Climavision
Senior Site Reliability Engineer at Climavision ensuring reliability of weather data services across various environments. Focused on improving operational maturity and handling complex production issues.
Tech Stack
Tools & technologies
- Azure
- Distributed Systems
- Kubernetes
- .NET
About the role
Key responsibilities & impact
- Own production reliability for Climavision’s customer-facing platform and radar-derived weather data services across Azure, colocation, and edge Kubernetes environments.
- Contribute to the definition and improvement of SLIs, SLOs, alerting standards, and operational metrics used to measure platform reliability.
- Support and coordinate production incident response efforts, including troubleshooting, mitigation, communication, and postmortem analysis.
- Diagnose and resolve complex production issues across application services, Kubernetes infrastructure, storage, and distributed systems.
- Drive multi-replica and multi-cluster high availability across Climavision’s .NET services.
- Improve reliability and operational maturity of production platform services, including observability, autoscaling, ingress, and distributed storage.
- Partner with software engineering teams to improve production readiness, resiliency patterns, deployment safety, and operational visibility before services reach production.
- Support and evolve Climavision’s observability platform, including metrics, logging, distributed tracing, dashboarding, and alerting.
Requirements
What you’ll need
- A bachelor’s degree in computer science, software engineering, or a related field; equivalent professional experience considered.
- Minimum of 7 years of experience in Site Reliability Engineering, DevOps, Production Engineering, Platform Engineering, or a related infrastructure-focused role, with at least 4 years in a role formally titled Site Reliability Engineer or carrying explicit SLO / error-budget accountability.
- Strong, hands-on software engineering experience with a minimum of 3 years of experience supporting and modifying C# / .NET applications in production environments.
- Demonstrated experience refactoring production application code (preferably C# / .NET) to make services horizontally scalable across multiple replicas.
- Experience designing or operating multi-cluster high-availability architectures, including failover behavior, traffic routing, and cross-cluster service deployment.
- Strong hands-on experience operating production workloads in self-managed or highly customized Kubernetes environments.
- Experience diagnosing and resolving production incidents across the application, platform, and Kubernetes infrastructure layers, including workload scheduling, storage, ingress, and cluster-level failures.
- Strong written and verbal communication skills, including incident documentation and postmortem authoring.
Benefits
Comp & perks
- Competitive compensation
- Comprehensive benefits package
- 401(k) Savings Plan
- Medical/Dental/Vision Benefits
- Health Savings Account (HSA) and Flexible Spending Account (FSA)
- Unlimited Paid Time-off
- 11 Paid Holidays
- Paid Parental Leave
- Company Paid Short-term Disability (STD)
- Company Paid Long-term Disability (LTD)
- Company Paid Life Insurance
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
- C#
- .NET
- Kubernetes
- Azure
- observability
- distributed systems
- high availability
- production engineering
- DevOps
- site reliability engineering
Soft Skills
- communication
- troubleshooting
- incident response
- postmortem analysis
- collaboration
- problem-solving
- documentation
- resiliency patterns
- operational visibility
- deployment safety