Xenon Seven

Site Reliability Engineer – Mobile and Internet Platform

Full-time

Location Type: Remote

Location: Remote • 🇩🇪 Germany

Job Level

Entry Level

Tech Stack

Elasticsearch, Go, Grafana, Kubernetes, Linux, Logstash, NoSQL, OpenShift, Prometheus, Python, SQL, Unix

About the role

  • Monitor and maintain the reliability and performance of Mobile Banking and Internet Banking applications using Prometheus and Grafana dashboards
  • Manage and support on-premise OpenShift/Kubernetes infrastructure for containerized banking applications
  • Respond to and resolve production incidents with minimal mean time to resolution (MTTR)
  • Implement and maintain centralized logging solutions using ELK Stack (Elasticsearch, Logstash, Kibana) for application troubleshooting
  • Develop and execute runbooks and automation scripts to reduce manual operational toil in OpenShift environments
  • Provide 24/7 production support and participate in the on-call rotation for critical banking services
  • Analyze logs and metrics from Prometheus and the ELK Stack to identify performance bottlenecks and reliability issues
  • Conduct root cause analysis (RCA) on incidents and implement preventive measures
  • Optimize Kubernetes/OpenShift deployments, pod management, and resource allocation on-premise
  • Implement alerting strategies and threshold management in Prometheus and Grafana
  • Support infrastructure scaling, capacity planning, and load balancing in production environments
  • Implement security best practices and compliance requirements for financial systems in containerized environments
  • Manage on-premise data center infrastructure and server resources
  • Document operational procedures, write troubleshooting guides, and create knowledge base articles

Requirements

  • BSc in Computer Science, Information Technology, Software Engineering, or related field
  • 2+ years of hands-on experience in SRE, DevOps, or Production Engineering roles
  • Hands-on experience supporting production applications in Kubernetes/OpenShift environments
  • Strong experience administering and troubleshooting the OpenShift container platform on on-premise infrastructure
  • Proficiency with Prometheus for metrics collection and monitoring
  • Proficiency with Grafana for dashboard creation and visualization
  • Experience with ELK Stack (Elasticsearch, Logstash, Kibana) for centralized logging
  • Strong understanding of Linux/Unix operating systems and networking fundamentals
  • Practical experience with CI/CD tools and automation frameworks
  • Proficiency in at least one programming/scripting language (Python, Go, or Bash)
  • Experience managing on-premise SQL and NoSQL databases
  • Excellent troubleshooting and analytical skills for production support
  • Strong communication skills and ability to work in cross-functional teams
  • Experience in 24/7 production support environments
  • Experience with on-premise data center infrastructure management
  • Previous experience in the financial services or banking sector is a plus

Applicant Tracking System Keywords

Hard skills
Kubernetes, OpenShift, Prometheus, Grafana, ELK Stack, Linux, CI/CD, Python, Go, Bash
Soft skills
troubleshooting, analytical skills, communication skills, cross-functional teamwork, incident response, root cause analysis, capacity planning, load balancing, automation, documentation