GE Aerospace

Staff Site Reliability Engineer


Full-time


Location Type: Remote

Location: United States


About the role

  • Establish performance baselines and capacity thresholds, correlate events, and define monitoring and alerting criteria
  • Develop automated solutions to address potential problems before they result in a service interruption
  • Provide impact assessments and mitigation plans for changes going into the production environment
  • Investigate the root cause of severe and systemic outages, identify corrective actions, and apply them across the enterprise
  • Develop availability measures that align with consumer experience to accurately assess the usability of crucial services
  • Build capacity models that baseline transactional load against resource performance, use that data to predict overall system capacity, and automate load placement to avoid outages
  • Identify thresholds for all critical links in the data path to quickly isolate where imbalances may result in potential outages
  • Analyze failure points in services to model the risk level and the resolution steps if a failure occurs
  • Assist in driving architecture enhancements into the system to mitigate potential failure points
  • Programmatically monitor for and remediate configuration drift of critical devices
  • Develop response plans for potential failure points and evaluate their effectiveness during planned tests
  • Perform comprehensive operational health checks of entire services to identify areas of concern, and track activities that drive improvements at all levels of the architecture
  • Provide technical coaching and direction to more junior teammates

Requirements

  • Bachelor’s degree from an accredited university or college with a minimum of 4 years of professional experience, OR an Associate’s degree with a minimum of 7 years of professional experience, OR a High School Diploma with a minimum of 9 years of professional experience
  • Legal authorization to work in the U.S. is required
  • Excellent knowledge of AWS/Azure cloud services
  • Strong oral and written communication skills
  • Demonstrated experience scripting or developing software and services for the cloud (Python, Go, Java, Node.js, .NET, etc.)
  • Extensive knowledge of network protocols (TCP/IP, SNMP, FTP, syslog, TFTP, etc.)
  • Experience managing version control systems such as Git
  • Experience deploying and managing infrastructure on public clouds such as AWS or Azure
  • Experience using an automated configuration management system (Terraform, Chef, Puppet, Ansible, Salt, etc.)
  • Strong organizational and project management skills
  • Strong analytical and problem resolution skills
  • Excellent knowledge of network management (SNMP, MIB)
  • Experience with configuring, customizing, and extending monitoring tools (Datadog, Sensu, Grafana, Splunk, etc.)
  • Excellent knowledge of TCP/IP networking and internetworking technologies (routing/switching, proxy, firewall, load balancing, etc.)
  • Knowledge of and experience with analytics software packages (such as MATLAB, SAS, JMPro) is a plus

Benefits

  • Great work environment
  • Professional development
  • Competitive compensation
  • Equal Opportunity Employer