
Site Reliability Engineer
AMP
full-time
Location Type: Remote
Location: United States
Salary
💰 $100,000 - $120,000 per year
About the role
- Triage and respond to tickets, adhering to SLAs, from 9:00 am to 5:00 pm Eastern Time.
- Participate in the pager-duty rotation as we establish 24/5 escalation support for the facilities.
- Provide support for CoreTech devices including commissioning support, software upgrades, tooling maintenance, and troubleshooting.
- Troubleshoot operating system, on-prem hardware, networking, container, and application issues to the point of mitigation, resolution, or hand-off. All devices are on-prem in AMP facilities.
- Maintain and extend documentation for the engineering support process.
- Help define improvements to the Jira ticketing system for ease of use and analytics tracking.
- Development tasks focus on increasing observability of software issues and building mitigation tools to use when those issues arise. When subject matter experts are called on during escalations, this role turns the lessons learned into tools that let facilities better self-serve.
- Monitoring stack: Prometheus/OpenMetrics exporters, Prometheus aggregator (PromQL), Grafana dashboards
- Alerting stack: Grafana alerting with Slack integrations
- Mitigation stack: Ansible/Jenkins
Requirements
- Strong technical communication skills for collaborating with the rest of the software team through ticket escalations
- Strong interpersonal skills for communicating with individuals in industrial environments experiencing downtime issues that can be overwhelming
- Experience troubleshooting Linux systems
- Desire to learn and gain experience writing code, including professional software engineering practices like coding standards, code reviews, source control management, build processes, testing, and operations
- The growth of facilities requires this role to become more efficient over time
- Proficiency managing task-level scoping for yourself under a sprint-based or kanban methodology
- Passion for green technology and emissions reduction.
- Real-world experience with deployed hardware
- Experience with Docker or similar technologies
- Experience troubleshooting to minimize mean time to recovery in downtime situations
- Comfort with reactive multitasking and rapid reprioritization.
Benefits
- Medical - The company covers between 77% and 100% of the premium for Cigna medical healthcare plans, depending on the selection.
- Dental, Vision, Short-Term and Long-Term Disability
- Life Insurance: The company covers the cost of Basic Life / AD&D 1 x Salary, with the option to purchase additional coverage through New York Life
- Benefits start the day you start
- HSA Eligible Health Plans, Company Monthly Contributions!
- 401(k) retirement plan (non-matching)
- FTO - Flexible Time Off
- 6 Accrued Sick Days
- Eight (8) paid holidays
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Linux troubleshooting, software engineering practices, coding standards, code reviews, source control management, build processes, testing, Docker, Ansible, Jenkins
Soft Skills
technical communication, interpersonal skills, reactive multitasking, rapid reprioritization, collaboration, problem-solving, adaptability, efficiency improvement, customer service, teamwork