
Associate Site Reliability Engineer
T-Mobile
full-time
Location Type: Office
Location: Frisco • Kansas, Texas • 🇺🇸 United States
Salary
💰 $64,300 - $115,900 per year
Job Level
Junior, Mid-Level
About the role
- Supports the operation and maintenance of high-traffic, business-critical internet communication systems to ensure continuous availability.
- Primarily focuses on automating system administration and monitoring to enhance network efficiency and reliability.
- Conducts tests for redundancy, resilience, and failover to maintain uptime standards.
- Measures success by system performance, uptime, and the ability to identify and resolve faults proactively.
- Builds effective monitoring systems that alert on symptoms rather than outages.
- Gathers and analyzes metrics from both operating systems and applications for performance tuning and fault finding.
- Manages platform infrastructure, including capacity planning and scaling.
- Collaborates with development teams to improve services through rigorous testing and release procedures.
- Ensures high system performance and uptime; designs, implements, and monitors complex architectures; automates processes; and collaborates with cross-functional teams.
- Performs other duties and projects as assigned by business management.
Requirements
- Bachelor's Degree OR combination of education and experience deemed equivalent (Required)
- Acceptable areas of study include Computer Science and Engineering
- Master's/Advanced Degree in Computer Science, Engineering, or a related field (Preferred)
- Less than 2 years - Operating and maintaining high-traffic, business-critical internet site communications systems.
- Less than 2 years - Automating the administration and monitoring of network systems.
- Less than 2 years - Conducting tests for redundancy, resilience, and failover in digital infrastructure.
- Understanding of high-traffic network systems and their operations. (Required)
- Proficiency in automating the administration and monitoring of network systems. (Required)
- Ability to conduct rigorous tests for redundancy, resilience, and failover. (Required)
- Proficiency in using DevOps-centric automation tools and technologies for CI/CD, configuration management, etc. (Required)
- Ability to use software to improve the availability, scalability, latency, and efficiency of services. (Required)
- Ability to use dashboards for continuous monitoring and health checks of applications and the underlying infrastructure. (Required)
- Understanding of how to improve the quality of services using monitoring feedback for non-production environments. (Required)
- Ability to develop and prepare data for dashboard views. (Required)
Benefits
- medical, dental and vision insurance
- flexible spending account
- 401(k)
- employee stock grants
- employee stock purchase plan
- paid time off and up to 12 paid holidays
- paid parental and family leave
- family building benefits
- back-up care
- enhanced family support
- childcare subsidy
- tuition assistance
- college coaching
- short- and long-term disability
- voluntary AD&D coverage
- voluntary accident coverage
- voluntary life insurance
- voluntary disability insurance
- voluntary long-term care insurance
- mobile service & home internet discounts
- pet insurance
- access to commuter and transit programs
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
system administration, network monitoring, redundancy testing, resilience testing, failover testing, performance tuning, capacity planning, scaling, automation, CI/CD
Soft skills
collaboration, problem-solving, communication, analytical thinking, proactive fault resolution
Certifications
Bachelor's Degree, Master's Degree