T-Mobile

Associate Site Reliability Engineer

full-time

Location Type: Office

Location: Frisco • Kansas, Texas • 🇺🇸 United States

Salary

💰 $64,300 - $115,900 per year

Job Level

Junior, Mid-Level

About the role

  • Supports the operation and maintenance of high-traffic, business-critical internet communication systems to ensure continuous availability.
  • Primarily focuses on automating system administration and monitoring to enhance network efficiency and reliability.
  • Conducts tests for redundancy, resilience, and failover to maintain uptime standards.
  • Measures success by system performance, uptime, and the ability to identify and resolve faults proactively.
  • Builds effective monitoring systems that alert on symptoms rather than outages.
  • Gathers and analyzes metrics from both operating systems and applications for performance tuning and fault finding.
  • Manages platform infrastructure, including capacity planning and scaling.
  • Collaborates with development teams to improve services through rigorous testing and release procedures.
  • Ensures high system performance and uptime; designs, implements, and monitors complex architectures; automates processes; and collaborates with cross-functional teams.
  • Performs other duties and projects as assigned by business management.

Requirements

  • Bachelor's Degree OR combination of education and experience deemed equivalent (Required)
  • Acceptable areas of study include Computer Science and Engineering
  • Master's/Advanced Degree in Computer Science, Engineering, or a related field (Preferred)
  • Less than 2 years: operating and maintaining high-traffic, business-critical internet site communications systems.
  • Less than 2 years: automating the administration and monitoring of network systems.
  • Less than 2 years: conducting tests for redundancy, resilience, and failover in digital infrastructure.
  • Understanding of high-traffic network systems and their operations. (Required)
  • Proficiency in automating the administration and monitoring of network systems. (Required)
  • Ability to conduct rigorous tests for redundancy, resilience, and failover. (Required)
  • Proficiency in using DevOps-centric automation tools and technologies for CI/CD, configuration management, etc. (Required)
  • Ability to use software to improve the availability, scalability, latency, and efficiency of services. (Required)
  • Ability to use dashboards for continuous monitoring and health checks of applications and the underlying infrastructure. (Required)
  • Understanding of how to improve the quality of services using monitoring feedback from non-production environments. (Required)
  • Ability to develop and prepare data for dashboard views. (Required)

Benefits

  • medical, dental and vision insurance
  • flexible spending account
  • 401(k)
  • employee stock grants
  • employee stock purchase plan
  • paid time off and up to 12 paid holidays
  • paid parental and family leave
  • family building benefits
  • back-up care
  • enhanced family support
  • childcare subsidy
  • tuition assistance
  • college coaching
  • short- and long-term disability
  • voluntary AD&D coverage
  • voluntary accident coverage
  • voluntary life insurance
  • voluntary disability insurance
  • voluntary long-term care insurance
  • mobile service & home internet discounts
  • pet insurance
  • access to commuter and transit programs

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
  • system administration
  • network monitoring
  • redundancy testing
  • resilience testing
  • failover testing
  • performance tuning
  • capacity planning
  • scaling
  • automation
  • CI/CD
Soft skills
  • collaboration
  • problem-solving
  • communication
  • analytical thinking
  • proactive fault resolution
Certifications
  • Bachelor's Degree
  • Master's Degree