GM Financial

Lead Site Reliability Engineer

Posted 5/22/2026 • Full-time • Arlington • Texas • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies
Azure, Cloud, Cyber Security, Distributed Systems, Docker, ETL, Kubernetes, Linux, OpenShift, Perl, Python, Ruby, Spark

About the role

Key responsibilities & impact
  • Manage, administer, and deploy Kubernetes and Spark cluster environments on bare-metal and container infrastructure, including service allocation and configuration for the cluster, capacity planning, performance tuning, and ongoing monitoring
  • Define and refine processes and procedures for the site reliability engineering practice
  • Set up, manage, and maintain scalable Kubernetes-based environments for high availability, and work with vendors to ensure smooth, continuous operations
  • Work closely with data scientists, data architects, data engineers, ETL developers, cybersecurity, network, Linux, other IT counterparts, and business partners to design and set up the environments to manage the ingested and processed datasets from external sources, internal systems, and the data warehouse to extract features of interest
  • Evaluate, research, and experiment with data processing, management, and scalability technologies in a lab to keep pace with industry innovation while assessing business impact and viability for the use cases at hand
  • Design, set up, test, deploy, monitor, document, and troubleshoot data processing and associated automation issues from the operations perspective
  • Work with IT Operations and Information Security Operations on monitoring and troubleshooting incidents to maintain service levels
  • Work with Information Security Vulnerability Management and vendors to remediate known impacting vulnerabilities
  • Contribute to the evolving distributed systems architecture to meet changing requirements for scaling, reliability, performance, manageability, and cost
  • Report utilization and performance metrics to user communities
  • Contribute to planning and implementation of new/upgraded hardware and software releases
  • Monitor the Linux, Kubernetes, object storage (MinIO), Feature Store, and Spark environments
  • Research and recommend innovative and, where possible, automated approaches for administration tasks
  • Identify approaches that improve efficiency in resource utilization, provide economies of scale, and simplify support issues
  • Administer machine learning platforms and operations (MLOps) tooling such as Kubeflow, JupyterHub, and Python
  • This role will support GMF international operations and will closely align with our GMF IT NorthStar architecture and operating principles

Requirements

What you’ll need
  • 5-7 years of hands-on experience supporting Linux production environments required
  • 5-7 years of hands-on Spark administration experience required
  • 3-5 years of hands-on scripting experience with Bash, Perl, Ruby, or Python required
  • 3-5 years of experience with Docker Datacenter required
  • 2-4 years of hands-on administration experience with machine learning platforms required
  • Minimum of 1 year of experience with Mesos, Kubernetes, OpenShift, and/or Deis, or another such container/platform-as-a-service orchestrator, required
  • Minimum of 1 year of hands-on experience with CI/CD tools and technologies required
  • Minimum of 1 year of experience leading a site reliability engineering team required
  • Hands-on experience in cloud technologies with Microsoft Azure required
  • High School Diploma or equivalent required
  • Bachelor’s Degree in related field or equivalent experience required
  • Master’s Degree preferred

Benefits

Comp & perks
  • 401(k) matching
  • Bonding leave for new parents (12 weeks, 100% paid)
  • Training
  • GM employee auto discount
  • Community service pay
  • Nine company holidays

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Kubernetes, Spark, Linux, Docker, Machine Learning, MLOps, Scripting, CI/CD, Cloud Technologies, Capacity Planning
Soft Skills
Collaboration, Problem Solving, Performance Tuning, Monitoring, Documentation, Process Improvement, Communication, Leadership, Innovation, Efficiency
Certifications
Bachelor’s Degree, Master’s Degree, High School Diploma