
Intermediate Site Reliability Engineer, Tenant Services
GitLab
Full-time
Location Type: Remote
Location: Anywhere in North America
About the role
- Design and implement highly scalable infrastructure for GitLab.com to support current and future growth.
- Collaborate with cross-functional teams across the Infrastructure organization to plan and deliver projects that shape GitLab’s platform direction.
- Operate and improve edge services and Kubernetes workloads, acting as a subject matter expert within the infrastructure department.
- Participate in a global on-call rotation during your local daytime hours, respond to production incidents, and contribute to clear, constructive incident reviews.
- Reduce toil by automating operational tasks and building tools that improve reliability, availability, and scalability.
- Apply infrastructure as code and configuration management practices to manage cloud resources and environments consistently.
- Write and maintain production-quality code, preferably in Go or Ruby, to enhance our systems and automation toolchain.
Requirements
- Background working with the Kubernetes ecosystem, including tools such as Helm, and running production workloads.
- Experience operating cloud infrastructure on platforms like Google Cloud Platform or Amazon Web Services, especially networking, hosted Kubernetes services, and scaling.
- Hands-on practice with infrastructure as code and configuration management tools such as Ansible or Chef.
- Strong programming skills in a modern language, preferably Go or Ruby, applied to automation and reliability problems.
- Ability to clearly define problems, think beyond short-term fixes, and design solutions that improve systems over time.
- Consistent focus on reducing toil through automation and thoughtful system design.
- Independent, proactive working style with a bias for action and comfort operating as a "manager of one" in a distributed, asynchronous environment.
- Clear written and verbal communication skills. We welcome candidates who bring transferable experience from related reliability, infrastructure, or platform roles.
Benefits
- Benefits to support your health, finances, and well-being
- Flexible Paid Time Off
- Team Member Resource Groups
- Equity Compensation & Employee Stock Purchase Plan
- Growth and Development Fund
- Parental leave
- Home office support
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Kubernetes, Go, Ruby, infrastructure as code, configuration management, automation, cloud infrastructure, production workloads, scalability, reliability
Soft skills
problem definition, system design, independent working, proactive, communication, collaboration, bias for action, clear writing, constructive feedback, critical thinking