Cayuse Holdings

Site Reliability Engineer

full-time

Location Type: Hybrid

Location: Austin, Texas, United States

Salary

💰 $108,160 - $153,920 per year

About the role

  • Ensure the reliability, availability, performance, and scalability of production systems using software engineering practices.
  • Collaborate closely with development teams to design, build, and maintain resilient, observable, and automated platforms that meet defined service level objectives (SLOs).
  • Develop and implement automation tools to streamline manual and repetitive operational tasks.
  • Document processes, workflows, and system configurations to support ongoing operations and future enhancements.
  • Continuously monitor production systems, proactively addressing incidents and performance issues.
  • Participate in capacity planning and ongoing improvements to system resilience and scalability.
  • Maintain effective communication with executive management, business stakeholders, and cross-functional technical teams.
  • Stay current with emerging site reliability engineering practices, tools, and technologies.

Requirements

  • 8 years of experience in systems engineering, DevOps, or site reliability engineering roles.
  • 8 years of experience with Linux/Unix systems and system internals.
  • 8 years of experience programming or scripting in one or more languages (e.g., Python, Go, Java, Bash).
  • 8 years of experience designing and operating highly available, distributed systems.
  • 8 years of experience with cloud platforms (such as AWS or GCP) and cloud-native services.
  • 8 years of experience with containerization and orchestration (e.g., Docker, Kubernetes).
  • 8 years of experience with monitoring, alerting, and logging.
  • 8 years of experience defining and managing Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.
  • 8 years of experience with incident management, root cause analysis (RCA), and postmortems.
  • 8 years of experience integrating security and compliance into operational workflows.
  • Must be able to pass a background check.
  • Additional background checks may be required by projects and/or clients at any time during employment.

Benefits

  • Medical, Dental and Vision Insurance
  • Wellness Program
  • Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
  • Short-Term and Long-Term Disability options
  • Basic Life and AD&D Insurance (Company Provided)
  • Voluntary Life and AD&D options
  • 401(k) Retirement Savings Plan with matching after one year
  • Paid Time Off