Clarifai

Senior Site Reliability Engineer

full-time

Location Type: Remote

Location: United States

About the role

  • Ensure the smooth operation and high availability of Clarifai's core services
  • Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
  • Develop Kubernetes resources and custom tooling for seamless cloud and on-premise deployments
  • Design and implement scalable, secure, and cost-effective infrastructure solutions
  • Partner with teams across the organization to identify and solve engineering challenges

Requirements

  • BS/BA in Computer Science or a related field
  • Good knowledge of cloud providers (AWS, GCP or similar)
  • Expertise with Kubernetes (EKS, GKE, self-hosted) and Infrastructure as Code using Terraform and Helm
  • Solid understanding of web and networking fundamentals (HTTP, TLS, DNS, certificates, etc.)
  • Experience with CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and Atlantis
  • Strong interpersonal skills working with teams across different time zones and regions
  • Knowledge of basic microservice architecture principles
  • Familiarity with security best practices for cloud-based systems
  • Experience with relational databases, message queues, and key-value stores
  • Experience writing Python, Go, or any other popular programming language
  • Familiarity with any RPC framework
  • Experience developing and building custom Kubernetes operators

Applicant Tracking System Keywords

Hard skills
Kubernetes, Terraform, Helm, CI/CD, Python, Golang, Microservice Architecture, Relational Databases, Message Queues, Key Value Stores
Soft skills
interpersonal skills
Certifications
BS/BA in Computer Science