
Senior Site Reliability Engineer
Clarifai
full-time
Location Type: Remote
Location: United States
About the role
- Ensure the smooth operation and high availability of Clarifai's core services
- Monitor system performance, identify bottlenecks, and implement optimizations to enhance reliability and efficiency
- Develop Kubernetes resources and custom tooling for seamless cloud and on-premise deployments
- Design and implement scalable, secure, and cost-effective infrastructure solutions
- Partner with teams across the organization to identify and solve engineering challenges
Requirements
- BS/BA in Computer Science or a related field
- Good knowledge of cloud providers (AWS, GCP, or similar)
- Expertise with Kubernetes (EKS, GKE, self-hosted) and Infrastructure as Code using Terraform, Helm
- Solid understanding of web and networking fundamentals (HTTP, TLS, DNS, certificates, etc.)
- Experience with CI/CD pipelines using tools such as GitHub Actions, ArgoCD, and Atlantis
- Strong interpersonal skills working with teams across different time zones and regions
- Knowledge of basic microservice architecture principles
- Familiarity with security best practices for cloud-based systems
- Experience with relational databases, message queues, and key-value stores
- Experience writing Python, Go, or any other popular programming language
- Familiarity with any RPC framework
- Experience developing and building custom Kubernetes operators
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Kubernetes, Terraform, Helm, CI/CD, Python, Golang, Microservice Architecture, Relational Databases, Message Queues, Key Value Stores
Soft skills
interpersonal skills
Certifications
BS/BA in Computer Science