DevOps Engineer – Site Reliability Engineer
Bespoke Labs
DevOps/Site Reliability Engineer maintaining AWS cloud infrastructure for an AI research company. Supporting reliable, observable systems for AI data pipelines at scale.
Tech Stack
Tools & technologies
AWS, Cloud, Distributed Systems, EC2, Go, Grafana, Kubernetes, Prometheus, Python, Terraform
About the role
Key responsibilities & impact
- Own cloud infrastructure on AWS: EC2, EKS, RDS, S3, IAM
- Manage Kubernetes clusters and container orchestration end-to-end
- Build and maintain CI/CD pipelines using GitHub Actions or similar
- Implement monitoring, alerting, and observability stacks (Prometheus, Grafana, or DataDog)
- Improve reliability, performance, and security of production systems
- Automate infrastructure with Terraform or similar IaC tools
- Debug and resolve issues across complex, distributed systems
- Participate in design reviews and help raise the infrastructure bar
Requirements
What you’ll need
- 3–5 years in DevOps, SRE, or infrastructure engineering
- Strong AWS experience — EKS, EC2, RDS, S3, IAM
- Kubernetes — deployment, scaling, troubleshooting in production
- CI/CD pipelines — GitHub Actions, ArgoCD, or similar
- Infrastructure as Code — Terraform, Pulumi, or CDK
- Python or Go scripting
- Experience working in production environments with real users
- Comfort with ambiguity and ability to operate autonomously
Benefits
Comp & perks
- Competitive compensation and meaningful equity
- Direct impact on frontier AI model training and evaluation infrastructure
- Flexible, remote-friendly environment with low bureaucracy
- A small, high-caliber team with deep AI research expertise
- Health, wellness, and learning & development benefits
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
AWS, EC2, EKS, RDS, S3, IAM, Kubernetes, CI/CD, Terraform, Python
Soft Skills
problem-solving, autonomy, adaptability