Senior DevOps – Platform Reliability Engineer

Zingtree

Senior DevOps / Platform Reliability Engineer managing CI/CD and infrastructure for an AI-driven platform, collaborating across teams to automate processes and ensure reliability.

Posted 5/8/2026 • Full-time • Remote • New York • 🇺🇸 United States • Senior

Tech Stack

Tools & technologies
Ansible, AWS, Cloud, DNS, Grafana, Jenkins, Kafka, Kubernetes, Linux, Microservices, MySQL, Prometheus, Python, Redis, Terraform

About the role

Key responsibilities & impact
  • Own and evolve CI/CD pipelines using GitHub Actions and OIDC-based authentication for microservices and agentic workloads, with safe, fast, and reversible deployments.
  • Automate infrastructure provisioning using Infrastructure as Code (IaC) tools such as Terraform and CloudFormation.
  • Operate and scale our Kubernetes platform (EKS + Argo CD), including autoscaling, ingress, external-dns, cert-manager, External Secrets Operator, backups, runtime guardrails, and multi-tenant isolation for enterprise customers.
  • Manage the edge and network perimeter, including Cloudflare (CDN, WAF, Bot Management, DDoS protection, Zero Trust / Access), CloudFront, API Gateway, ALB/NLB, Route 53, and network security controls.
  • Operate the data and event tier, including Aurora MySQL, ElastiCache/Redis, S3, and MSK (Kafka), with responsibility for backups, point-in-time recovery (PITR), and multi-AZ disaster recovery aligned to defined RTO/RPO objectives.
  • Build and maintain Lambda workloads where event-driven or serverless architectures are the right fit.
  • Build observability as a product using Prometheus, Grafana, and OpenTelemetry, including telemetry for LLM and agentic systems such as token cost, tool-call latency, evaluation signals, and prompt/version tracking.
  • Strengthen our security and compliance posture for SOC 2 and HIPAA, including least-privilege IAM, SCPs, secrets management, SAST/DAST, dependency and container scanning, image signing, AWS Config, Security Hub, GuardDuty, Inspector, and evidence automation.
  • Drive FinOps initiatives, including tagging standards, Savings Plans and Reserved Instances, per-tenant and per-workload cost attribution, and LLM cost controls.
  • Build and evolve our AI-native DevOps capabilities.

Requirements

What you’ll need
  • 5+ years of experience in DevOps, SRE, or Platform Engineering operating production systems on AWS.
  • Strong experience with CI/CD pipelines and tools such as GitHub Actions, GitLab CI, Jenkins, or CircleCI.
  • Hands-on experience operating production EKS environments, including autoscaling, ingress, secrets management, and cluster upgrades.
  • Strong AWS networking experience, including multi-account VPC design, subnets, routing, security groups, NACLs, Route 53, ACM, and load balancers.
  • Deep experience with Terraform and GitHub Actions, ideally using OIDC-based cloud authentication.
  • Experience with Aurora/RDS MySQL, Redis (ElastiCache), and S3, including backups, PITR, migrations, and lifecycle management.
  • Strong observability experience using Prometheus, Grafana, and OpenTelemetry.
  • Experience operating Argo CD at scale.
  • Experience with Infrastructure as Code tools such as Terraform, CloudFormation, or Ansible.
  • Experience managing Cloudflare services including WAF, Bot Management, Rate Limiting, and Zero Trust / Access, along with CloudFront.
  • Experience operating Kafka/MSK at scale, including topics, consumer groups, and schema registries.
  • Experience with Lambda and event-driven architectures.
  • Comfortable working with Python, Bash, and Linux systems.
  • Strong understanding of security best practices across IAM, KMS, secrets management, networking, and software supply chain security.
  • Familiarity with vulnerability scanning and compliance tooling.

Benefits

Comp & perks
  • Competitive compensation packages
  • Comprehensive health benefits:
      • 100% of employee premiums covered
      • 75%–80% of dependent premiums covered for most health, dental, and vision plans
  • 401(k) plans to support retirement planning (no employer matching currently)
  • Paid parental leave
  • Unlimited PTO
  • Flexible remote work from anywhere
  • Up to $200/month co-working reimbursement
  • Home office stipend:
      • Up to $500 for home office setup
      • $100/month for internet, phone, and related expenses
