Weekday (YC W21)

Staff Engineer – DevOps

full-time

Location Type: Remote

Location: India

About the role

  • Architect and evolve our DevOps ecosystem, champion cloud cost governance, and implement best-in-class container orchestration practices.
  • Work cross-functionally with engineering, security, and finance teams to ensure operational excellence while proactively managing infrastructure spend.
  • Lead end-to-end DevOps strategy, including CI/CD pipelines, automation, infrastructure-as-code, and release engineering.
  • Design scalable, resilient cloud-native architectures aligned with business growth.
  • Establish DevOps best practices, reliability standards, and operational governance.
  • Architect and manage large-scale Kubernetes environments for production workloads.
  • Optimize workloads across clusters for performance, reliability, and cost efficiency.
  • Build and maintain containerized applications using Docker and Kubernetes, ensuring portability and scalability.
  • Drive multi-cluster, multi-region deployments where necessary.
  • Own infrastructure cost visibility and optimization initiatives.
  • Implement cloud cost-saving strategies including rightsizing, reserved capacity planning, auto-scaling optimization, and workload scheduling.
  • Create dashboards and reporting mechanisms to track infrastructure ROI and spend trends.
  • Continuously identify inefficiencies and implement measurable cost-reduction initiatives without compromising performance.
  • Design and implement comprehensive monitoring systems using Grafana and related observability tools.
  • Build real-time dashboards for system health, performance metrics, and cost insights.
  • Establish alerting frameworks to minimize downtime and improve incident response.
  • Drive improvements in system reliability through data-driven monitoring and post-incident analysis.
  • Automate provisioning, deployments, scaling, and recovery processes.
  • Improve system resilience, availability, and disaster recovery strategies.

Requirements

  • 9–15 years of experience in DevOps, Site Reliability Engineering, or Cloud Infrastructure roles.
  • Deep expertise in container orchestration, with production-grade Docker and Kubernetes implementations.
  • Strong hands-on experience with Grafana, monitoring systems, and observability frameworks.
  • Proven track record in cost savings initiatives and infrastructure cost planning in cloud environments.
  • Experience designing highly available, scalable systems in AWS, Azure, or GCP.
  • Strong understanding of Infrastructure-as-Code (Terraform, CloudFormation, etc.).
  • Expertise in CI/CD automation and release management.
  • Solid knowledge of networking, security best practices, and cloud architecture patterns.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
DevOps, Site Reliability Engineering, Cloud Infrastructure, Kubernetes, Docker, Grafana, Infrastructure-as-Code, CI/CD automation, cloud architecture, cost optimization
Soft Skills
cross-functional collaboration, operational excellence, leadership, data-driven decision making, incident response, problem-solving, communication, cost management, efficiency improvement, strategic planning