Salary
💰 $154,800 - $248,400 per year
Tech Stack
Ansible, AWS, Azure, Cloud, EC2, Google Cloud Platform, Kafka, Kubernetes, Terraform
About the role
- Build, manage, and optimize foundational systems such as Kubernetes clusters, ecosystems, cloud-based resources (e.g., EC2, S3), and edge routing
- Develop building blocks with well-defined lifecycles, scaling strategies, and maintenance
- Implement robust, scalable architecture to support the seamless operation of platform services
- Ensure high availability and fault tolerance across all infrastructure components
- Collaborate with platform software engineers and site reliability engineers to define and meet Service Level Objectives (SLOs)
- Continuously monitor and tune system performance and iterate on infrastructure based on user feedback and metrics
- Participate in on-call rotations, incident resolution, retrospectives, and proactive prevention
- Implement self-healing, failover strategies, backups, disaster recovery plans, and proactive monitoring/alerting
- Partner with engineering teams to integrate infrastructure solutions, share expertise, and document processes and best practices
Requirements
- 5+ years managing large-scale infrastructure systems in production environments
- Hands-on expertise with Kubernetes, cloud services (AWS/GCP/Azure), and configuration management tools
- Nice to have: Kafka
- Proficiency in infrastructure as code (IaC) tools like Terraform or Ansible
- Strong knowledge of network architecture, security, and performance tuning
- Familiarity with containerization, service discovery, and load-balancing technologies
- Experience participating in on-call rotations and incident response for infrastructure-related incidents
- Experience implementing backups, disaster recovery, self-healing, failover strategies, monitoring, and alerting
Benefits
- Competitive compensation that may include equity
- Equity grants of restricted stock (RSUs)
- Retirement and Employee Stock Purchase Plans
- Flexible paid time off
- Comprehensive benefit plans covering medical, dental, vision, life, and disability
- Family services that include fertility benefits and equal paid parental leave
- Professional development supported by formal career pathing, learning platforms, and a yearly learning stipend
- A curated in-office employee experience, designed to foster community, team connections, and innovation
- Opportunities to give back to your community, including an annual company-wide Volunteer Week and donation matching
- Employee Resource Groups that provide supportive communities within Braze
- Collaborative, transparent, and fun culture recognized as a Great Place to Work®
ATS Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Kubernetes, AWS, GCP, Azure, Terraform, Ansible, Kafka, infrastructure as code, network architecture, performance tuning
Soft skills
collaboration, incident resolution, proactive prevention, documentation, communication