Replit

Senior Infrastructure Engineer

full-time

Location Type: Hybrid

Location: Foster City, California, United States

Salary

💰 $190,000 - $240,000 per year

About the role

  • Drive Automation and Infrastructure as Code: Build and improve automation to eliminate toil and operational work. Maintain CI/CD pipelines and infrastructure automation using tools like Terraform or Pulumi. Create self-healing systems that can automatically respond to common failure scenarios.
  • Optimize Performance and Infrastructure: Collaborate with core infrastructure and product teams to tune and optimize the performance of our cloud deployments (Kubernetes, Docker, GCP). Identify and resolve performance bottlenecks and implement capacity planning strategies.
  • Elevate Developer Experience: Design and implement improvements to our build, test, and deployment systems to make software delivery faster, safer, and more reliable for all engineers.
  • Drive Cross-Team Improvements: Partner with service owners across Replit to understand their pain points, and collaborate on implementing build/test/deploy enhancements within their specific services.
  • Build Shared Tooling: Create and maintain centralized tooling and automation that improves the engineering lifecycle, from local development to production monitoring.
  • Debug and Harden Systems: Dive deep into debugging difficult technical problems, making our systems and products more robust, operable, and easier to diagnose.
  • Collaborate on Design Reviews: Participate in feature and system design reviews, contributing expertise on security, scale, and operational considerations.
  • Build and Integrate: Write high-quality, well-tested code to meet the needs of your customers, including building pipelines to integrate with third-party vendors.

Requirements

  • 4+ years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering).
  • Strong programming skills in languages like Python or Go.
  • You write high-quality, well-tested code.
  • Solid understanding of distributed systems. You've built, scaled, and maintained production services and understand service-oriented architecture.
  • Experience with container orchestration platforms (Kubernetes) and cloud-native technologies.
  • Experience implementing and maintaining monitoring/observability solutions, with strong skills in debugging and performance tuning.
  • Strong incident management skills with experience participating in incident response and demonstrated critical thinking under pressure.
  • Experience with infrastructure as code (e.g., Terraform) and configuration management tools.
  • Excellent written and verbal communication skills, with an ability to explain technical concepts clearly.
  • A willingness to dive into understanding, debugging, and improving any layer of the stack.
  • You're passionate about making software creation accessible and empowering the next generation of builders.

Benefits

  • Competitive Salary & Equity
  • 401(k) Program
  • Health, Dental, Vision and Life Insurance
  • Short Term and Long Term Disability
  • Paid Parental, Medical, Caregiver Leave
  • Commuter Benefits
  • Monthly Wellness Stipend
  • Autonomous Work Environment
  • In Office Set-Up Reimbursement
  • Flexible Time Off (FTO) + Holidays
  • Quarterly Team Gatherings
  • In Office Amenities

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, Go, CI/CD, Terraform, Pulumi, Kubernetes, Docker, GCP, monitoring solutions, performance tuning
Soft Skills
incident management, critical thinking, communication, collaboration, debugging, problem-solving, design review participation, customer focus, adaptability, teamwork