
Senior Infrastructure Engineer
VGS
full-time
Location Type: Hybrid
Location: California, Colorado, Connecticut, Florida, Illinois, New York, North Carolina, Oregon, Texas, Virginia, Washington • 🇺🇸 United States
Salary
💰 $140,000 - $190,000 per year
Job Level
Senior
Tech Stack
AWS, Cloud, Distributed Systems, Docker, Go, Grafana, Java, Kafka, Kubernetes, Linux, Microservices, Prometheus, Python, Spring, Swift, Terraform
About the role
- Architect and maintain scalable, reliable infrastructure: Design and optimize infrastructure for high availability, fault tolerance, and performance across distributed systems.
- Lead incident management and root cause analysis: Own incident response processes, ensure swift resolution of issues, and drive post-incident improvements to prevent recurrences.
- Service monitoring and automation: Build and maintain automated monitoring, alerting, and healing systems that improve system health, reduce manual intervention, and minimize downtime.
- Performance tuning and capacity planning: Identify bottlenecks and optimization opportunities, and implement scaling strategies to handle traffic spikes and growing workloads efficiently.
- Collaborate with cross-functional teams: Work closely with software engineers, product teams, and DevOps to enhance system reliability and delivery pipelines.
- Improve operational processes: Champion continuous improvement initiatives in deployment, scaling, and performance testing, while advocating for the adoption of SRE best practices across the organization.
- Mentorship and leadership: Provide technical mentorship to junior engineers, contribute to strategic decisions around infrastructure, and ensure best practices are implemented at scale.
- Be proactive and innovative: We rely on your feedback to build a world-class product.
- Be a part of a team that believes in the core values of transparency, collaboration, grit, and humility; in going above and beyond what is required to do the right thing for our customers and the company; and in having fun while doing all this!
Requirements
- Proven experience in Infrastructure/SRE roles, with a track record of managing production systems in complex, large-scale environments.
- Strong proficiency in AWS, including infrastructure-as-code (Terraform, CloudFormation, etc.).
- Solid understanding of cloud-native architecture, Linux systems, microservices, infrastructure-as-code (Terraform, CloudFormation, CDK), CI/CD (CircleCI, GitHub Actions, Argo), GitOps, authentication and authorization, APIs and API Gateway, Docker, Kubernetes (EKS), Kafka (MSK), Java, Spring Framework, Python, and AWS services.
- Strong plus if you are a database wiz.
- Expertise in monitoring and observability tools such as Prometheus, Grafana, OpenTelemetry, New Relic, or similar, to measure system health and performance.
- Programming and scripting experience in languages such as Python, Go, Bash, or other relevant languages used in automating infrastructure.
- Solid understanding of networking, security, and load balancing in cloud-native environments.
- Strong communication and collaboration skills, with the ability to lead cross-functional initiatives and mentor junior team members.
- Experience with incident management and disaster recovery best practices.
- Strong written and verbal communication skills.
Benefits
- Flexible work hours and flexible PTO
- Competitive health benefits
- VGS stock options
- 401(k) plan with a 4% employer match and immediate vesting (available only for US employees)
- Life & disability insurance
- Pre-tax flexible spending accounts, dependent and healthcare FSA (available only for US employees)
- Global parental leave program
- Employee Assistance Program
- Home Internet reimbursement
- New hire home office set-up allowance
- Professional learning reimbursement
Applicant Tracking System Keywords
Hard skills
AWS, Terraform, CloudFormation, Linux Systems, microservices, CI/CD, Docker, Kubernetes, Java, Python
Soft skills
leadership, mentorship, communication, collaboration, proactive, innovative, transparency, grit, humility, continuous improvement