VGS

Senior Infrastructure Engineer

full-time

Location Type: Hybrid

Location: California, Colorado, Connecticut, Florida, Illinois, New York, North Carolina, Oregon, Texas, Virginia, Washington • 🇺🇸 United States

Salary

💰 $140,000 - $190,000 per year

Job Level

Senior

Tech Stack

AWS, Cloud, Distributed Systems, Docker, Go, Grafana, Java, Kafka, Kubernetes, Linux, Microservices, Prometheus, Python, Spring, Swift, Terraform

About the role

  • Architect and maintain scalable, reliable infrastructure: Design and optimize infrastructure for high availability, fault tolerance, and performance across distributed systems.
  • Lead incident management and root cause analysis: Own incident response processes, ensure swift resolution of issues, and drive post-incident improvements to prevent recurrences.
  • Service monitoring and automation: Build and maintain automated monitoring, alerting, and healing systems that improve system health, reduce manual intervention, and minimize downtime.
  • Performance tuning and capacity planning: Identify bottlenecks and optimization opportunities, and implement scaling strategies to handle traffic spikes and growing workloads efficiently.
  • Collaborate with cross-functional teams: Work closely with software engineers, product teams, and DevOps to enhance system reliability and delivery pipelines.
  • Improve operational processes: Champion continuous improvement initiatives in deployment, scaling, and performance testing, while advocating for the adoption of SRE best practices across the organization.
  • Mentorship and leadership: Provide technical mentorship to junior engineers, contribute to strategic decisions around infrastructure, and ensure best practices are implemented at scale.
  • Be proactive and innovative: We rely on your feedback to build a world-class product.
  • Be a part of a team that believes in the core values of transparency, collaboration, grit, and humility; in going above and beyond what is required to do the right thing for our customers and the company; and in having fun while doing all this!

Requirements

  • Proven experience in Infrastructure/SRE roles, with a track record of managing production systems in complex, large-scale environments.
  • Strong proficiency in AWS, including infrastructure-as-code (Terraform, CloudFormation, etc.).
  • Solid understanding of cloud-native architecture, Linux systems, microservices, infrastructure-as-code (Terraform, CloudFormation, CDK), CI/CD (CircleCI, GitHub Actions, Argo), GitOps, authentication and authorization, APIs and API Gateway, Docker, Kubernetes (EKS), Kafka (MSK), Java, Spring Framework, Python, and AWS services.
  • Strong plus if you are a database wiz.
  • Expertise in monitoring and observability tools such as Prometheus, Grafana, OpenTelemetry, New Relic, or similar, used to measure system health and performance.
  • Programming and scripting experience in languages such as Python, Go, Bash, or other relevant languages used in automating infrastructure.
  • Solid understanding of networking, security, and load balancing in cloud-native environments.
  • Strong communication and collaboration skills, with the ability to lead cross-functional initiatives and mentor junior team members.
  • Experience with incident management and disaster recovery best practices.
  • Strong written and verbal communication skills.

Benefits

  • Flexible work hours and flexible PTO
  • Competitive health benefits
  • VGS stock options
  • 401(k) plan with a 4% employer match and immediate vesting (available only for US employees)
  • Life & disability insurance
  • Pre-tax flexible spending accounts, dependent and healthcare FSA (available only for US employees)
  • Global parental leave program
  • Employee Assistance Program
  • Home Internet reimbursement
  • New hire home office set-up allowance
  • Professional learning reimbursement

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
AWS, Terraform, CloudFormation, Linux Systems, microservices, CI/CD, Docker, Kubernetes, Java, Python
Soft skills
leadership, mentorship, communication, collaboration, proactive, innovative, transparency, grit, humility, continuous improvement