Staff DevOps Engineer

Willowtree, LLC

full-time

Location Type: Hybrid

Location: Toronto • 🇨🇦 Canada

Salary

💰 CA$118,400 - CA$148,000 per year

Job Level

Lead

Tech Stack

Apache, AWS, Cloud, Distributed Systems, DNS, Go, Google Cloud Platform, Grafana, Jenkins, Kafka, Kubernetes, Microservices, Prometheus, Python, TCP/IP, Terraform

About the role

  • Architect and design scalable, secure, and efficient cloud infrastructure solutions
  • Build, refine, and scale custom software solutions that power operational infrastructure
  • Drive technical decisions and establish infrastructure standards and best practices
  • Provide technical leadership and mentorship through code reviews and architecture discussions
  • Design and implement complex infrastructure automation solutions
  • Lead critical infrastructure initiatives and proof-of-concept projects
  • Collaborate with cross-functional teams to solve complex technical challenges
  • Contribute to technical strategy and roadmap planning
  • Research and evaluate new technologies for potential adoption
  • Create and maintain technical documentation for infrastructure systems
  • Design and implement infrastructure to support LLM-provider integrations
  • Architect event-driven systems to support an AI-powered platform
  • Optimize API management and security using tools like Apigee

Requirements

  • Extensive, deep experience in infrastructure engineering
  • Deep technical knowledge of GCP and AWS cloud platforms
  • One or more relevant certifications from Google or AWS
  • Expert-level experience with infrastructure-as-code (Terraform, CloudFormation)
  • Strong background in automation, CI/CD, and DevOps practices
  • Experience designing and implementing large-scale distributed systems
  • Expertise in container orchestration (Kubernetes) and microservices architecture
  • Strong understanding of security best practices and compliance requirements
  • Experience with monitoring, logging, and observability tools (Prometheus, Grafana or similar)
  • Excellent problem-solving and system design skills
  • Strong written and verbal communication skills
  • Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience)
  • Scripting skills: Python, Bash, Go
  • CI/CD tools: Jenkins, GitHub Actions, or similar
  • Networking knowledge: TCP/IP, DNS, Load Balancing
  • Security: IAM, Network Security, Encryption
  • API management: Apigee (preferred)
  • Event-driven architectures: Apache Kafka, Google Pub/Sub, or AWS EventBridge
  • AI/ML infrastructure experience: Kubeflow, MLflow, or similar

Benefits

  • Healthcare benefits - Medical, Vision, Dental
  • Retirement Savings Matching
  • Competitive PTO Policy
  • Employee Assistance Program (EAP)
  • Life & Disability Insurance
  • Eligibility for an annual performance bonus
  • And more!

ATS Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
infrastructure engineering, GCP, AWS, infrastructure-as-code, Terraform, CloudFormation, automation, CI/CD, DevOps, Kubernetes

Soft skills
technical leadership, mentorship, problem-solving, system design, written communication, verbal communication

Certifications
Google certification, AWS certification