Parallel Domain

Senior Site Reliability Engineer

The Senior Site Reliability Engineer manages AWS infrastructure and Kubernetes for autonomous systems testing, collaborating across teams to ensure system reliability and security.

Posted 4/29/2026 • Full-time • Remote • Oregon, Washington • 🇺🇸 United States • Senior • 💰 CA$145,000 - CA$185,000 per year

Tech Stack

Tools & technologies
AWS, Cloud, DNS, Grafana, Kubernetes, Linux, Node.js, Prometheus, Python, Terraform

About the role

Key responsibilities & impact
  • Design, build, and maintain multi-region AWS infrastructure using Terraform.
  • Operate and scale EKS clusters across production regions: autoscaling, node lifecycle, workload health.
  • Manage networking across environments: VPC design, DNS, load balancing, and cross-region connectivity.
  • Support infrastructure changes, migrations, and expansions into new regions.
  • Help build and run incident management processes: severity definitions, escalation paths, on-call practices.
  • Lead incident response, debugging, and root-cause analysis.
  • Write postmortems and drive systemic reliability improvements from what they surface.
  • Improve observability across metrics, logging, tracing, and dashboards.
  • Provide security-conscious feedback on platform architecture decisions.
  • Own cloud IAM governance: roles, policies, and access boundaries across accounts and services.
  • Improve CI/CD pipelines and infrastructure validation.
  • Support engineers with infrastructure debugging, environment setup, and performance issues.
  • Contribute to tooling and automation in Python and Bash.

Requirements

What you’ll need
  • 5+ years in SRE, DevOps, or infrastructure engineering roles, with a track record of operating production systems across multiple regions.
  • Terraform experience: Modules, state management, and multi-environment patterns.
  • AWS depth: Solid experience across VPC, IAM, EKS, S3, and CloudWatch.
  • Kubernetes expertise: Cluster operations, autoscaling, RBAC, and Helm.
  • CI/CD and GitOps: Experience with GitHub Actions, ArgoCD, or similar workflows.
  • Networking fundamentals: CIDR, DNS, load balancing, VPN, and cross-region connectivity.
  • Observability: Experience with tooling such as Prometheus and Grafana.
  • Scripting: Comfort with Python and Bash for tooling and automation.
  • Cross-platform familiarity: Working knowledge of both Linux and Windows environments. Operational experience supporting Windows-based workloads is a meaningful advantage.
  • Pragmatism and ownership: Comfortable in a fast-moving startup with evolving priorities. You take ownership of systems while collaborating closely with other teams, and you're pragmatic about tradeoffs between speed, reliability, and complexity.

Benefits

Comp & perks
  • Equity
  • Full health/dental/vision coverage
  • Learning stipend
  • Generous vacation

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
AWS, Terraform, EKS, Kubernetes, CI/CD, GitOps, Python, Bash, Networking, Observability
Soft Skills
Pragmatism, ownership, collaboration, incident management, debugging, root-cause analysis, systemic reliability improvement