Salary
💰 $130,000 - $165,000 per year
Tech Stack
Ansible, AWS, Chef, Cloud, Distributed Systems, Docker, Go, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Packer, Perl, Prometheus, Python, Ruby, Terraform
About the role
WideOrbit is looking for a Senior Platform Engineer with a strong focus on Observability to help design, build, and maintain the internal platform and tooling that powers our engineering organization.
In addition to contributing to the reliability of our products, core platform capabilities and infrastructure automation, you’ll lead efforts to improve our product observability footprint - defining meaningful SLIs/SLOs, building reliable dashboards and alerts, and driving a culture of proactive monitoring and resilience across our systems and services.
You’ll collaborate closely within DevOps and across Development, QA and Security teams to ensure that our services are well-instrumented, scalable, and resilient.
Your work will empower engineers with the tools and insights they need to ship confidently and operate efficiently.
At WideOrbit, we value intellectual curiosity, collaboration, and a growth mindset. We encourage engineers to take ownership of impactful projects while supporting continuous learning and innovation.
This role offers you the opportunity to influence both how we build and how we observe our systems - shaping the foundation for reliability at scale.
Here’s what success will look like:
Own reliability & performance for WideOrbit products - designing, building, and operating core platform services in code (Terraform, Ansible, etc.)
Engineer observability end-to-end - metrics, logs, traces, SLOs, actionable alerts - so incidents are detected and resolved before customers notice.
Create self-service tooling (provisioning, deploy, rollback, scale-out) so Dev teams can move fast without sacrificing quality
Automate everything: golden AMIs/container images, CI/CD pipelines (GitHub Actions, TeamCity, Octopus), and recurring maintenance tasks.
Manage modern runtimes - Docker/K8s (required) and serverless where it fits - balancing cost, performance, and security
Embed security & governance (least-privilege IAM, secret rotation, compliance checks) directly into IaC and pipeline stages.
Troubleshoot production issues in partnership with Dev, DevOps and DBAs; participate in an on-call rotation
Shape the platform roadmap, evaluate new tech, document best practices, and mentor peers in SRE/DevOps principles
Requirements
8+ years of experience in Platform / SRE / DevOps / Infrastructure Engineering with a focus on availability and reliability
Proficiency in Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation, Ansible).
Strong experience with at least one major cloud provider (GCP or AWS preferred)
Bash, Python, PowerShell, Go, Perl, Ruby – you know which one to use and when.
The OS is just one part of the ecosystem – you're well versed in the rest (networking, storage, DBs, etc.).
Terraform, Packer, CloudFormation, Ansible, Chef, Helm – you know where each is strong and weak
Experience designing and implementing CI/CD pipelines (e.g., GitHub Actions, TeamCity, Octopus, Jenkins)
Experience designing, analyzing, and troubleshooting large-scale distributed systems.
Containerization experience required.
Serverless technologies are a nice-to-have.
Prometheus, Grafana, LogicMonitor, SolarWinds, Datadog – you know that observability is imperative to success.
A strong understanding of security best practices in cloud environments.
Contributions to open-source projects are a nice-to-have.
Ability to coach others, write great docs, and thrive in ambiguity
Bias for automation – you look for repeatable patterns and eliminate toil.
Resourceful & self-directed – you solve problems independently but escalate early when needed