Senior DevOps Engineer, Infrastructure – Reliability

Worth AI

Full-time

Posted on: 2/16/2026

Location Type: Hybrid

Location: Orlando • Florida • United States

Job Level

Senior

Tech Stack

AWS, Cloud, Distributed Systems, Kafka, Kubernetes, Postgres, Terraform

About the role

Conduct regular interviews with engineering teams to identify operational pain points in CI/CD, deployments, observability, and cloud environments, and proactively eliminate them.
Design and implement scalable Infrastructure-as-Code patterns using tools like Terraform to standardize cloud provisioning and reduce configuration drift.
Own and evolve our Kubernetes platform (EKS or self-managed), ensuring workloads are secure, scalable, and resilient by default.
Architect and optimize CI/CD pipelines to improve deployment frequency, reduce lead time, and increase confidence in releases.
Lead systemic reliability initiatives, including incident response improvements, root cause analysis practices, and postmortem frameworks.
Design and enforce secure networking, IAM, and secrets management strategies across environments.
Improve observability by refining metrics, logs, and tracing using tools like Datadog, ensuring actionable insight into system health.
Optimize cloud cost efficiency through rightsizing, autoscaling strategies, and architectural improvements.
Own disaster recovery planning, backup strategies, and multi-region resilience initiatives.
Refactor brittle or manually managed infrastructure into automated, testable, and reproducible systems.
Introduce new infrastructure tooling or architectural shifts and drive adoption through documentation, workshops, and hands-on support.
Lead by example in incident management, risk mitigation, and operational excellence.
Communicate technical trade-offs clearly across engineering and product stakeholders, balancing speed with safety.

Requirements

8+ years of experience in DevOps, SRE, or Infrastructure Engineering roles.
Proven experience designing and operating production Kubernetes environments at scale.
Deep hands-on expertise with AWS infrastructure and cloud networking.
Strong experience building and maintaining Terraform modules across large cloud environments.
Demonstrated ownership of CI/CD systems and measurable improvement of DORA metrics.
Experience leading incident response processes and driving meaningful postmortem outcomes.
Strong understanding of distributed systems, event-driven architectures (Kafka), and database performance (PostgreSQL).
Proven ability to modernize legacy infrastructure and eliminate manual operational toil.
Experience navigating high-ambiguity environments and translating operational friction into prioritized infrastructure roadmaps.
Demonstrated ability to build trust across teams while raising the reliability bar.

Benefits

Health Care Plan (Medical, Dental & Vision)
Retirement Plan (401k, IRA)
Life Insurance
Flexible Vacation
Work From Home
Free Food & Snacks (in office)
Orlando, Florida (Hybrid)
Wellness Resources

Applicant Tracking System Keywords

Hard Skills & Tools

Infrastructure-as-Code, Terraform, Kubernetes, CI/CD, AWS, cloud networking, DORA metrics, distributed systems, event-driven architectures, PostgreSQL

Soft Skills

incident management, risk mitigation, operational excellence, communication, leadership, trust building, problem solving, collaboration, prioritization, adaptability