DistantJob

Senior DevOps Engineer

full-time

Origin: 🇺🇸 United States

Job Level

Senior

Tech Stack

AWS, Cloud, Docker, Go, Grafana, Jenkins, Kubernetes, Prometheus, Python, Terraform

About the role

  • Design, implement, and maintain secure, scalable cloud infrastructure for high-throughput, low-latency applications (mainly AWS, with flexibility for multi-cloud environments)
  • Develop and enhance CI/CD pipelines for efficient, reliable, and consistent deployment processes (GitHub Actions and similar tools)
  • Continuously identify and implement infrastructure improvements across the entire product ecosystem
  • Architect and support serverless solutions using AWS Lambda, ECS Fargate, and event-driven architecture components
  • Establish comprehensive monitoring, alerting, and logging frameworks to ensure system reliability and visibility
  • Manage infrastructure-as-code implementations using CloudFormation, Terraform, and related tools
  • Partner with development teams to determine infrastructure requirements, deployment approaches, and operational toolsets
  • Oversee secrets management and identity systems utilizing AWS IAM and similar platforms
  • Maintain compliance with security and privacy standards, including access controls, encryption protocols, and audit mechanisms
  • Resolve production incidents across all service layers with a focus on rapid response and thorough post-incident analysis
  • Create and maintain automated backup, disaster recovery, and failover systems
  • Research and adopt emerging DevOps methodologies and technologies to enhance platform performance and reliability

Requirements

  • 10+ years of software engineering background with 6+ years focused on DevOps, Site Reliability Engineering, or Infrastructure Engineering
  • Demonstrated experience managing production environments with comprehensive infrastructure responsibilities
  • Extensive AWS expertise, including compute, storage, networking, and identity management services
  • Practical experience with serverless technologies (AWS Lambda, Step Functions, EventBridge, API Gateway, ECS Fargate)
  • Advanced skills in Docker and container orchestration platforms (Kubernetes, ECS, GKE)
  • Proficiency with CI/CD platforms such as GitHub Actions, CircleCI, ArgoCD, or Jenkins
  • Strong scripting capabilities in Bash, Python, or Go for automation and tooling development
  • Experience with observability solutions (Datadog, Prometheus, Grafana, ELK stack)
  • Solid understanding of network design, security frameworks, and zero-trust access architectures
  • Knowledge of secrets management systems and infrastructure-level access policy enforcement
  • Exceptional troubleshooting and root cause analysis abilities
  • Strong collaborative and communication skills across diverse technical and business teams
  • Continuous improvement mindset with focus on automation, optimization, and security enhancement