Kraken

Site Reliability Engineer II

full-time

Location: 🇺🇸 United States

Salary

💰 $118,000 - $150,000 per year

Job Level

Mid-Level, Senior

Tech Stack

AWS, Django, Docker, Kubernetes, Linux, Postgres, Python, RabbitMQ, RDBMS, Terraform

About the role

  • Teach and support product teams on best practices for reliability, implementation patterns and effective usage of our existing platforms
  • Support product teams in improving the performance and availability of their systems
  • Be hands-on in code and infrastructure to help product teams with reliability improvements
  • Provide comprehensive feedback to the wider Platform group on improvements to be made to core infrastructure based on observations and first-hand experience in the code base
  • Support product teams in building proofs of concept as needed, evolving the application deployment architecture to keep pace with business growth and to improve scalability and system resilience
  • Collaborate with product teams to support the release of new features and services, ensuring adherence to reliability and performance standards
  • Guide product teams in designing systems for resilience and graceful failure under heavy load
  • Assist application teams with post-incident tasks and follow-ups, and contribute to the creation and review of post-mortem documentation
  • Analyse incident metrics to identify trends and potential improvements, communicating these insights to the product teams
  • Help solve interesting and difficult problems to drive disruption in the global energy market

Requirements

  • Great communication skills, working effectively with developers, product managers and other business stakeholders to understand, design and deliver impactful projects and reliability improvements
  • Proficiency with AWS; we use a wide range of AWS services, not just the standard few
  • Strong Python skills, particularly with Django, the Django ORM and Celery
  • Expertise in PostgreSQL or a similar RDBMS, particularly in Amazon RDS at scale
  • Experience with Docker and Kubernetes (we use Amazon EKS in production)
  • Experience with Datadog or a similar logging/monitoring tool
  • Experience with message queues and event-driven asynchronous processing (we use RabbitMQ)
  • Experience with Terraform or a similar infrastructure-as-code tool
  • Experience working with a Linux distribution
  • Previous experience working in small, highly autonomous teams
  • (Helpful) Previous experience as a Site Reliability Engineer
  • (Helpful) Experience working on SaaS platforms and engaging product teams
  • (Helpful) Experience managing and supporting large-scale internet-facing services
  • (Helpful) Experience responding to incidents and outages, writing incident reports and organising retrospectives
  • (Helpful) Experience working with very large relational databases
  • (Helpful) Experience using service level objectives to improve application performance
  • (Helpful) A proactive, innovative mindset