Writer

Infrastructure Engineer

Posted 5/11/2026 • Full-time • London, United Kingdom • Senior, Lead

Tech Stack

Tools & technologies
AWS, Azure, Cloud, Distributed Systems, Docker, Go, Google Cloud Platform, Grafana, Java, Kubernetes, Prometheus, Python, Terraform

About the role

Key responsibilities & impact
  • Automate operational tasks and infrastructure management by developing robust tools and platforms using Python, Go, or similar languages, significantly reducing manual toil across our production environment
  • Design and implement scalable, fault-tolerant infrastructure solutions on public cloud providers (AWS, GCP, Azure) to support WRITER's rapidly expanding, high-traffic AI platform
  • Own the reliability, performance, and efficiency of WRITER’s core services, defining and upholding stringent Service Level Objectives (SLOs) and Error Budgets
  • Own the observability stack for monitoring, logging, and alerting systems to ensure rapid detection of issues across our complex distributed systems
  • Lead incident response, post-mortems, and root cause analyses, applying learnings to proactively prevent future outages and build a more resilient system architecture
  • Collaborate closely with product and engineering teams, providing expert guidance on system design for reliability, performance, and scalability from conception through launch

Requirements

What you’ll need
  • 7+ years of experience in infrastructure engineering, DevOps, or a similar role focused on building and operating large-scale, high-availability production systems
  • Deep expertise with cloud platforms (AWS strongly preferred), containerization technologies like Docker and Kubernetes, and Infrastructure-as-Code tools such as Terraform
  • Strong proficiency in programming languages such as Python, Java, or Go for automation and monitoring
  • Knowledge of monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack) to maintain system health and performance
  • Demonstrated ability to Challenge the status quo, proactively identify systemic weaknesses, and propose innovative solutions to complex reliability problems
  • Excellent communication, collaboration, and problem-solving skills, with a talent for building strong relationships and Connecting with cross-functional teams
  • A strong sense of ownership and accountability, eager to Own mission-critical systems and drive them toward peak performance and unparalleled reliability

Benefits

Comp & perks
  • Generous PTO, plus company holidays
  • Comprehensive medical and dental insurance
  • Paid parental leave for all parents (16 weeks)
  • Fertility and family planning support
  • Early-detection cancer testing through Galleri
  • Competitive pension scheme and company contribution
  • Annual work-life stipends for:
      • Wellness (gym, massage/chiropractor, personal training, etc.)
      • Learning and development
  • Company-wide off-sites and team off-sites
  • Competitive compensation and company stock options

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
Python, Go, Java, Infrastructure-as-Code, DevOps, Containerization, Monitoring, Automation, Fault-tolerant systems, Scalable infrastructure
Soft Skills
Communication, Collaboration, Problem-solving, Ownership, Accountability, Innovative thinking, Relationship building, Proactive identification of issues, Leadership, Incident response