RELX

Systems Engineering Lead (Lead Ops Engineer)

full-time

Origin: 🇺🇸 United States • Kentucky


Job Level

Senior

Tech Stack

Ansible, AWS, Azure, Chef, Cloud, Docker, Go, Google Cloud Platform, Jenkins, Kubernetes, Linux, OpenShift, Puppet, Python, Ruby, SDLC, Terraform

About the role

  • Mentoring the team and the wider organization in DevOps and Cloud practices: you will lead working sessions with team members, provide guidance, ensure that engineers are working efficiently, and coach your engineers into a highly effective, adaptable, cross-functional Technical Infrastructure & Operations team.
  • Contributing to common ‘paved road’ and self-service modules that allow teams to move at pace and to operate and maintain software systems across the full breadth of the SDLC.
  • Providing day-to-day leadership and mentorship of your team, fostering an environment where innovation and learning thrive, and developing an inclusive, engaged, agile culture that promotes collaboration and lets our colleagues grow their careers.
  • Ensuring Elsevier’s operational frameworks, policies, and best practices are consistently applied, improving the reliability and performance of Elsevier’s product portfolio, including systems design, incident management, disaster recovery, lifecycle management products, and L3 support response to tool outages and performance alerts.
  • Leading Platform automation to increase deployment frequency, minimize change failures, maintain service levels, and ensure security through optimal construction and implementation of CI/CD pipelines that provide consistency and reliability throughout the lifecycle, using scripting languages (Python, Bash, PowerShell), infrastructure as code (Terraform, CloudFormation), and cloud-native tools such as Lambda functions.
  • Driving modernization of our entire technology stack and aligning with architects to implement multi-region AWS infrastructure, accelerating Platform innovations by ensuring we maintain and promote the use of secure, high-performing, and reliable frameworks and shared services.
  • Working closely with the Operational Command center: monitoring the health of the tools, triaging issues, troubleshooting tooling and integration issues efficiently while effectively communicating escalations and outcomes, debugging production issues across all levels of the stack, and embracing a blameless culture to prevent incidents from happening.

Requirements

  • Advanced problem-solving experience leading teams in identifying, researching, and coordinating the resources necessary to troubleshoot and diagnose complex project issues; prior success translating findings into alternatives and solutions; and good fault-diagnosis skills, with the ability to assess and prioritize faults and respond or escalate accordingly.
  • Demonstrated experience and best practices in AWS architecture, with an accreditation or proficiency in Amazon Web Services (AWS); ability to lead the development of technical standards and perform reviews to ensure enterprise and architectural standards and processes are followed.
  • Experience facilitating technical and planning meetings to identify the best outcomes, seeking diverse ideas and perspectives from a variety of sources to create better solutions, products, and services, and championing innovation within your squads and across the organization.
  • Experience authoring CI/CD pipelines and automation elements related to infrastructure composition, deployment orchestration, and monitoring, with the ability to build and deploy code using Jenkins or similar tools that allow for rapid release of high-quality software.
  • Experience in containerization and orchestration (with Docker and Kubernetes respectively) and in deploying applications with container technology (OpenShift, Docker, Kubernetes, etc.), along with expert-level experience managing and administering Linux servers at scale.
  • Experience deploying and integrating monitoring, backup/restore, and related tooling in the cloud, including large-scale monitoring and reporting (New Relic) and running and managing ELK, with demonstrable knowledge and expertise in 24x7 operational support of systems hosted on a major cloud provider (AWS, GCP, Azure).
  • Knowledge of writing and using modular Terraform at scale and of Infrastructure as Code (IaC) as an AWS automation technology, including modern scripting and object-oriented programming, configuration management, and deployment via Ansible, Puppet, or Chef for multi-region cloud-based environments.
  • Possess a proven record of implementing DevOps and SRE methodologies, principles, and practices.
  • Through data and metrics, you can demonstrate a holistic view of how these working practices help build and support products for our team.
  • Solid knowledge of the scripting technologies required to build tooling integrations (Ruby, Bash, Go, Python, or other scripting languages), and experience using modern scripting and object-oriented programming languages as a contributing member of an agile dev squad.