Tech Stack
AWS, Cloud, Google Cloud Platform, ITSM, Switching
About the role
- Design and operate a hybrid infrastructure (geo-distributed colocation data centers and cloud platforms such as GCP, AWS, and Alibaba).
- Develop a standardized process for monitoring and diagnosing hardware issues to ensure system stability and high availability.
- Administer and optimize AWS and GCP environments. Control infrastructure resource utilization and balance workloads across systems.
- Evaluate and integrate modern hardware components.
- Apply FinOps practices: monitor and analyze infrastructure expenses and propose concrete cost optimizations.
- Ensure the development and regular testing of the Disaster Recovery Plan for infrastructure.
- Maintain up-to-date infrastructure documentation.
- Build and improve processes within the Infrastructure system engineering team.
- Design and maintain secure network architecture, including core routing protocols (BGP, OSPF, etc.), VPN technologies (IPsec, SSL, etc.), and connection types between on-premise and cloud resources (Direct Connect, Interconnect, peering).
- Apply ITSM best practices: Incident Management (rapid issue resolution and service restoration), Problem Management (root cause analysis and long-term fixes), and Change Management (risk assessment and controlled implementation of infrastructure changes).
- Support colocation Data Centres; establish and manage SLAs with suppliers.
Requirements
- Proven experience with on-premise hardware and modern infrastructure components.
- Strong skills in cloud administration (AWS, GCP).
- Hands-on experience with monitoring tools, resource utilization, and troubleshooting.
- Solid understanding of FinOps methodology and practical cost optimization cases.
- Experience in building and testing Disaster Recovery plans.
- Hands-on experience with networking protocols and technologies (BGP, OSPF, IPsec, VPNs, routing and switching).
- Experience in hybrid and multi-cloud connectivity.
- Background in Data Centre operations and supplier management.
- Familiarity with ITIL/ITSM frameworks (Incident, Change, Problem Management).
- English at B2 level or higher, fluent Russian.
- (Nice to have) Experience in leading and managing engineering teams.
- Integrity & loyalty
- Team player with advanced communication and collaboration skills
- A hands-on, can-do attitude: always looking for solutions and thinking outside the box
- Overachiever mentality
- Ability to work and succeed in a fast-paced, ever-changing environment