Tech Stack
AWS, Azure, Cloud, Docker, Elasticsearch, Google Cloud Platform, Kubernetes, Microservices, Prometheus, Terraform
About the role
- Optimize database performance and ensure seamless operation without impacting service availability.
- Enhance infrastructure security, scalability, and performance across on-prem and cloud environments.
- Continuously improve DevOps practices by implementing automation, monitoring, and incident response strategies.
- Troubleshoot complex networking, system, and application issues across distributed environments.
- Operate and optimize Prometheus/Thanos for large-scale metrics collection, storage, and alerting.
- Manage and maintain the EFK (Elasticsearch, Fluent Bit, Kibana) stack for centralized logging and observability.
Requirements
- 5–8 years of hands-on experience managing cloud infrastructure and monitoring systems.
- Ability to proactively identify and resolve infrastructure and operational challenges.
- Hands-on experience with Docker and Kubernetes in production environments.
- Solid experience with cloud platforms such as AWS, GCP, or Azure.
- Understanding of and experience with microservices architecture (MSA).
- Strong Computer Science fundamentals, including data structures, networking, and operating systems.
- Familiarity with infrastructure as code (IaC) and configuration management tools.