Tech Stack
Ansible, Apache, AWS, Azure, Cloud, DNS, Docker, Firewalls, Google Cloud Platform, Grafana, Jenkins, Kubernetes, Linux, NGINX, Prometheus, Puppet, Python, TCP/IP, Terraform
About the role
- Manage and troubleshoot Linux-based systems and system services, including performance tuning, backups and recovery
- Configure and manage web servers (Apache/Nginx), virtual hosts and SSL
- Manage cloud-based and on-premises infrastructure (compute, storage, networking) and IAM
- Provision and manage infrastructure using Terraform, following Git-based version control best practices
- Automate configuration with Ansible/Puppet and maintain CI/CD pipelines using Jenkins/GitHub Actions/GitLab CI
- Containerize applications with Docker and support Kubernetes deployments
- Implement and maintain monitoring, logging and alerting with Prometheus, Grafana and ELK/Loki
- Write and maintain Bash and Python automation scripts and internal tools
- Participate in the on-call rotation, incident investigation and resolution, and post-incident reviews
- Apply security best practices and system hardening for Linux and cloud environments
- Collaborate with developers and senior engineers to implement scalable, secure, and reliable infrastructure
Requirements
- Hands-on experience managing Linux servers and system services, troubleshooting performance issues, and handling backups and recovery
- Good understanding of networking concepts: TCP/IP, DNS, firewalls (iptables/nftables), basic load balancing and troubleshooting in cloud and on-prem environments
- Experience configuring and managing web servers (Apache or Nginx), virtual hosts and SSL certificates
- Exposure to public cloud infrastructure (AWS / GCP / OCI / Azure) including compute, storage, networking and IAM; familiarity with cloud monitoring and backup tools
- Practical experience with Infrastructure as Code using Terraform, and with Git for version control, following best practices
- Working knowledge of configuration management tools like Ansible or Puppet
- Experience building and maintaining CI/CD workflows using Jenkins, GitHub Actions, or GitLab CI
- Hands-on experience with Docker; exposure to Kubernetes is desirable
- Familiarity with monitoring tools like Prometheus and Grafana; basic understanding of log aggregation systems (ELK or Loki) and alerting
- Comfortable writing Bash scripts and basic Python code for automation and API interactions
- Exposure to on-call support, incident response, troubleshooting production issues and participating in post-incident reviews
- Basic understanding of Linux security and system hardening: TLS, SSH hardening, patching, and IAM usage within cloud environments
- Ability to work closely with developers and senior engineers, follow project plans, and deliver tasks within timelines