Tech Stack
Ansible, AWS, Cloud, DNS, EC2, Firewalls, Grafana, Linux, Perl, Prometheus, Shell Scripting, SQL, TCP/IP, Terraform
About the role
- Manage, monitor, and optimize Linux-based systems and servers.
- Troubleshoot OS-level, network, and performance issues.
- Deploy, configure, and manage AWS services including EC2, S3, RDS, IAM, CloudWatch, and VPC.
- Optimize cost and performance of AWS environments and implement high-availability and disaster recovery strategies.
- Develop automation, deployment, and monitoring scripts in Bash and Perl, and write SQL queries as needed.
- Maintain Infrastructure-as-Code using Terraform or CloudFormation.
- Implement observability solutions (CloudWatch, Prometheus, Grafana, ELK stack) and ensure systems meet their SLAs and SLOs, as measured by SLIs.
- Respond to incidents, perform root cause analysis, and participate in on-call rotations.
- Collaborate with development teams to embed reliability best practices and drive improvements in CI/CD and release processes.
Requirements
- Strong proficiency in Linux administration and troubleshooting.
- Solid hands-on experience with AWS services (EC2, S3, RDS, IAM, CloudWatch, VPC, etc.).
- Proficiency in scripting and query languages: Bash, Perl, and SQL.
- Experience with system monitoring and logging tools (CloudWatch, Nagios, ELK, Prometheus, Grafana).
- Understanding of networking fundamentals (DNS, TCP/IP, VPN, firewalls).
- Experience with automation/Infrastructure-as-Code tools like Ansible, Terraform, or CloudFormation.
- Strong problem-solving and incident management skills.