Contribute to the reliability, quality, security, supportability, and scalability of applications
A continuous learner who excels at solution-driven problem solving and applies strong analytical skills
Passionate about DevOps, Site Reliability Engineering, and modern infrastructure architecture and automation principles
Support effective resolution of large-scale incidents to reduce MTTR and customer impact
Participate in post-mortem and root cause analyses
Manage the complete lifecycle of production environments, including necessary configurations, integrations, and application administration activities (operations, migrations, and upgrades)
Assist in troubleshooting performance, integration, and user management issues by digging into application and system logs
Develop procedures and scripts for monitoring and automation of manual processes
Engage with Software Development and Customer Success organizations to troubleshoot issues and participate in any planned activities
Follow ITSM processes for Incident, Request, and Change Management
Maintain system documentation for configuration and troubleshooting of known issues
Participate in the on-call rotation when required to provide support for urgent, off-hours issues
Implement ‘self-healing’ capabilities to limit after-hours and on-call needs
Facilitate research, evaluation, and design of new software solutions
Requirements
3+ years of experience in the information technology industry with a focus on Infrastructure or Operations support
2+ years of experience in SRE and/or DevOps
Proficient in technologies such as Kubernetes, Istio, Rancher, Git, Helm, Ansible, Chef, Docker, Prometheus, Grafana, and AppDynamics
Strong understanding of at least one cloud platform such as AWS, GCP, or Azure
Experience troubleshooting complex cloud infrastructure problems and Linux OS issues
Basic understanding of network security fundamentals
Good verbal and written communication skills, with the ability to work with both technical and non-technical stakeholders
Bachelor’s degree in Computer Science or a related field preferred, or equivalent work experience
Proficient in Linux and Windows commands, in shell scripting with Bash or PowerShell, and in additional scripting languages such as Python and JavaScript
Experience using SDLC and Agile methodologies, with a focus on Scrum and Kanban
Confident in a fast-paced environment with competing priorities, and able to multitask and manage expectations
Benefits
Health insurance
Flexible spending accounts
Health savings accounts
Retirement savings plans
Life and disability insurance programs
Paid time off