Developing and maintaining automation tools and infrastructure to streamline software deployment, configuration management, and system monitoring.
Monitoring the performance and availability of software systems, identifying and resolving issues, and implementing proactive measures to prevent future incidents.
Responding to incidents, conducting root cause analysis, participating in post-incident reviews, and implementing corrective actions to prevent similar incidents in the future.
Collaborating with software development teams to ensure that reliability and scalability considerations are incorporated into the software design and implementation.
Identifying opportunities for process improvement, implementing best practices, and driving initiatives to enhance the reliability and performance of software systems.
Requirements
Bachelor's degree in Computer Science or a related field, or equivalent work experience.
Proficiency in at least one programming language (e.g., Python, Go, Java) and familiarity with multiple language ecosystems.
Hands-on experience with cloud platforms.
Demonstrated ability to clearly communicate technical and non-technical information verbally and in writing.
Ability to resolve issues and complete tasks effectively in a team environment.
Benefits
Health insurance
Retirement plans
Paid time off
Flexible work arrangements
Professional development
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
automation tools
infrastructure
software deployment
configuration management
system monitoring
root cause analysis
process improvement
Python
Go
Java