
Senior Site Reliability Engineer
Akamai Technologies
Full-time
Location Type: Remote
Location: Poland
About the role
- Working on Internet technologies to improve the performance, availability, and scalability of large distributed content delivery systems.
- Monitoring system performance, identifying bottlenecks, and implementing solutions to ensure high availability and reliability of services.
- Ensuring platform availability and performance, analyzing data to debug issues, and implementing solutions to prevent future occurrences.
- Enhancing CI/CD workflows and secure deployment methods for platform services.
- Participating in on-call rotations and incident response, conducting root cause analyses (RCA), and running postmortems effectively and collaboratively.
- Automating deployment processes and system configurations to improve consistency and reduce manual intervention.
- Collaborating with development teams to optimize application performance and improve system architecture.
Requirements
- Have 4+ years of relevant experience as an SRE and a Bachelor's degree in Computer Science, Engineering, or a related field.
- Have practical expertise with containerization technologies such as Kubernetes, Docker, and related compute platforms.
- Demonstrate expertise with scripting languages such as Python, Bash, and JavaScript.
- Work with monitoring tools (e.g., Prometheus, Grafana, ADBMS, Datadog) to build metrics, alerts, and dashboards and to resolve technical issues.
- Show fluency working in a UNIX/Linux computing environment.
- Be proficient with a configuration management tool such as Ansible, SaltStack, Chef, Puppet, or similar.
Benefits
- Professional development opportunities
- Flexible working arrangements
- Paid time off
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
containerization, Kubernetes, Docker, Python, Bash, JavaScript, monitoring tools, Prometheus, Grafana, Ansible
Soft Skills
collaboration, problem-solving, incident response, root cause analysis, communication