
Application SRE, DevOps
ELLKAY
Full-time
Location Type: Hybrid
Location: Elmwood Park • New Jersey • United States
Salary
💰 $90,000 - $110,000 per year
About the role
- Own application reliability, availability, performance, and scalability in production and non-production environments
- Design, build, and maintain CI/CD pipelines for application deployments
- Automate infrastructure provisioning and configuration using Infrastructure as Code
- Monitor application health using metrics, logs, and traces; define SLIs, SLOs, and error budgets
- Lead incident response and root-cause analysis (RCA), ensuring corrective and preventive actions are completed and communicated
- Improve system resilience through capacity planning, system tuning, and fault tolerance
- Partner with development teams to ensure services meet reliability, performance, and scalability objectives
- Reduce manual operational effort through automation and self-healing solutions
- Serve as a point of contact for critical Sev1/Sev2 incidents, leading incident command when required
Requirements
- Strong experience as an SRE, DevOps Engineer, or Production Support Engineer
- Solid understanding of Windows and Linux/Unix systems and of networking fundamentals
- 7 years of experience as an SRE
- Hands-on experience with cloud platforms such as AWS, Azure, or GCP
- Experience with containerization and orchestration tools like Docker and Kubernetes
- Proficiency in CI/CD tools such as Jenkins, GitHub Actions, or similar
- Experience with Infrastructure as Code tools like Terraform, CloudFormation, or ARM
- Strong scripting skills in Python, Bash, or similar languages
- Experience with monitoring and observability tools (Prometheus, Grafana, ELK, Datadog, etc.)
- Understanding of reliability concepts such as SLAs, SLOs, and incident management
Benefits
- Medical, Dental, and Vision benefits
- Employer-paid Life and LTD
- 401k w/ matching – once eligibility is met
- Work/life balance
- Paid Volunteer Program
- Flexible working hours
- Generous FTO
- Remote work options
- Employee Discounts
- Parental Leave
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
SRE, DevOps, Production Support, Windows, Linux, Networking, AWS, Azure, GCP, Python
Soft Skills
incident response, root-cause analysis, communication, leadership, collaboration, problem-solving, capacity planning, system tuning, fault tolerance, automation