Tech Stack
Tools & technologies: AWS, Cloud, Distributed Systems, Java, JavaScript, Python
About the role
Key responsibilities & impact
- Improve resiliency engineering practices across platforms and applications, including resilient application design patterns, system observability, and deployment strategies.
- Detect, troubleshoot, and resolve incidents.
- Develop automation for incident response and infrastructure management.
- Develop and support OpenTelemetry integrations for multiple application platforms (browser, ECS, Lambda, etc.) and languages (JavaScript, Java).
- Contribute to architectural decisions and support implementation of solutions.
Requirements
What you’ll need
- Expertise in JavaScript (server-side and client-side execution environments) or Java.
- Working knowledge of Python (or a similar scripting language).
- Strong knowledge of resiliency engineering techniques for both platforms and applications.
- Experience troubleshooting complex production issues and implementing effective mitigations.
- Hands-on experience with AWS services and cloud infrastructure.
- Familiarity with the OpenTelemetry specification and core APIs.
- Practical experience developing and operating software in distributed systems environments.
Benefits
Comp & perks
- Health insurance
- Retirement plans
- Paid time off
- Flexible work arrangements
- Professional development
