Empower

API Reliability Engineer

full-time

Location Type: Remote

Location: United States

Salary

💰 $87,400 - $123,400 per year

About the role

  • Own and improve the reliability, performance, and scalability of API services in production.
  • Troubleshoot and resolve P1/P2 production incidents end-to-end, analyzing issues across application, infrastructure, and integrations.
  • Work closely with API developers to identify and address reliability issues and application-level security vulnerabilities in service design and implementation.
  • Contribute targeted code-level or configuration fixes to resolve issues and prevent recurrence.
  • Participate in root cause analysis (RCA) and drive durable, long-term fixes.
  • Improve API resilience through patterns such as timeouts, retries, circuit breakers, and graceful degradation.
  • Establish and enhance observability and service health signals, including logs, metrics, traces, and SLOs, using Datadog and Splunk.
  • Define and monitor SLAs/SLOs for API performance and availability.
  • Work with API Gateway and ALB/NLB for traffic management, routing, and system reliability.
  • Contribute to CI/CD pipelines using Jenkins to ensure safe and consistent deployments.
  • Contribute to disaster recovery readiness and system resilience planning.
  • Collaborate across engineering teams to improve system design and operational readiness.
  • Participate in an on-call rotation for critical incidents (P1/P2).

Requirements

  • Minimum 5 years of experience in backend or API development
  • Strong hands-on experience with Java and Spring Boot
  • Proven experience building, shipping, and operating APIs in production environments
  • Strong problem-solving skills with the ability to debug real production issues end-to-end
  • Experience handling P1/P2 incidents in production environments
  • Solid understanding of API architecture, request lifecycle, and common failure patterns
  • Experience with AWS services such as API Gateway, ALB/NLB, EC2, ECS/EKS, Lambda, RDS, or DynamoDB
  • Familiarity with reliability patterns such as timeouts, retries, circuit breakers, and connection pooling
  • Experience with observability tools such as Datadog and/or Splunk
  • Experience with CI/CD pipelines, preferably Jenkins
  • Strong debugging skills in distributed systems
  • Experience with Git-based workflows and Agile development
  • Bachelor’s in Computer Science, Information Systems, or a related field; equivalent practical experience welcomed.

Benefits

  • Medical, dental, vision and life insurance
  • Retirement savings – 401(k) plan with generous company matching contributions (up to 6%)
  • Tuition reimbursement up to $5,250/year
  • Business-casual environment that includes the option to wear jeans
  • Generous paid time off upon hire – a paid time off program plus ten paid company holidays and three floating holidays each calendar year
  • Paid volunteer time – 16 hours per calendar year
  • Leave of absence programs – including paid parental leave, paid short- and long-term disability, and Family and Medical Leave (FMLA)
  • Business Resource Groups (BRGs) – BRGs facilitate inclusion and collaboration across our business internally and throughout the communities where we live, work and play.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Java, Spring Boot, API development, AWS, Datadog, Splunk, CI/CD, Jenkins, debugging, API architecture

Soft Skills

problem-solving, collaboration, communication, analytical thinking, incident management, root cause analysis, operational readiness, reliability improvement, system design, debugging skills