DevOps Engineer

• Ensure that infrastructure and applications have well-defined Service Level Agreements (SLAs) and Service Level Objectives (SLOs) that are measured and adhered to
• Ensure KUBRA maintains well-documented standards and best practices so that existing and new services are built for high availability and security
• Ensure appropriate automation and observability exist to achieve a low and continuously improving mean time to recovery (MTTR) for service-impacting incidents
• Ensure that all incidents are thoroughly investigated and appropriately documented, with corresponding problem records and corrective actions
• Participate in the Architectural Review Process for new and existing services being built for the KUBRA HQ platform, ensuring compliance with standards and best practices for high availability, observability, security, and cost efficiency
• Work closely with Development, Infrastructure, and Operations teams to lead root cause analysis for major incidents, including leading senior stakeholder communication, driving problem-solving, and debugging with best-practice techniques
• Design and conduct fault injection experiments to identify potential weak points in high-availability architecture, and work with Platform Engineering and Software Engineering teams to remediate any findings
• Perform periodic audits of applications and infrastructure to ensure compliance with standards and identify necessary remediation
