Precision Solutions

Cloud Infrastructure - Site Reliability Engineer (SRE)

full-time

Origin: 🇺🇸 United States

Job Level

Mid-Level, Senior

Tech Stack

AWS, Azure, Cloud, Grafana, Kubernetes, Prometheus, Python, Terraform, Vault

About the role

  • Design, deploy, and maintain Azure API Management and Amazon API Gateway solutions for enterprise-scale API traffic.
  • Implement secure API integration patterns, including OAuth 2.0, JWT validation, IP filtering, and private endpoints.
  • Manage network connectivity and routing between on-premises, Azure, and AWS environments using VPC, Transit Gateway, VNet peering, ExpressRoute, and VPN.
  • Define, monitor, and improve SLIs, SLOs, and SLAs for API platforms.
  • Automate deployments, scaling, and operational tasks using GitHub Actions and self-hosted GitHub Runners.
  • Implement Infrastructure-as-Code using Terraform, Bicep, or CloudFormation for consistent and repeatable infrastructure provisioning.
  • Perform incident response, root cause analysis, and post-incident reviews to ensure continuous improvement.
  • Optimize API gateway performance with caching, throttling, and request/response transformations.
  • Apply Azure and AWS security best practices, including Key Vault/Secrets Manager integration, RBAC/IAM roles, and API-level threat protection.
  • Ensure compliance with enterprise security standards and regulatory frameworks (e.g., ISO 27001, SOC 2, GDPR).

Requirements

  • Minimum 5 years’ experience in cloud infrastructure engineering or SRE.
  • Multi-cloud expertise in Azure and AWS.
  • Hands-on experience with API gateway solutions such as Azure API Management and Amazon API Gateway.
  • Proficiency with GitHub Actions and self-hosted GitHub Runners for CI/CD automation.
  • Strong skills in Infrastructure-as-Code (Terraform, Bicep, CloudFormation).
  • Proficiency in scripting languages (PowerShell, Bash, Python).
  • Experience with networking and security in cloud environments (VPC, VNet, Transit Gateway, private endpoints).
  • Solid understanding of monitoring and observability tools (Azure Monitor, CloudWatch, Prometheus, Grafana).
  • Strong analytical and troubleshooting skills.
  • Effective communicator across technical and non-technical teams.