qode.world

Site Reliability Architect

full-time

Location Type: Hybrid

Location: Texas, United States

About the role

  • Design, architect, and build cloud-native infrastructure and application services on AWS
  • Lead end-to-end infrastructure design for application platforms, microservices, and shared services
  • Implement and manage Infrastructure as Code (IaC) using Terraform
  • Design and maintain highly available, scalable, secure, and cost-optimized AWS architectures
  • Troubleshoot and resolve complex infrastructure and application service issues
  • Provide architectural guidance and technical leadership across engineering teams
  • Drive adoption of DevSecOps best practices across the SDLC
  • Establish and enhance monitoring, observability, and alerting frameworks

Requirements

  • 10+ years of experience in Site Reliability, Observability, Production Support, Cloud Architecture, or related roles, with a strong focus on architecture and strategy
  • Deep hands-on expertise with observability platforms such as Dynatrace, ELK, Datadog, Splunk, OpenTelemetry, and Jaeger
  • Strong understanding of microservices architecture, APIs, and distributed systems
  • Proficiency in programming/scripting (e.g., Python, Go, Java) for automation and integration
  • Strong hands-on experience with AWS services, including:
      - Compute & Networking: VPC, EC2, ECS/EKS, Lambda
      - Databases: RDS, Aurora, DynamoDB
      - Storage & CDN: S3, CloudFront
      - Security: IAM, KMS, Security Groups, NACLs
  • Proven experience designing multi-account, multi-region AWS architectures
  • Deep understanding of:
      - Cloud networking and distributed systems
      - Security and compliance best practices
      - Scalability, resiliency, and fault-tolerant design patterns
  • Hands-on expertise with Terraform (or similar IaC tools)
  • Experience with monitoring and observability tools (CloudWatch, Prometheus, Grafana, etc.)
  • Strong experience with DevSecOps principles and CI/CD pipelines
  • Excellent problem-solving and analytical skills
  • Demonstrated ability to lead cross-functional initiatives and influence technical direction

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

  • AWS
  • Infrastructure as Code
  • Terraform
  • Python
  • Go
  • Java
  • microservices architecture
  • APIs
  • observability
  • DevSecOps

Soft Skills

  • problem-solving
  • analytical skills
  • leadership
  • influence
  • cross-functional collaboration