aion

Senior Software Engineer, Infrastructure – Deployments

full-time

Location Type: Hybrid

Location: Bengaluru • 🇮🇳 India

Job Level

Senior

Tech Stack

Ansible, AWS, Azure, Cloud, Go, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, Python, Terraform, Vault

About the role

  • Design AION as a composable platform with independently deployable components that run seamlessly on AWS, GCP, Azure, and private data centers
  • Work with senior engineering leads to define private deployment strategies and build automation for customer VPC and on-premises installations
  • Build abstraction layers that unify diverse cloud providers while maintaining flexibility for customer-specific requirements
  • Design globally distributed deployment patterns that meet data sovereignty, compliance, and regulatory requirements by design
  • Own end-to-end platform deployment automation using Terraform, Ansible, Helm, and infrastructure-as-code across hybrid cloud environments
  • Design and implement disaster recovery, failover, high-availability architectures, and cloud migration strategies for customer deployments
  • Build comprehensive CI/CD pipelines for infrastructure provisioning, configuration management, and deployment orchestration
  • Implement monitoring, observability (Prometheus, Grafana, Loki), and alerting systems tailored for customer-managed AION instances
  • Implement Kubernetes-based and custom orchestrator-based managed services with strict workload isolation and multi-tenancy
  • Design container security, runtime protection, network policies, and secrets management for production workloads
  • Own compliance implementation (SOC2, GDPR, HIPAA, ISO 27001, PCI-DSS) and security best practices for customer environments
  • Create deployment blueprints, reference architectures, self-service portals, and comprehensive documentation for customer success

Requirements

  • 6+ years of experience in platform deployment, DevOps, SRE, or cloud infrastructure roles with a focus on customer-facing deployments
  • Deep expertise in Kubernetes including cluster design, multi-tenancy, custom resources, operators / controllers, and production operations
  • Fundamental understanding of Linux processes and container internals, specifically runtime optimizations such as lazy loading (Nydus, eStargz) and checkpoint/restore mechanisms (CRIU) for fast migration and reduced cold-start times
  • Deep understanding of computer networking and the OSI model, with experience in creating overlay networks using VXLAN or BGP and implementing network isolation through CIDRs
  • Strong understanding of hybrid and multi-cloud architectures combining on-premises, private, and public cloud resources, including VPCs, routing, network policies, and VPN tools like WireGuard
  • Proficiency in infrastructure-as-code using Terraform, Ansible, Pulumi, Nix, or CloudFormation across multiple cloud providers
  • Experience building and maintaining GitOps pipelines for infrastructure and application deployments using GitLab CI, GitHub Actions, ArgoCD, or FluxCD
  • Knowledge of secrets management (External Secrets Operator, Vault, AWS Secrets Manager, GCP Secret Manager) and encryption at rest/in transit
  • Knowledge of the observability stack, including Prometheus, Grafana, Loki, distributed tracing (Jaeger, Tempo), and log aggregation
  • Programming/scripting skills in Go or Python for building automation tools, operators, and deployment scripts
  • Hands-on experience deploying complex platforms in customer VPCs and on-premises environments with strict isolation requirements
  • Experience designing and executing cloud migration strategies including lift-and-shift, re-platforming, and cloud-native transformations
  • Strong knowledge of security compliance frameworks (SOC2, GDPR, HIPAA, ISO 27001, PCI-DSS) and their implementation in cloud infrastructure
  • Familiarity with disaster recovery strategies, backup solutions (Velero, Kasten), and business continuity planning
  • Exposure to HPC systems, GPU orchestration, and AI workload patterns is highly desirable

Benefits

Preferred Attributes

  • High ownership, self-driven, and a bias for action
  • Strong strategic thinking and the ability to connect technical decisions to business impact
  • Excellent communication and mentoring skills
  • Thrives in ambiguity, fast-paced environments, and early-stage startup culture

Why Join AION?

  • Work directly with high-pedigree founders shaping technical and product strategy
  • Build infrastructure powering the future of AI compute globally
  • Significant ownership and impact, with equity reflective of your contributions
  • Competitive compensation, flexible work options, and wellness benefits

Apply Now

If you’re a strong engineer ready to lead architecture and scale next-generation AI infrastructure, we want to hear from you. Please share:

  • Your resume, highlighting relevant projects and leadership experience
  • Links to products, code, or demos you’ve built
  • A brief note on why AION’s mission excites you