
Senior Software Engineer, Infrastructure – Deployments
AION
full-time
Location Type: Hybrid
Location: Bengaluru • 🇮🇳 India
Job Level
Senior
Tech Stack
Ansible, AWS, Azure, Cloud, Go, Google Cloud Platform, Grafana, Kubernetes, Linux, Prometheus, Python, Terraform, Vault
About the role
- Design AION as a composable platform with independently deployable components that run seamlessly on AWS, GCP, Azure, and private data centers
- Work with senior engineering leads to define private deployment strategies and build automation for customer VPC and on-premises installations
- Build abstraction layers that unify diverse cloud providers while maintaining flexibility for customer-specific requirements
- Design globally distributed deployment patterns that build in data sovereignty and satisfy compliance and regulatory requirements
- Own end-to-end platform deployment automation using Terraform, Ansible, Helm, and infrastructure-as-code across hybrid cloud environments
- Design and implement disaster recovery, failover, high-availability architectures, and cloud migration strategies for customer deployments
- Build comprehensive CI/CD pipelines for infrastructure provisioning, configuration management, and deployment orchestration
- Implement monitoring, observability (Prometheus, Grafana, Loki), and alerting systems tailored for customer-managed AION instances
- Implement Kubernetes-based and custom orchestrator-based managed services with strict workload isolation and multi-tenancy
- Design container security, runtime protection, network policies, and secrets management for production workloads
- Own compliance implementation (SOC2, GDPR, HIPAA, ISO 27001, PCI-DSS) and security best practices for customer environments
- Create deployment blueprints, reference architectures, self-service portals, and comprehensive documentation for customer success
Requirements
- 6+ years of experience in platform deployment, DevOps, SRE, or cloud infrastructure roles with a focus on customer-facing deployments
- Deep expertise in Kubernetes including cluster design, multi-tenancy, custom resources, operators / controllers, and production operations
- Solid understanding of Linux processes and container internals, particularly runtime optimizations such as lazy image loading (Nydus, eStargz) and checkpoint/restore mechanisms (CRIU) for fast migration and reduced cold-start times
- Deep understanding of computer networking and the OSI model, with experience building overlay networks using VXLAN or BGP and implementing network isolation through CIDRs
- Strong understanding of hybrid and multi-cloud architectures combining on-premises, private, and public cloud resources, including VPCs, routing, network policies, and VPN tools like WireGuard
- Proficiency in infrastructure-as-code using Terraform, Ansible, Pulumi, Nix, or CloudFormation across multiple cloud providers
- Experience building and maintaining GitOps pipelines for infrastructure and application deployments using GitLab CI, GitHub Actions, ArgoCD, or FluxCD
- Knowledge of secrets management (External Secrets Operator, Vault, AWS Secrets Manager, GCP Secret Manager) and encryption at rest/in transit
- Knowledge of the observability stack, including Prometheus, Grafana, Loki, distributed tracing (Jaeger, Tempo), and log aggregation
- Programming/scripting skills in Go or Python for building automation tools, operators, and deployment scripts
- Hands-on experience deploying complex platforms in customer VPCs and on-premises environments with strict isolation requirements
- Experience designing and executing cloud migration strategies including lift-and-shift, re-platforming, and cloud-native transformations
- Strong knowledge of security compliance frameworks (SOC2, GDPR, HIPAA, ISO 27001, PCI-DSS) and their implementation in cloud infrastructure
- Familiarity with disaster recovery strategies, backup solutions (Velero, Kasten), and business continuity planning
- Exposure to HPC systems, GPU orchestration, and AI workload patterns is highly desirable
Benefits
- **Preferred Attributes:**
  - High ownership, self-motivation, and a bias for action.
  - Strong strategic thinking and the ability to connect technical decisions to business impact.
  - Excellent communication and mentoring skills.
  - Comfort with ambiguity, fast-paced environments, and early-stage startup culture.
- **Why Join AION?**
  - Work directly with high-pedigree founders shaping technical and product strategy.
  - Build infrastructure powering the future of AI compute globally.
  - Significant ownership and impact, with equity reflective of your contributions.
  - Competitive compensation, flexible work options, and wellness benefits.
- **Apply Now:**
  - If you’re a strong engineer ready to lead architecture and scale next-generation AI infrastructure, we want to hear from you. Please share:
    - Your resume, highlighting relevant projects and leadership experience.
    - Links to products, code, or demos you’ve built.
    - A brief note on why AION’s mission excites you.