aion

Senior Software Engineer, Compute Platform

full-time

Location Type: Hybrid

Location: Bengaluru • 🇮🇳 India


Job Level

Senior

Tech Stack

AWS, Azure, Cloud, Distributed Systems, EC2, Go, Google Cloud Platform, Grafana, Kafka, Kubernetes, Postgres, Prometheus, Python, RabbitMQ, Redis, Rust, Terraform

About the role

  • Design and architect AION's multi-cloud compute platform, building abstraction layers that unify diverse cloud providers (AWS, GCP, Azure, bare-metal data centers)
  • Work directly with cloud providers to expand AION's compute pool—understanding pricing, availability zones, GPU types, and capacity planning
  • Build and maintain AION's managed services
  • Understand and abstract cloud provider differences in storage (block, object, file systems), networking (VPCs, subnets, security groups), and compute resources
  • Design composable platform components that enable forward deployments and promote reusability across AION's infrastructure stack
  • Own end-to-end development of managed services on the compute platform—from design and architecture through execution and production monitoring
  • Build scalable orchestration systems for GPU workloads, container scheduling, and resource allocation
  • Develop robust APIs and control planes for compute lifecycle management (provisioning, scaling, termination)
  • Lead technical discussions on platform reliability, performance optimization, and cost efficiency
  • Execute on peripheral platform services including billing systems, usage accounting, observability infrastructure, and compliance tooling
  • Build monitoring and telemetry systems for compute utilization, cost tracking, and performance metrics
  • Establish engineering standards for platform development including code reviews, quality gates, and testing practices
  • Mentor engineers on infrastructure best practices and distributed systems design

Requirements

  • 4+ years of experience building and scaling complex backend systems, cloud infrastructure, or distributed platforms
  • Strong understanding of multi-cloud architectures and experience working with AWS, GCP, or Azure at scale
  • Deep knowledge of cloud abstractions: compute (EC2, GCE, VMs), storage (S3, GCS, EBS), networking (VPCs, load balancers, security groups)
  • Proficiency in Golang strongly preferred; Python, Rust, or other systems languages a plus
  • Experience with Kubernetes, container orchestration, and infrastructure-as-code (Terraform, Pulumi, CloudFormation)
  • Solid understanding of distributed systems principles, consensus algorithms, and state management
  • Experience building APIs, control planes, and platform services for infrastructure management
  • Familiarity with databases (PostgreSQL, Redis, etcd), message queues (Kafka, RabbitMQ), and event-driven architectures
  • Knowledge of GPU orchestration, AI/ML workloads, or HPC systems is highly desirable
  • Experience with observability tools (Prometheus, Grafana, Datadog) and distributed tracing
  • Understanding of cloud billing models, cost optimization strategies, and resource scheduling

Preferred Attributes

  • High ownership, self-driven, and a bias for action.
  • Strong strategic thinking and ability to connect technical decisions to business impact.
  • Excellent communication and mentoring skills.
  • Thrives in ambiguity, fast-paced environments, and early-stage startup culture.

Benefits

Why Join AION?

  • Work directly with high-pedigree founders shaping technical and product strategy.
  • Build infrastructure powering the future of AI compute globally.
  • Significant ownership and impact with equity reflective of your contributions.
  • Competitive compensation, flexible work options, and wellness benefits.
