Tech Stack
AWS, Azure, Cloud, Distributed Systems, Docker, Go, gRPC, Kubernetes, Linux, Microservices, Postgres, Prometheus, Python, Rust, Terraform
About the role
- Maintain the stability of a control plane built on a distributed microservice architecture that interacts closely with Kubernetes and cloud provider APIs
- Debug and resolve complex Kubernetes issues across multiple regions and clouds
- Automate database lifecycle operations (deploy, resize, upgrade, fork)
- Participate in the on-call rotation, handle incident response, and perform root cause analysis
- Develop back-end features for Tiger Cloud, specializing in a distributed microservice architecture with an emphasis on platform and database expertise.
- Develop Kubernetes controllers, operators, and CRDs to extend platform capabilities
- Enhance observability and monitoring systems
- Improve deployment reliability, error handling, and client tooling (CLI interfaces)
- Work closely with infrastructure and software engineering teams to ensure platform scalability and reliability
- Stay current with Kubernetes releases, features, CNI/CSI interfaces, and cluster orchestration tools
- Contribute to system architecture for scalable microservices and distributed systems
- Write idiomatic Go code with comprehensive unit and integration tests
- Maintain >80% code coverage and enforce quality with golangci-lint
- Perform peer reviews and follow Go/Kubernetes best practices
- Ensure security compliance and vulnerability management
Requirements
- 5+ years of software/platform engineering experience
- 3+ years of Go in production environments
- 3+ years of Kubernetes operations (building, debugging, scaling clusters)
- Experience in distributed systems and microservice architectures
- Go (Golang): Advanced proficiency (Go 1.24+), primary development language
- Kubernetes: Experienced Kubernetes user familiar with client-go and capable of developing CRDs and controllers
- PostgreSQL/TimescaleDB: Administration, migration, replication, performance tuning
- gRPC & Protobufs: Experience developing services and defining schemas
- Rust: Experience developing low-latency, memory-efficient applications in Rust is a plus
- Helm Charts: Template management and deployment automation
- Kubernetes Operators: Operator patterns and controller-runtime
- CI/CD: GitHub Actions, automated testing, deployment pipelines
- Monitoring & Observability: Prometheus, OpenTelemetry/Jaeger, distributed tracing
- Infrastructure as Code: Terraform, Pulumi, or similar
- Linux & Bash: Deep knowledge of operating systems and container environments
- Kubernetes lifecycle: kOps
- PostgreSQL user and DB administration without ORMs (required)
- PostgreSQL/TimescaleDB administration in production
- Backup/restore procedures and disaster recovery strategies
- WAL management and replication (streaming/logical)
- Query optimization, tuning, and parameter adjustments