Tech Stack
AWS, Azure, Cloud, Distributed Systems, Go, Google Cloud Platform, Grafana, Kubernetes, Prometheus, Python, Rust, Terraform
About the role
- Architect and develop Kubernetes operators and controllers to automate and manage Lyric’s platform workloads.
- Design and implement control plane and data plane services that provide orchestration, routing, monitoring, and self-healing of distributed components.
- Build internal developer tooling and platform APIs to enhance developer productivity and operational excellence.
- Design and maintain infrastructure-as-code (IaC) frameworks (e.g., Terraform, Pulumi, Crossplane) to provision and manage cloud infrastructure securely and reproducibly.
- Establish and improve observability, reliability, and resilience of critical infrastructure components.
- Work closely with SREs, platform engineers, and application teams to ensure infrastructure meets evolving product needs.
- Contribute to infrastructure architecture and technical direction for scale, security, and developer experience.
- Solve tough distributed systems challenges and help scale Lyric’s infrastructure in a cloud-native, multi-tenant environment.
Requirements
- 5+ years of experience building backend or infrastructure systems, ideally in cloud-native environments.
- Deep experience with Kubernetes internals, operators/controllers, CRDs, and the Kubernetes API.
- Strong software engineering skills in languages such as Go (preferred), Rust, or Python.
- Hands-on experience designing distributed systems and working with control plane/data plane architectures.
- Proficiency with IaC tools and practices (Terraform, Pulumi, Crossplane, etc.).
- Solid understanding of containerization, orchestration, and cloud platforms (AWS, GCP, Azure).
- Familiarity with observability (Prometheus, Grafana, OpenTelemetry) and operational best practices for production systems.
- Strong problem-solving, debugging, and performance optimization skills.
- Experience building platform services in a multi-tenant SaaS environment. (Nice to have)
- Familiarity with service meshes, workload identity, and cloud-native security patterns. (Nice to have)
- Contributions to open-source infrastructure projects. (Nice to have)