Tech Stack
Cloud, Distributed Systems, Elixir, Go, Google Cloud Platform, Postgres, Rust, SaltStack, Terraform
About the role
- Lead the team that builds and scales the foundation behind PDQ Connect
- Lead, mentor, and grow a cross-disciplinary team (Platform Engineers, SREs, DBAs, Data Engineers)
- Foster a culture of accountability, learning, and continuous improvement
- Support career development across specialties while unifying the team under shared values and practices
- Deliver core services that power PDQ Connect and future SaaS products
- Establish frameworks, libraries, and patterns that improve scale, security, and reliability
- Guide platform architecture for multi-tenant SaaS, regionalization, and compliance-readiness
- Own uptime, SLIs/SLOs, and incident response practices
- Oversee observability, monitoring, alerting, and MTTR reduction strategies
- Partner with QA leadership to embed platform-level testing into release pipelines
- Guide DBAs and data engineers to ensure datastores meet scale, performance, and availability needs
- Define best practices for data management, replication, query optimization, and cost efficiency
- Drive strategies for scaling data to hundreds of billions of rows
- Translate business and product requirements into platform capabilities
- Partner with Product, Security, and Compliance to ensure solutions meet SOC 2, GDPR, and other regulatory standards
- Consult with DevOps and Finance to optimize infrastructure spend at scale
Requirements
- 7+ years in software, platform, or infrastructure engineering
- 2+ years in engineering management
- Proven experience leading teams that included SREs, DBAs, or platform engineers
- Track record of scaling distributed systems and guiding them through major growth milestones
- Strong familiarity with cloud platforms (GCP preferred)
- Experience with infrastructure-as-code (Terraform)
- Understanding of modern backend stacks (Elixir, Rust, Go, or similar)
- Solid grounding in databases (Postgres, ClickHouse, Elastic, caching layers)
- Comfortable engaging in design reviews, architectural discussions, and incident retrospectives
- Clear communicator who builds trust across functions and geographies
- Pragmatic decision-maker, balancing speed of delivery with long-term resilience