
Staff Software Engineer
Infinity Constellation
Full-time
Location Type: Remote
Location: United States
About the role
- Drive platform architecture decisions and align the team on scalable patterns and long-term maintainability
- Review a high volume of code, design docs, and architectural proposals for scalability, reliability, security, and operability
- Be a technical mentor and force multiplier: unblock engineers, raise the bar on production readiness, and establish platform best practices
- Own and evolve the performance and correctness of the core backend platform (Django/DRF/ASGI)
- Scale async execution across Celery + Dramatiq + Temporal/Cortex; implement resilient workflow patterns (retries, circuit breakers, graceful degradation)
- Optimize PostgreSQL/pgvector (query tuning, connection pooling) and caching strategies
- Maintain and improve Kubernetes deployment infrastructure (GKE, Helm, Terraform/OpenTofu), CI/CD, and rollout strategies; own KEDA autoscaling policies and resource allocation across worker pools
- Own reliability of RabbitMQ, Redis, and PostgreSQL infrastructure; lead incident response and post-mortems
- Extend OpenTelemetry + Datadog instrumentation, dashboards, alerts, and SLOs; profile and reduce latency/memory bottlenecks
Requirements
- 10+ years building and operating production backend systems at scale
- Deep expertise in Python (Django preferred) and relational databases (PostgreSQL)
- Hands-on experience with Kubernetes, Helm, and cloud infrastructure (GCP preferred)
- Strong background in distributed systems: message queues, event sourcing, workflow orchestration
- Production experience with async task systems (Celery, Dramatiq, or similar)
- Track record of debugging complex production issues across multiple services
- Ability to work autonomously and drive technical initiatives without close supervision
- Clear technical communication—able to explain tradeoffs and build consensus
- Experience with Temporal or similar workflow engines (preferred)
- Background in LLM infrastructure, RAG systems, or AI/ML platforms (preferred)
- Familiarity with OpenTelemetry, Datadog, or similar observability stacks (preferred)
- Experience with KEDA or other Kubernetes autoscaling solutions (preferred)
- Contributions to multi-tenant SaaS platform architecture (preferred)
- History of improving developer experience and platform abstractions (preferred)
Benefits
- Compensation: Competitive salary commensurate with experience (Staff/Principal level)
- Location: Remote
- Type: Full-time
- Requirements: Overlap with Americas timezones for collaboration; reliable high-speed internet
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, Django, PostgreSQL, Kubernetes, Helm, Celery, Dramatiq, OpenTelemetry, Datadog, KEDA
Soft Skills
technical mentorship, clear technical communication, autonomous work, problem-solving, consensus building