Ironflow AI

Senior Software Engineer – Infrastructure

full-time

Location Type: Hybrid

Location: San Diego, California, United States

About the role

  • Design, build, and maintain production infrastructure on AWS (EKS, RDS, ECR, VPC, IAM, Secrets Manager, etc.).
  • Develop and manage our Kubernetes clusters: deploy workloads, tune Karpenter node autoscaling, maintain Helm charts, and keep clusters healthy.
  • Own and extend our GitOps deployment pipeline: GitHub Actions for CI/CD, ArgoCD for continuous delivery, and Helm for packaging.
  • Manage supporting cluster operators including Envoy Gateway, External DNS, cert-manager, Fluent Bit, and the AWS Load Balancer Controller.
  • Own and improve our observability stack: Grafana for dashboards, Loki for log aggregation, Tempo for distributed tracing, and Prometheus for metrics.
  • Support multi-environment reliability across dev, stage, and production GovCloud accounts.
  • Improve system resilience through load testing (Locust), E2E testing (Playwright/Cucumber), and thoughtful capacity planning.
  • Contribute to backend services (FastAPI, SQLAlchemy) in Python and TypeScript.
  • Work alongside product engineers as a first-class contributor, making architecture decisions that balance speed, cost, and reliability.
  • Build developer experience tooling: local dev environments, CI pipeline improvements, and automated testing scaffolds that make the whole team faster.
  • Support and extend Temporal-based workflow orchestration for background processing.
  • Implement least-privilege IAM policies, IRSA (IAM Roles for Service Accounts), and network segmentation in a GovCloud environment.
  • Manage secrets through AWS Secrets Manager and the External Secrets Operator with automated rotation.
  • Maintain TLS automation via cert-manager and OIDC authentication flows.
  • Enable SOC 2, CMMC, and FedRAMP compliance activities: GRC platform integration, audit logging pipelines, FIPS-validated endpoint configuration, system boundary documentation, and evidence collection for third-party assessments.

Requirements

  • 5+ years of professional experience in infrastructure, DevOps, SRE, or platform engineering.
  • Deep hands-on experience with AWS services in production (EKS, IAM, Secrets Manager, ECR, RDS). Experience with or strong working knowledge of AWS GovCloud is a significant plus.
  • Strong Kubernetes expertise: you've operated clusters, debugged networking issues, managed Helm charts, and tuned workloads.
  • Proficiency in Python and/or TypeScript with a genuine interest in writing application code alongside infrastructure work.
  • Experience with GitOps workflows: ArgoCD, GitHub Actions, and Helm-based deployments.
  • Solid understanding of networking fundamentals (DNS, load balancing, TLS, Kubernetes Gateway API).
  • Comfort with Linux systems administration and shell scripting.
  • Familiarity with compliance-driven infrastructure: audit logging, access controls, and evidence collection for frameworks like CMMC, FedRAMP, and SOC 2.
  • A collaborative, low-ego mindset: you thrive in small, fast-moving teams.

Nice to Have

  • Experience with Envoy Gateway or the Kubernetes Gateway API.
  • Background in PostgreSQL administration and schema-based multi-tenancy.
  • Familiarity with the Grafana observability stack (Loki, Tempo, Prometheus).
  • Experience with Karpenter for node autoscaling, or with cost optimization strategies for cloud spend.
  • Experience with Temporal for workflow orchestration.
  • Experience in a startup or high-growth environment where you wore many hats.