Koda Health

Senior Infrastructure, Security Engineer

Full-time

Location Type: Remote

Location: United States

Salary

💰 $160,000 - $170,000 per year

About the role

  • Own the operational health of production across two AWS regions
  • Investigate production issues, lead root-cause analysis, and drive resolution
  • Build and maintain dashboards that give real-time visibility into application health, queue depths, API latency, and error rates
  • Monitor SQS/SNS queue health, dead-letter queues, and event processing pipelines
  • Expand observability beyond CloudWatch: evaluate and implement distributed tracing, APM, and log aggregation
  • Oversee weekly deployments to production
  • Own cost monitoring and alerting (AWS Budgets alerts, Cost Explorer)
  • Improve automated uptime and SLA reporting
  • Own and evolve all AWS infrastructure defined in CDK
  • Lead the migration to capturing 100% of cloud infrastructure in CDK
  • Manage and improve services: Lambda, ECS Fargate, Elastic Beanstalk, S3, CloudFront, SNS, SQS, EventBridge, WAF, Cognito
  • Support multi-region uptime, disaster recovery planning, and backup/restore practices
  • Improve cross-region replication and automated failover
  • Own deployment pipelines, release processes, and database migration safety
  • Support and evolve data pipelines used for analytics and product features
  • Set standards for how we ship, deploy, and operate software at scale
  • Maintain and harden AWS infrastructure with a strong security mindset
  • Own vulnerability remediation and SLA timelines
  • Help respond to security questionnaires and vendor assessments
  • Own and improve WAF rules, security groups, IAM policies, and network configuration
  • Own Security Hub, AWS Config, VPC Flow Logs, and CloudTrail
  • Support GuardDuty malware scanning and S3 upload security
  • Ensure SOC 2 and HIPAA compliance across infrastructure
  • Manage secrets, key rotation, and access controls
  • Conduct periodic security reviews of infrastructure and application configuration
  • Triage and fix production errors surfaced by Sentry
  • Make small TypeScript PRs to backend services
  • Debug complex production issues that span infrastructure and application code
  • Participate in architecture discussions, especially around infrastructure and deployment concerns

Requirements

  • 6+ years building and operating production systems on AWS
  • Strong experience with AWS CDK (we use CDK in TypeScript)
  • Deep knowledge of core AWS services: Lambda, ECS, S3, CloudWatch, SNS, SQS, IAM, VPC, WAF
  • Experience setting up and managing monitoring, alerting, and incident management
  • Experience with security hardening and compliance in regulated environments (HIPAA, SOC 2, or similar)
  • Working knowledge of TypeScript or Node.js: enough to read the codebase, make PRs, and debug production issues
  • Experience with CI/CD pipelines (CodePipeline, GitHub Actions, or similar)
  • Comfortable owning production systems end-to-end in a small team where you're the expert
  • Strong English fluency in written and verbal communication (security questionnaire responses, etc.)
  • US-based, able to work CST/EST hours (contractual requirement)

Benefits

  • Fully remote role (US-based)
  • Flexible, unlimited paid time off
  • Great medical, dental, and vision coverage
  • 401k options
  • Yearly personal development budget that can be used for books, courses, trainings, and more
  • Office setup budget
  • Annual company and team events
  • Latest MacBook + enterprise tooling (e.g. Claude Code)
  • Opportunity to gain exposure to applied RL and SFT work on foundational AI models
  • Clear growth paths for ICs (Staff/Principal) and managers (EM/Director)