Design and implement highly scalable platform services and shared infrastructure components that support both healthcare applications and advanced AI systems, ensuring these foundational services can handle growing data volumes, increased user loads, and evolving business requirements while maintaining consistent performance and availability across all environments.
Develop and maintain comprehensive CI/CD pipelines that enable secure, automated, and compliant software delivery processes, incorporating automated testing, security scanning, compliance validation, and deployment orchestration to accelerate development velocity while maintaining rigorous quality and security standards required in healthcare environments.
Build and operate containerized services using industry-standard technologies such as Docker and Kubernetes, ensuring resilience through proper health checking, auto-scaling, and graceful degradation patterns, while maintaining high availability through multi-zone deployments, circuit breakers, and comprehensive disaster recovery capabilities.
Implement sophisticated observability solutions including comprehensive monitoring, centralized logging, detailed metrics collection, and distributed tracing capabilities to support operational excellence, enable rapid troubleshooting, facilitate capacity planning, and provide deep insights into system behavior across complex microservices architectures.
Design and maintain robust storage solutions and efficient data models across both relational databases (e.g., PostgreSQL, MySQL) and non-relational systems (e.g., MongoDB, Redis, Cassandra), ensuring data integrity, optimal query performance, proper indexing strategies, and adherence to data governance requirements specific to healthcare data handling.
Perform thorough production troubleshooting and comprehensive root cause analysis to identify and resolve system issues, implementing preventive measures to strengthen platform reliability, reduce mean time to recovery (MTTR), and continuously improve system stability through post-incident reviews and actionable remediation plans.
Collaborate effectively with product engineering teams, security specialists, and compliance officers to translate complex business requirements into reusable, well-documented, and secure platform capabilities that can be leveraged across multiple teams and projects, promoting consistency and reducing duplicated effort.
Document technical tradeoffs, architectural decisions, and implementation patterns clearly and concisely using architecture decision records (ADRs), design documents, and comprehensive runbooks that enable knowledge transfer, facilitate onboarding of new team members, and support long-term system maintainability.
Support comprehensive compliance, security, and governance requirements including healthcare regulatory considerations such as HIPAA privacy and security rules, ensuring all platform systems implement appropriate access controls, encryption at rest and in transit, audit logging, and data retention policies.

Requirements

Bachelor's degree in Computer Science, Engineering, or related technical field, or equivalent practical experience demonstrated through a portfolio of successful projects, open-source contributions, or progressive career growth in software engineering roles.
5+ years of demonstrated professional experience in software engineering with a strong foundational understanding of distributed systems architecture, backend development methodologies, or platform engineering practices, including hands-on experience building and maintaining production systems that serve significant user bases with strict uptime requirements.
Strong proficiency in modern programming languages such as Python or Java, with a proven track record of delivering production-grade services that demonstrate clean code practices, appropriate design patterns, comprehensive error handling, and maintainable architectures that have successfully operated in demanding production environments.
Extensive hands-on experience designing, deploying, and operating cloud-native systems within public cloud environments (Google Cloud Platform strongly preferred), including practical knowledge of core cloud services such as compute instances, managed databases, object storage, networking configurations, identity and access management, and infrastructure security controls.
Substantial experience with containerization and orchestration technologies (e.g., Docker for container packaging, Kubernetes for container orchestration), and modern CI/CD automation practices including infrastructure as code (e.g., Terraform, Pulumi), automated testing frameworks, deployment strategies (blue-green, canary), and release management processes.
Solid understanding of system observability principles, reliability engineering practices (including SLIs, SLOs, error budgets), and performance optimization techniques such as caching strategies, query optimization, connection pooling, and resource tuning to ensure systems meet demanding performance and availability requirements.

Benefits

Medical, dental, and vision benefits
401(k) retirement savings plan
Time off (including paid time off, company and personal holidays, volunteer time off, and paid parental and caregiver leave)
Short-term and long-term disability
Life insurance

Applicant Tracking System Keywords


Hard Skills & Tools

Python, Java, CI/CD, Docker, Kubernetes, PostgreSQL, MySQL, MongoDB, Redis, Cassandra

Soft Skills

collaboration, communication, problem-solving, documentation, root cause analysis, troubleshooting, organizational skills, adaptability, attention to detail, critical thinking