Senior Software Engineer, Infrastructure
Epic Kids
Senior Software Engineer driving the stability and reliability of Epic's GCP infrastructure. Collaborating with engineering teams to maintain high availability and scalability.
Tech Stack
Tools & technologies
Cloud, Docker, Google Cloud Platform, Jenkins, Kubernetes, Python, Terraform
About the role
Key responsibilities & impact
- Drive the stability and reliability of Epic's GCP infrastructure: setting and tracking SLOs/SLIs, reducing toil, and engineering out recurring sources of instability
- Build and operate Epic's GCP infrastructure for high availability, scalability, and cost efficiency
- Manage and harden our Docker and GKE container platform, including workload scheduling, autoscaling, networking, and graceful failure handling
- Maintain and improve CI/CD pipelines that enable fast, safe, low-risk delivery across engineering teams
- Own and evolve the observability stack—metrics, logs, traces, dashboards, and alerts—so that signals are actionable, noise is low, and on-call has the context to resolve issues quickly
- Write and maintain Terraform to codify infrastructure across the organization, with a focus on consistency, change safety, and reproducibility
- Contribute to capacity planning, cost optimization, and architectural reviews, with reliability as a first-class consideration
- Champion platform security best practices, including secrets management, IAM policies, and network segmentation
- Support compliance-aware infrastructure practices—vulnerability management, access reviews, audit-evidence flows, and incident-response readiness—as we mature our SOC 2 and student-data compliance programs
- Partner with data engineering to operate the orchestration platform and supporting infrastructure—deployment, scaling, reliability, and observability
- Collaborate with backend and data engineers to troubleshoot service and platform issues
- Lead by example in a frequent on-call rotation; drive incident response, blameless post-mortems, and the follow-through that turns one-time outages into systemic, lasting reliability improvements
- Provide guidance to developers on infrastructure concerns and best practices
Requirements
What you’ll need
- Bachelor's degree or higher in Computer Science, Software Engineering, or a related field
- 5+ years of experience in infrastructure, platform, DevOps, or a related engineering role
- Hands-on experience with GCP (GCE, GCS, VPC, IAM, Cloud Monitoring, and related services)
- Experience with Docker and Kubernetes (GKE)—containerizing workloads, deploying to GKE, Helm, and cluster fundamentals
- Experience with CI/CD pipelines (GitHub Actions, ArgoCD, Jenkins, or similar)
- Experience with an observability platform such as New Relic (metrics, logging, alerting, dashboards)
- Proficiency in Terraform for managing infrastructure as code
- Scripting/programming skills in Python, Bash, or similar
- Comfort participating in a frequent production on-call rotation
- Track record of measurably improving reliability of production systems—e.g., defining SLOs, reducing incident frequency or MTTR, eliminating recurring failure modes
- Strong problem-solving skills, sense of ownership, and ability to work effectively in evolving systems
- Fluency in English for daily collaboration and technical documentation
- Proficiency in Mandarin Chinese to collaborate effectively with global engineering and business partners
Benefits
Comp & perks
- Salary: $160k - $200k / year
- Location: United States – Remote
- Employment type: Full Time
- Seniority: Senior
- Language: Chinese required
About the company
- Company: Epic Kids
- Size: 11 - 50 employees
- Sectors: Education, B2C, Media
Epic Kids is a digital subscription service offering a curated library of children's eBooks, audiobooks, and learning videos. The platform provides families and educators access to thousands of age-appropriate titles from major publishers to encourage reading, discovery, and digital literacy. Epic Kids offers plans for parents and free access for educators, positioning itself as an educational media resource for children.
ATS Keywords
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
GCP, Docker, Kubernetes, CI/CD, Terraform, Python, Bash, observability, SLOs, incident response
Soft Skills
problem-solving, ownership, collaboration, communication
Certifications
Bachelor's degree in Computer Science, Bachelor's degree in Software Engineering