Practice continuous improvement by iterating on how services are deployed, configured, monitored, and maintained on our platform
Lead incident response, diagnosis, and follow-up on system outages and alerts
Help develop an operational focus and act as a thought leader for the rest of engineering
Maintain and optimize infrastructure for performance, scalability, and cost
Analyze system metrics and identify opportunities to improve reliability and efficiency
Requirements
Comfortable working and thriving within a Linux ecosystem
Experience supporting high-availability distributed production systems
Experience with database administration and support
Experience treating infrastructure as code using tools like Terraform, Ansible, Chef, Puppet, or SaltStack
Familiarity with working in a public cloud platform (GCP, AWS, or Azure)
Software development skills in at least one of the following languages: Python, Go, JavaScript, or Ruby
B.S. or M.S. in Computer Science or a related field, or equivalent work experience
Strong English language skills and the ability to work independently as an effective part of a globally distributed team
Ability to learn about the supply chain security space
Experience scaling services in a performant and cost-effective manner
Experience implementing incident management and disaster recovery playbooks
Knowledge of microservices architecture and containerization (Docker/OCI, Kubernetes)
Familiarity across multiple public cloud platforms (GCP, AWS, Azure)
Experience operating a multi-tenant-capable software-defined network (SDN)
Linux systems troubleshooting and debugging skills
Solid understanding of data structures, algorithms, API design, and software design patterns
Interest in open source software projects and communities
Benefits
Flexible & Remote-First Culture: Work remotely with team meetup opportunities, bi-annual destination summits, and a monthly stipend for coworking space, phone, and internet costs.
Our Approach to Equity: Receive stock options upon hire and promotion. Plus, you can participate in secondary offerings and have 10 years to exercise your options (yes, you read that correctly: 10 years!).
100% Covered Health Insurance: We cover 100% of your health, vision and dental insurance premiums for you and your dependents. Nothing comes out of your paycheck.
∞ Flexible Time Off: Take the time you need – to do our best work, we need to recharge and reset.
18 Weeks Paid Parental Leave: We offer 18 weeks for birthing parents and 12 weeks for non-birthing parents, with the option to use it all at once or throughout your child's first year.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Linux, database administration, infrastructure as code, Terraform, Ansible, Chef, Puppet, SaltStack, Python, Go
Soft skills
strong English language skills, ability to work independently, effective teamwork, ability to learn, incident management, disaster recovery