Develop software solutions to enable reliability and operability of large scale distributed systems handling petabytes of data and serving
Build a deep understanding of how Pinterest’s systems behave, scale, interact and fail, and use that insight to identity risks and opportunities for remediation
Build tools and automation to eliminate toil and reduce operational overhead. Create frameworks, processes and best practices to be used across Pinterest Engineering
Build meaningful, insightful and actionable SLIs
Automate critical portions of Pinterest’s engineering processes, to minimize risk and maximize the speed of innovation
Manage capacity and performance to help scale our infrastructure both on public and private clouds around the world
Requirements
Strong knowledge of Linux/Unix/BSD internals and experience working with open source software (e.g. MySQL, Hadoop, Envoy, HAProxy, Nginx)
Experience with technologies such as ElasticSearch, ZooKeeper, HBase, Hadoop, Memcache and Kafka with a focus on reliability, automation, operability and performance
2+ years of experience programming using Python or Go
Infrastructure as code a plus (e.g. Terraform, Puppet, Chef, Ansible, Salt, Fabric, Docker, etc)
Bonus points if experienced with deploying web apps to cloud infrastructure (AWS, etc.) and working with distributed, service-oriented architecture
Bachelor’s degree in Computer Science a related field or equivalent experience.
Benefits
Health insurance
Retirement plans
Paid time off
Flexible work arrangements
Professional development opportunities
Equity options
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.