
SRE – Platform Engineer
DroneUp
Full-time
Location Type: Remote
Location: United States
Salary
💰 $125,000 - $150,000 per year
About the role
- Serve as the broad-domain architect for the internal developer platform and all cloud engineering
- Drive architecture for tooling or in-house software
- Mentor other platform engineers to drive strong engineering practices
- Enable platform engineering technical capabilities in our internal client teams within software engineering
- Peer with the senior architects and engineers in software engineering
- Focus architecture and engineering work on the GCP environment
- Architect and oversee GKE cluster operations and workload management
- Provide feedback to others and participate in peer reviews / pair programming
- Drive the broad adoption of Test Driven Development by designing, developing, and debugging unit and integration tests for new and existing infrastructure and code
- Stay continuously curious about existing implementations and new technologies, and share findings with the team
- Practice continuous improvement across all job areas, both personally and professionally
- Communicate clearly with platform engineering teams and other stakeholders, providing technical direction while doing so
- Stay current with platform changes and third-party libraries
- Proactively investigate better alternatives to current solutions
- Understand OpenTelemetry and true observability, and how observability differs from monitoring and logging
- Grow the engineering culture towards a high-performing team
- Practice the arts of self-service, least privilege and security by default in all solutions
- Define and maintain Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets
- Lead incident response, including on-call rotations, root cause analysis, and post-mortem reviews
- Implement and optimize monitoring, alerting, and observability systems for system reliability
- Collaborate on capacity planning and performance optimization to ensure high availability
- Other duties as assigned
Requirements
- Bachelor's degree in Computer Science, Computer Engineering, or a related field, or 8+ years of experience as a software engineer
- Proficiency in Kubernetes (optional: CKA, CKAD)
- Extensive experience in Unix / Linux
- Polyglot, with proficiency in multiple languages (ideally Golang, Node.js, Python, HCL, and more)
- Knowledge of multi-cloud environments, including GCP, AWS, and Azure (familiarity with at least two of these)
- Experience using Git in trunk-based development models
- Experience using feature flags in infrastructure and at runtime (Kubernetes)
- Experience with backend database technology, including support and performance tuning, is a plus
- Advanced experience working with and creating public cloud resources in Terraform or other infrastructure as code tools
- Experience participating in a 24/7 on-call schedule without supervision and successfully resolving issues without escalation
- Experience using OpenTelemetry for observability, as well as other monitoring tools such as Datadog and New Relic
- Good understanding of networking and routing principles
- Experience containerizing applications with Docker and orchestrating them with Kubernetes
- Familiarity with security configuration for web/API services (SSL, access control)
- Experience with Jira or other work tracking systems
- Ability to resolve tickets in priority order and collaborate with the Technical Product Manager to adjust priorities
- Excellent, detailed documentation using Confluence or similar tooling, such as support notes, runbooks, and ADRs
- Familiarity with creating an end-to-end CI/CD pipeline using various tools with artifact storage
- Familiarity with macOS as a desktop environment and with predominantly CLI-based workflows
- A “product mindset”: understanding stakeholder needs, priorities, and business value
- Experience with security compliance frameworks, including FedRAMP, NIST, and SOC 2
- Proven experience in SRE practices, including incident management and reliability engineering
- Familiarity with monitoring tools like Prometheus, Grafana, or Honeycomb for observability
- Experience with chaos engineering, load testing, or reliability testing frameworks
Benefits
- Employees are expected to provide a high level of security to any personal or private information accessed as part of their work, whether at a DroneUp facility or remotely.
- Participate in security training.
- Remain sensitive to individual rights to personal privacy.
- Comply with company policies.