Cloudflare

Distributed Systems Engineer – Data Platform, Analytics, Alerts

full-time

Location Type: Hybrid

Location: Lisbon, Portugal

About the role

  • Develop and enhance our customer-facing APIs focusing on performance, reliability, and an intuitive user experience.
  • Design, build, and maintain our near real-time alerting platform, from data processing and anomaly detection to reliable notification delivery.
  • Optimise the performance of complex analytical queries that power our APIs and dashboards, working closely with the database platform team.
  • Create intuitive and powerful tools that allow customers to explore their data and configure meaningful alerts based on logs and metrics.
  • Scale our API and alerting infrastructure to support a growing number of internal and external use cases.
  • Collaborate with front-end engineers and product managers to define API contracts and deliver a seamless data experience for our users.
  • Ensure the operational health of our APIs and alerting systems by developing comprehensive monitoring and participating in an on-call rotation (with the flexibility to be on call outside standard working hours as needed).

Requirements

  • 3+ years of experience working in software development covering distributed systems and scalable APIs.
  • Strong programming skills (Go is preferable), with a deep understanding of software development best practices for building performant, customer-facing services.
  • Hands-on experience with modern observability stacks, including Prometheus and Grafana, and a strong understanding of handling high-cardinality metrics at scale.
  • Strong knowledge of SQL, including extensive experience with complex query optimisation.
  • A solid foundation in computer science, including algorithms, data structures, distributed systems, and concurrency.
  • Strong analytical and problem-solving skills, with a willingness to debug, troubleshoot, and learn about complex problems at high scale.
  • Ability to work collaboratively in a team environment and communicate effectively with other teams across Cloudflare.
  • Experience developing and scaling APIs, particularly GraphQL, is a strong plus.
  • Experience with data streaming technologies (e.g., Kafka, Flink) for real-time processing is a plus.
  • Experience with Infrastructure as Code tools such as Salt or Terraform is a plus.
  • Experience with Linux container technologies, such as Docker and Kubernetes, is a plus.

Benefits

  • Health insurance
  • 401(k) matching
  • Flexible work hours
  • Paid time off
  • Remote work options