Tech Stack
Cloud, IoT, ServiceNow, Terraform
About the role
Responsibilities:
- Convert business SLAs/SLOs into technical implementations
- Own the full Datadog stack across APM, Logs, Metrics, RUM, Synthetics, Service Catalog, and related products
- Enforce best practices for PII hygiene, redaction, and SLOs
- Design robust monitors (threshold, anomaly, composite, etc.)
- Generate reports on MTTD/MTTR, alert noise, cost by service/team
- Manage everything as code: Terraform, APIs, and CI/CD
- FinOps: lead a structured, data-driven initiative to significantly reduce Datadog spend across logs, metrics, traces, synthetics, and RUM
Requirements
Desired Skills & Experience:
- 6+ years in SRE, DevOps, or Observability roles
- 3+ years hands-on experience with Datadog, covering APM, Logs, Metrics, Dashboards, and SLOs
- Proven client-facing experience: leading workshops, aligning stakeholders, and writing proposals
- Strong command of Datadog pipelines, tagging, alerting, dashboards, and cost controls
- Proficiency with Terraform (Datadog Provider) or Datadog APIs
- Git-based workflows and CI/CD for observability configuration
- Familiarity with incident management tools such as PagerDuty, ServiceNow, and Slack
- Excellent written and verbal communication skills