Datadog

Senior Applied Scientist, Observability Data Platform

full-time

Origin: 🇺🇸 United States • California, Colorado, Massachusetts, New York, Washington

Salary

💰 $187,000 - $240,000 per year

Job Level

Senior

Tech Stack

Distributed Systems, Go, Python, Rust

About the role

  • Design and prototype intelligent systems for AI-native observability, including cost-aware agent orchestration, adaptive query execution, and self-optimizing system components.
  • Apply reinforcement learning, search, or hybrid approaches to infrastructure-level decision-making, such as autoscaling, scheduling, or load shaping.
  • Collaborate with AI researchers and platform engineers to design experimentation loops and verifiers that guide LLM outputs using runtime metrics and formal models.
  • Explore emerging paradigms like AI compilers, “programming after code,” and runtime-aware prompt engineering to inform Datadog’s infrastructure and product design.
  • Help define the direction of BitsEvolve, Datadog’s optimization agent, which uses LLMs and evolutionary search to discover code improvements, optimize GPU kernels, and tune configurations for better performance.
  • Partner with product teams and platform stakeholders to ensure scientific advances translate into measurable improvements in cost, performance, and observability depth.
  • Join the team evolving observability infrastructure for stochastic, self-improving systems and build an intelligent control plane for production systems.

Requirements

  • You have a BS/MS/PhD in a scientific field or equivalent experience
  • You have 8+ years of experience in systems engineering, database internals, or infrastructure research, including hands-on experience in a production environment
  • You have a strong software engineering foundation, ideally in C++, Rust, Go, or Python, and are comfortable writing performant, maintainable code
  • You have deep expertise in at least one of the following areas: query optimization, data center scheduling, compiler design, reinforcement learning, or distributed systems design
  • You have experience applying search, planning, or learning techniques to solve real-world optimization problems
  • You are excited by systems that learn, adapt, and improve over time using feedback from runtime metrics and human-defined objectives
  • You are hypothesis-driven and enjoy designing experiments and evaluation loops, whether through simulations, benchmarks, or live systems
  • You thrive in ambiguity, enjoy reading papers and building prototypes, and want to help shape the future of infrastructure in the AI era
  • You enjoy collaborating across research, engineering, and product to bring scientific insights to practical outcomes