Smartsheet

Senior Manager, Engineering – Observability Platform

Senior Manager leading an engineering team that builds a centralized observability platform at Smartsheet. The role drives observability engineering efforts and cross-functional partnerships to improve system visibility.

Posted 6/12/2026 • Full-time • Remote • Washington • 🇺🇸 United States • Senior • 💰 $205,000 - $275,000 per year

Tech Stack

Tools & technologies
  • Distributed Systems
  • Elasticsearch

About the role

Key responsibilities & impact
  • Lead a team of engineers focused on observability platform engineering, driving build-out of a unified observability stack used by all engineering teams at Smartsheet.
  • Own and evolve the platform's technical roadmap, consolidating multiple tooling platforms and AI observability tooling into a coherent, scalable capability.
  • Define platform standards, contribute to architectural direction, and ensure the team operates with engineering rigor and strong operational habits.
  • Build and scale the team, hiring senior engineers and establishing effective global practices across distributed stakeholders.
  • Lead design and delivery of centralized observability infrastructure covering metrics pipelines, distributed tracing, alerting frameworks, and log analytics across Smartsheet services.
  • Drive SLO/SLA definition and tooling for platform-wide reliability visibility, partnering closely with infrastructure, platform engineering, and on-call teams.
  • Own governance including instrumentation standards, cost optimization, and rollout of advanced capabilities such as APM, RUM, and custom dashboards.
  • Lead architecture, scaling, and operational practices for log analytics across high-throughput production workloads.
  • Establish shared observability libraries, agents, and SDKs that reduce instrumentation burden for application engineering teams.
  • Build and maintain AI/ML observability integrations in partnership with the AI Platform team.
  • Partner with the Data & AI Platform team to integrate MLflow tracing, Inference Tables, and LLM-as-judge evaluation pipelines into the observability stack.
  • Develop dashboards and alerting for agentic AI workloads, including latency, token consumption, error rates, and evaluation metric drift.
  • Contribute to the AI governance and cost observability program, providing telemetry for model usage, cost attribution, and compliance reporting.
  • Serve as the primary engineering partner for platform consumers across Data & AI, Commerce, Infrastructure, and Security teams, ensuring observability needs are met across workstreams.
  • Lead complex, cross-functional observability projects with high ambiguity, managing delivery risk, communicating clearly to senior stakeholders, and building alignment across teams.
  • Work with delivery partners to coordinate instrumentation across platform modernization and migration workstreams.
  • Contribute to quarterly and annual platform goals, reporting on key reliability and observability metrics to engineering leadership.
  • Communicate platform status, risks, and roadmap progress to Engineering leadership and more senior audiences in a clear, executive-ready format.
  • Embed on-call culture and incident management discipline into the team, ensuring clear runbooks, fast MTTR, and post-incident learning loops.
  • Drive cost governance for observability tooling, including spend optimization and efficient resource management.
  • Champion AI-assisted engineering practices within the team, applying tooling and automation to reduce toil and accelerate delivery.

Requirements

What you’ll need
  • 10+ years of software or platform engineering experience, with strong fundamentals in distributed systems, infrastructure, and backend services.
  • 3 years of engineering management experience, including direct team building, performance management, and cross-functional delivery ownership.
  • Deep hands-on expertise with observability tooling: Datadog (APM, metrics, logs, alerting), OpenSearch or Elasticsearch, distributed tracing (OpenTelemetry or equivalent), and SLO/SLA management at scale.
  • Proven experience operating observability platforms for high-availability, high-throughput production environments.
  • Experience building and scaling engineering teams in distributed or international settings.
  • Strong execution track record on complex, cross-functional infrastructure programs with high ambiguity.
  • Clear, direct communication (written and verbal) with both technical and non-technical audiences, including leadership and executive stakeholders.
  • Proactive risk identification and status communication without prompting.
  • Experience managing vendors, external delivery partners, and third-party integrations in a platform context.

Benefits

Comp & perks
  • Employer subsidized medical/vision and dental coverage for full-time employees
  • 401k Match to help you save for your future (50% of your contribution up to the first 6% of your eligible pay)
  • Monthly stipend to support your work and productivity
  • Flexible Time Away Program, plus Sick Time Off
  • US employees are automatically covered under Smartsheet-sponsored life insurance, short-term, and long-term disability plans
  • US employees receive 12 paid holidays per year
  • Up to 24 weeks of Parental Leave
  • Personal paid Volunteer Day to support our community
  • Opportunities for professional growth and development including access to Udemy online courses
  • Company Funded Perks, including a counseling membership, local retail discounts, and your own personal Smartsheet account
  • Teleworking options from any registered location in the U.S. (role specific)
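
The 401k match described above (50% of your contribution, up to the first 6% of eligible pay) can be worked through with a short sketch. This is an illustration only: the function name `employer_match`, the use of annual figures, and the example pay of $205,000 (the low end of the posted range) are assumptions, and it ignores IRS limits, vesting, and any per-pay-period rules the actual plan may apply.

```python
def employer_match(eligible_pay: float, employee_contribution: float,
                   match_rate: float = 0.50, pay_cap_fraction: float = 0.06) -> float:
    """Estimate an employer match of `match_rate` on contributions,
    counting only contributions up to `pay_cap_fraction` of eligible pay.

    Hypothetical helper: the real plan document governs the actual match.
    """
    if eligible_pay < 0 or employee_contribution < 0:
        raise ValueError("amounts must be non-negative")
    # Only the part of the contribution within the first 6% of pay is matched.
    matchable = min(employee_contribution, pay_cap_fraction * eligible_pay)
    return round(match_rate * matchable, 2)


if __name__ == "__main__":
    pay = 205_000  # assumed annual eligible pay, for illustration
    for pct in (0.03, 0.06, 0.10):
        contribution = pct * pay
        match = employer_match(pay, contribution)
        print(f"{pct:.0%} contribution -> estimated match ${match:,.2f}")
```

Under these assumptions the match tops out at 3% of eligible pay: contributing more than 6% does not increase the employer's share.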

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools
observability tooling, distributed systems, infrastructure, backend services, SLO management, SLA management, log analytics, AI/ML observability, metrics pipelines, distributed tracing
Soft Skills
team building, performance management, cross-functional delivery ownership, clear communication, risk identification, executive communication, proactive status communication, incident management, collaboration, leadership