Vivun

Lead Observability Engineer

full-time

Location Type: Remote

Location: Remote • California • 🇺🇸 United States

Salary

💰 $185,000 - $205,000 per year

Job Level

Senior

Tech Stack

Grafana, Prometheus

About the role

  • Own the end-to-end observability strategy for Ava, defining the standards, tools, and patterns that ensure reliable visibility across infrastructure and agentic components.
  • Design and implement correlation models that link agent behavior, LLM interactions, and SaaS telemetry into cohesive, actionable insights.
  • Unify observability tooling across teams, ensuring metrics, logs, and traces flow into a central platform (e.g., Observe, Datadog, or equivalent).
  • Collaborate with engineering and QA to embed observability best practices into development workflows, CI/CD, and quality gates.
  • Establish enablement frameworks—documentation, dashboards, and templates—that make observability self-serve for all engineering teams.
  • Partner with teammates to ensure observability aligns with infrastructure reliability, alerting, and incident response patterns.
  • Contribute to performance and reliability strategy, helping define how we measure agent quality, responsiveness, and system scalability.

Requirements

  • 6+ years of experience in SRE, DevOps, or Observability Engineering roles, including at least 2 years leading or designing observability initiatives.
  • Deep knowledge of observability tooling (e.g., OpenTelemetry, Prometheus, Grafana, Datadog, Honeycomb, Observe) and distributed tracing practices.
  • Experience with agentic or LLM-based systems, including tools like LangChain, Celery, OpenAI APIs, or similar orchestration frameworks.
  • Strong understanding of how to instrument, trace, and correlate AI/LLM workflows with infrastructure-level telemetry.
  • Proven ability to define cross-team standards, influence engineering culture, and establish scalable monitoring patterns.
  • Strong collaboration and communication skills—you enable, not dictate.

Benefits

  • Competitive salary and full health benefits
  • Stock options at a well-funded, pre-IPO company on a fast growth track
  • Flexible work schedules and the freedom to work from anywhere at a fully remote company
  • Unlimited PTO, with two weeks designated as a “quiet period” each year
  • An experienced team who will fight beside you in the trenches to accomplish your goals
