
LLM Observability Engineering Manager
PwC
full-time
Location Type: Office
Location: San Francisco • California, Illinois, Minnesota, New York • 🇺🇸 United States
Salary
💰 $99,000 - $232,000 per year
Job Level
Mid-Level, Senior
Tech Stack
AWS, Azure, Cloud, Go, Google Cloud Platform, Java, JavaScript, Node.js, Python
About the role
- Architect and implement observability solutions for Large Language Models offering real-time insights into critical workflows
- Enhance DataDog integrations for improved monitoring capabilities
- Lead teams in developing quality deliverables while cultivating meaningful client relationships
- Work with cross-functional teams to achieve project goals and drive automation
- Mentor team members to develop their technical skills and use reviews to deepen expertise
- Take ownership of projects including planning, budgeting, execution, and completion
- Partner with team leadership to ensure collective ownership of quality, timelines, and deliverables
- Lead root-cause investigations for LLM incidents and manage security monitoring and compliance reporting
- Contribute to open-source integrations and engage with DataDog and MLOps communities
Requirements
- Bachelor's Degree in Mathematics, Engineering, or Computer Science
- At least 5 years of hands-on experience with DataDog
- Master's Degree preferred
- Experience in architecting DataDog integrations
- Developing and maintaining observability for LLM platforms
- Working with cross-functional teams for automation
- Leading root-cause investigations for LLM incidents
- Contributing to open-source DataDog integrations
- Publishing or presenting in the LLM observability field
- Engaging with DataDog and MLOps communities
- Experience with additional observability platforms
- Specialization in architecting observability across large-scale LLM or GenAI systems
- Proficient in instrumenting Python, Node.js, Java, or Go applications
- Demonstrated experience with cloud-native infrastructure (AWS, Azure, GCP)
- Extensive understanding of LLM architectures, embeddings, and evaluation metrics
- Ability to implement and manage DataDog security monitoring and compliance reporting
- Willingness to travel up to 40%
Benefits
- medical
- dental
- vision
- 401k
- holiday pay
- vacation
- personal and family sick leave
- annual discretionary bonus
- professional development opportunities
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
DataDog, Python, Node.js, Java, Go, cloud-native infrastructure, observability, LLM architectures, security monitoring, compliance reporting
Soft skills
leadership, mentoring, client relationship management, cross-functional collaboration, project ownership, communication, team development, problem-solving, budgeting, planning
Certifications
Bachelor's Degree in Mathematics, Bachelor's Degree in Engineering, Bachelor's Degree in Computer Science, Master's Degree (preferred)