Veeva Systems

AI Data Engineer

Full-time

Location Type: Remote

Location: Remote • California, Maine, Massachusetts, Oregon • 🇺🇸 United States

Salary

💰 $85,000 - $225,000 per year

Job Level

Mid-Level, Senior

Tech Stack

Python

About the role

  • This role is responsible for ensuring the reliability, accuracy, and safety of our Veeva AI Agents through rigorous evaluation and systematic validation methodologies.
  • Define and establish comprehensive evaluation strategies for new AI Agents. Prioritize the integrity and coverage of test data sets to reflect real-world usage and potential failure modes
  • Programmatically and manually evaluate the quality of LLM-generated content against predefined metrics (e.g., factual accuracy, contextual relevance, coherence, and safety standards)
  • Design, curate, and generate diverse, high-quality test data sets, including challenging prompts and scenarios. Evaluate LLM outputs to proactively identify system biases, unsafe content, hallucinations, and critical edge cases
  • Develop, implement, and maintain scalable automated evaluations to ensure efficient, continuous validation of agent behavior and prevent regressions with new features and model updates
  • Understand model behaviors and assist in the trace and root-cause analysis of identified defects or performance degradations
  • Clearly document, track, and communicate performance metrics, validation results, and bug status to the broader development and product teams

Requirements

  • A strong, specialized understanding of data quality principles, including methods for validating datasets against bias, integrity concerns, and quality standards. Ability to craft diverse and adversarial test data to uncover AI edge cases
  • Demonstrated skill in advanced prompt engineering techniques to create evaluation scenarios that test the AI's reasoning, action planning, and adherence to system instructions. Deep knowledge of common LLM failure modes (hallucination, incoherence, jailbreaking)
  • Proficiency in designing and deploying automated evaluation pipelines to assess complex, agentic AI behaviors. Familiarity with quality metrics such as task success rate, semantic similarity, and sentiment analysis for output measurement
  • Must be comfortable with the specific challenges of debugging agentic systems, including tracing and interpreting an agent's internal reasoning, tool use, and action sequence to pinpoint failure points
  • Proficiency in Python for developing custom evaluation frameworks, writing scripts, and integrating pipelines with CI/CD systems. Familiarity with standard test automation tools (e.g., Pytest, modern web automation tools)
  • Bachelor's degree in Data Science, Machine Learning, Computer Science, or a related field, with experience in Gen AI / LLMs
  • High work ethic. Veeva is a hard-working company
  • High integrity and honesty. Veeva is a PBC (public benefit corporation) and a “do the right thing” company. We expect that from all employees
  • Applicants must have the unrestricted right to work in the United States or Canada. Veeva will not provide sponsorship at this time.

Benefits

  • Medical, dental, vision, and basic life insurance
  • Flexible PTO and company paid holidays
  • Retirement programs
  • 1% charitable giving program

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard skills
data quality principles, validation methodologies, prompt engineering, automated evaluation pipelines, Python, test automation tools, LLM evaluation metrics, debugging agentic systems, data set validation, adversarial test data
Soft skills
high work ethic, high integrity, communication, documentation, problem-solving, collaboration, attention to detail, critical thinking, adaptability, analytical skills