Agentic Analyst

Lilt

Full-time

Posted on: 9/24/2025

Origin: 🇵🇭 Philippines

💰 $18 - $24 per hour

Junior, Mid-Level

BigQuery, Cloud, ETL, Pandas, Python, SQL

About the role

Design and run agentic experiments end to end: frame problems, define success criteria, and summarize results with recommendations
Script data pulls from BigQuery, assemble representative datasets, and write robust Python for data processing and experiment automation
Integrate and iterate prompts in code; execute runs, collect outputs, and perform cost/quality analysis
Evaluate outputs with AI and programmatic checks, including error detection, terminology/style adherence, and human-in-the-loop checkpoints
Partner with Production, Product, and Research on workflow trials; quantify tradeoffs in quality, speed, and cost
Communicate findings in docs and presentations; open/update Jira tickets and share reproducible artifacts (datasets, scripts, prompts, dashboards)
Contribute to prompt and experiment hygiene: versioning, datasets, eval suites, and guardrails
Test agent capabilities in a sandbox and provide structured feedback to Product/Engineering

Requirements

2–4 years of experience in Data Science, Analytics, or Applied AI with demonstrable Python proficiency (pandas, data parsing, APIs, basic ETL)
Hands-on experience with LLMs and prompt engineering across providers (e.g., OpenAI, Anthropic, Vertex/Gemini, Bedrock), including practical eval and iteration cycles
Strong analytical rigor: can define success metrics, compare workflows, and reason clearly about quality/cost/speed tradeoffs in production settings
SQL and data wrangling skills; experience with BigQuery or equivalent cloud data warehouse
Clear written communication with exec-ready summaries and artifact links (reports, notebooks, Sheets, slides)
Experience evaluating LLM systems with AI judges, scripted checks, or human sampling; familiarity with MQM/LQE or similar linguistics QA frameworks (nice-to-have)
Knowledge of RAG, vector stores, and retrieval verification strategies (nice-to-have)
Familiarity with agentic workflows or translation/linguistics domains (nice-to-have)
Familiarity with basic MLOps/experimentation tools and prompt versioning best practices (nice-to-have)