
Senior Prompt Engineer – Data Science, Quality Analysis
ItsaCheckmate
Full-time
Location Type: Remote
Location: India
About the role
- Design, test, and optimize LLM prompts for conversational AI, text classification, and structured data extraction tasks.
- Build evaluation pipelines to analyze prompt performance using quantitative metrics, human-in-the-loop feedback, and business KPIs.
- Conduct prompt experiments and regression testing to ensure stability, accuracy, and safety as models evolve.
- Collaborate with Machine Learning, Product, and Operations teams to translate business objectives into scalable, data-driven prompt-engineering strategies that enhance model accuracy, efficiency, and real-world usability.
- Use Python/SQL to analyze model outputs, identify anomalies, and automate quality checks.
- Document best practices and contribute to internal frameworks for prompt evaluation and continuous improvement.
- Communicate findings effectively to technical and non-technical stakeholders, driving measurable business impact through insight-driven decisions.
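The Python-based quality-check work described above can be sketched as a minimal validator for structured data extraction outputs. This is an illustrative example only, not the company's actual pipeline; the field names (`item_name`, `quantity`, `price`) and rules are assumptions.

```python
import json

# Hypothetical schema for an extraction task; field names are illustrative.
REQUIRED_FIELDS = {"item_name", "quantity", "price"}

def check_extraction(raw_output: str) -> list[str]:
    """Return a list of quality issues found in one model output."""
    issues = []
    try:
        record = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["invalid JSON"]
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    qty = record.get("quantity")
    if isinstance(qty, int) and qty <= 0:
        issues.append("non-positive quantity")
    return issues

def pass_rate(outputs: list[str]) -> float:
    """Fraction of a batch of model outputs that pass all checks."""
    if not outputs:
        return 0.0
    return sum(1 for o in outputs if not check_extraction(o)) / len(outputs)
```

Checks like these can run automatically over each batch of model outputs, with the aggregate pass rate tracked as a regression metric across prompt or model versions.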
Requirements
- B.S. or higher in a quantitative discipline (Data Science, Computer Science, Engineering, or related field) or in a field relevant to language models (Linguistics, Philosophy, Cognitive Science, etc.).
- 5+ years of relevant experience with a B.S. degree, or 3+ years of experience with a Master’s degree.
- Demonstrated proficiency in Python for automation, evaluation, and experimentation with LLM workflows.
- Proven experience in prompt engineering and working with LLMs (e.g., GPT-4, Claude, Gemini, LLaMA) for text generation, reasoning, and structured data extraction.
- Proficiency in Python and SQL for data analysis, evaluation scripting, and workflow automation.
- Strong background in A/B testing, statistical analysis, and performance metric evaluation, with the ability to design experiments and interpret data-driven insights for continuous model optimization.
- Familiarity with prompt-evaluation and experiment-management tools such as Langfuse, Galileo, or Weights & Biases for experiment tracking and regression testing.
- Deep understanding of advanced prompting techniques, including few-shot prompting, reasoning-based prompting, multi-turn dialogue design, agentic orchestration, and DSPy/AdaFlow-style programmatic prompting frameworks.
- Experience applying CO-STAR and TIDD-EC prompting frameworks for structured reasoning, instruction design, and context control in production-grade LLM systems.
- Excellent requirement-elicitation and communication skills, with the ability to translate business objectives into prompt-engineering solutions.
- Analytical mindset with a process-driven approach to optimizing model behavior, data quality, and operational workflows.
- Academic or applied research experience related to language models, prompt engineering, or LLM-based systems is a strong plus.
- Familiarity with LLM architectures, embeddings, and fine-tuning techniques preferred.
- Experience with LLM red-teaming, adversarial evaluation, or model safety testing is a plus.
- Candidates must be flexible to work US hours, until at least 5 p.m. EST; this availability is essential for the role.
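The A/B testing and statistical-analysis requirement above typically boils down to comparing success rates between two prompt variants. A minimal sketch, assuming a two-proportion z-test on hypothetical pass/fail counts (not a prescribed method from this posting):

```python
import math

def two_proportion_z(success_a: int, n_a: int,
                     success_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided two-proportion z-test comparing success rates of
    prompt variants A and B. Returns (z statistic, p-value)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF via erf.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts: variant A passes 90/100 evals, variant B 70/100.
z, p = two_proportion_z(90, 100, 70, 100)
```

In practice one would fix the sample size and significance threshold before the experiment; libraries such as statsmodels provide equivalent tests out of the box.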
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard skills
Python, SQL, prompt engineering, A/B testing, statistical analysis, performance metric evaluation, few-shot prompting, reasoning-based prompting, multi-turn dialogue design, CO-STAR prompting framework
Soft skills
communication skills, analytical mindset, requirement elicitation, process-driven approach, collaboration