AI/LLM Safety Engineer

Propio Aruba Realty

AI/LLM Safety Engineer designing safety evaluations for AI models in production at Propio. Focused on ensuring AI safety and responsible AI interactions through rigorous evaluations and guardrails.

Posted 6/26/2026 • Full-time • Remote • Kansas • 🇺🇸 United States • Mid-Level, Senior

Tech Stack

Tools & technologies

Cyber Security, Python

About the role

Key responsibilities & impact

Design and maintain a safety evaluation framework—adversarial prompt sets, scenario-based test suites, and regression suites—so that every model and agent update is validated before it ships.
Lead structured red-teaming exercises covering jailbreaks, prompt injection, tool misuse, and data exfiltration; document findings and drive each issue through to remediation and closure.
Build and iterate on guardrail logic, including input/output filtering, tool-boundary constraints, action validation, sensitive-data redaction, and policy prompting.
Integrate safety checks into CI/CD and runtime so that unsafe behavior is intercepted before it reaches users.
Perform threat modeling for agentic scenarios: tool-call boundaries, sandbox isolation, and least-privilege access, with particular attention to preventing agents from exfiltrating data or executing irreversible actions through chained tool calls.
Conduct safety reviews of reinforcement-learning (RL) environments and trajectory data, partnering with environment and agent engineering teams to embed safety constraints directly into the environments themselves.
Instrument AI features for safety with structured logging, tracing, and metrics, enabling detection of unsafe patterns and regressions in production.
Prepare evidence for governance reviews—test reports, evaluation summaries, and mitigation validation—aligned with internal Responsible AI standards.
Collaborate with Product and UX to improve safety interactions (warnings, confirmations, refusal messaging, and feedback collection), and align evaluation goals with the Research and Data teams.

Requirements

What you’ll need

Bachelor's or Master's degree in Computer Science, Software Engineering, Cybersecurity, or a related technical field—or equivalent practical experience.
4+ years building production software, with direct experience working on—or securing—ML/LLM systems.
Strong software engineering skills with the ability to write production-grade code (primarily Python), beyond scripting or notebook prototyping.
Solid understanding of LLMs and ML: how models work, prompt engineering, and the safety implications of fine-tuning and RAG (e.g., unsafe retrieval, tool misuse, and data exfiltration).
A security mindset with a demonstrated ability to threat-model AI workflows, plus familiarity with the fundamentals of access control, data retention, and incident response.
Familiarity with the LLM attack surface—prompt injection, jailbreaks, data poisoning, and supply-chain risk—and working knowledge of the OWASP LLM Top 10.
Hands-on experience with safety evaluation, red teaming, or both, with the ability to walk through a real finding and how it was remediated.

Benefits

Comp & perks

Health insurance
Paid time off
Flexible work arrangements
Professional development
Stock options

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Python, production software development, threat modeling, prompt engineering, safety evaluation, red teaming, reinforcement learning (RL), data exfiltration, input/output filtering, tool misuse

Soft Skills

leadership, collaboration, documentation, problem-solving, communication