Director, AI Alignment and Interpretability

CrowdStrike

Lead alignment and interpretability research for CrowdStrike's cybersecurity-focused AI systems. Build methods that explain model behavior and strengthen security measures.

Posted 6/11/2026 • full-time • Remote • 🇺🇸 United States • Lead • 💰 $195,000 - $290,000 per year

ATS Keywords

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills

machine learning, interpretability, AI alignment, mechanistic interpretability methods, probing classifiers, circuit analysis, activation patching, causal tracing, feature visualization, alignment evaluations

Soft Skills

leadership, team development, research collaboration, communication

Certifications & Qualifications

MS in machine learning, PhD in computer science

Industry Keywords

security-domain AI, offensive-misuse signal, latent representations, behavioral testing, capability elicitation, red-lining, safety claims

About the role

Key responsibilities & impact

Own the alignment and interpretability research agenda for security-domain AI
Set priorities, personally lead the hardest open problems, and develop methods that explain model behavior mechanistically: not just what models do, but why, and what that implies at the edges of their training distribution
Build and apply techniques for detecting offensive-misuse signal in model internals, including probing for latent representations of vulnerability knowledge, circuit analysis to understand how security-relevant capabilities are encoded, and activation analysis to surface risk that behavioral testing alone would miss
Work closely with the adversarial evaluation team to close the loop between what they find in testing and what you find in the weights
Develop alignment methodology for security-domain AI and own the evaluation framework that makes it measurable
Contribute original research through publications and external engagement
Recruit, develop, and retain a lean team of research scientists

Requirements

What you’ll need

MS or PhD in machine learning, computer science, or a related field, with research depth in interpretability, AI alignment, or a closely adjacent area
8+ years in ML research or engineering, with direct experience doing interpretability or alignment research on large language models
Hands-on expertise with mechanistic interpretability methods (probing classifiers, circuit analysis, activation patching, causal tracing, feature visualization) applied to real models
Experience designing and running alignment evaluations: behavioral testing, capability elicitation, red-lining, or similar methodologies rigorous enough to support meaningful safety claims
Track record of leading and growing researchers while remaining an active technical contributor yourself

Benefits

Comp & perks

Market leader in compensation and equity awards
Comprehensive physical and mental wellness programs
Competitive vacation and holidays to recharge
Paid parental and adoption leave
Professional development opportunities for all employees regardless of level or role
Employee Networks, geographic neighborhood groups, and volunteer opportunities to build connections
Vibrant office culture with world-class amenities
Great Place to Work Certified™ across the globe