
Staff Data Scientist – Entity Resolution, IDGraph
Socure
full-time
Location Type: Remote
Location: United States
Salary
💰 $170,000 - $205,000 per year
About the role
- Lead the evaluation and continuous improvement of entity resolution and entity linking pipelines.
- Debug new builds, identify anomalies, and recommend modeling or system-level improvements.
- Define, implement, and maintain scalable performance and quality metrics, leveraging automation and LLM-based approaches where appropriate.
- Partner with Engineering to optimize entity linking and ranking systems using Learning-to-Rank and related techniques.
- Design methods to assess and classify entity confidence and quality across the graph.
- Design and implement a comprehensive data quality framework for graph-based identity data.
- Translate abstract quality concepts (e.g., reliability, stability, consistency) into measurable signals.
- Use data quality insights to guide modeling decisions, experimentation strategy, and product prioritization.
- Identify and operationalize generalized, high-impact predictive signals derived from graph structure, temporal dynamics, and relational patterns.
- Develop scalable approaches to link prediction, label propagation, and semi-supervised learning within the ID Graph.
- Explore and evaluate advanced graph modeling techniques, including graph-based ML, knowledge graph methods, and Graph Neural Networks (GNNs), when appropriate.
- Focus on durable abstractions rather than one-off features, ensuring solutions are explainable, compliant, and reusable across multiple products.
- Collaborate closely with Engineering, Product Management, Compliance, and downstream product teams.
- Act as a technical leader within the Identity organization, influencing modeling standards, experimentation rigor, and best practices.
- Translate complex technical findings into clear insights and recommendations for both technical and non-technical stakeholders.
- Support the launch of new product capabilities built on top of the ID Graph.
- Demonstrate strong ownership, strategic impact, and assertive communication.
- Mentor peers, foster a culture of growth, and build authentic relationships across teams.
- Embrace feedback, adapt resiliently to challenges, and pursue continual self-improvement.
Requirements
- Master’s or PhD in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field
- 5+ years of experience in applied data science, machine learning, or artificial intelligence, with a focus on graph-based modeling and large-scale data systems
- Strong proficiency in Python and PySpark
- Deep experience with classification models, Learning-to-Rank, anomaly detection, and statistical modeling
- Experience building and maintaining production-grade ML systems at scale
- Hands-on experience with Databricks
- Familiarity with graph databases (e.g., Amazon Neptune) and graph query languages (e.g., openCypher)
- Experience with graph processing frameworks (e.g., GraphFrames)
Benefits
- Offers Equity
- Offers Bonus
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Python, PySpark, Classification models, Learning-to-Rank, Anomaly Detection, Statistical Modeling, graph-based modeling, link prediction, label propagation, semi-supervised learning
Soft Skills
ownership, strategic impact, assertive communication, mentoring, collaboration, adaptability, self-improvement, influencing, translating technical findings, fostering growth
Certifications
Master’s in Computer Science, PhD in Computer Science, Master’s in Data Science, PhD in Data Science, Master’s in Machine Learning, PhD in Machine Learning, Master’s in Statistics, PhD in Statistics, Master’s in Mathematics, PhD in Mathematics