Attus Procuradoria Digital

Data Scientist

full-time

Location Type: Remote

Location: Brazil

About the role

  • Develop classification and regression models to predict: probability of success, type of decision, and case behavior throughout proceedings;
  • Design and evaluate different approaches: classical models (Logistic Regression, XGBoost, Random Forest), NLP-based models, embeddings, Transformers, and hybrid models (statistical + textual semantics);
  • Analyze initial petitions and other legal documents to extract: legal entities (NER), themes, grounds, claims, relevant attributes, and similarities between cases;
  • Create semantic and structured features to feed the models;
  • Stay current with state-of-the-art methodologies and algorithms to optimize accuracy and other relevant metrics in the context of predicting legal decisions;
  • Perform deep exploratory data analysis (EDA) on legal datasets provided by the system;
  • Test different data structures to improve model performance;
  • Build proofs of concept (POCs) with LLMs focused on: summarization of filings, case progress and decisions; assisted generation of legal texts (drafts, preliminary opinions, etc.); support for attorneys with specialized agents;
  • Work with RAG (Retrieval-Augmented Generation) connected to legal knowledge bases; use semantic search to capture legal understanding; fine-tune or instruct generative models for specific tasks and benchmark open-source and proprietary models;
  • Validate models using robust metrics (AUC, F1, KS, etc.);
  • Apply explainability and feature-importance techniques to interpret decision drivers;
  • Produce technical reports explaining: methodology, interpretation of results, relevance of variables, limitations and next steps; document versions, experiments and training procedures;
  • Specify technical requirements for integrating models;
  • Deliver artifacts and technical guidance on model input/output.

Requirements

  • Proven experience in Machine Learning applied to text (NLP);
  • Proficiency in Python and libraries: scikit-learn, pandas, numpy, TensorFlow or PyTorch, Hugging Face Transformers;
  • Knowledge of NLP topics: tokenization, embeddings, vectorization, semantic models;
  • Practical experience with LLMs and Generative AI, including: RAG, vector embeddings, fine-tuning or adaptation of LLMs, and applications with GPT-like models;
  • SQL (preferably PostgreSQL);
  • Quick learner;
  • Empathy for clients' needs and reasoning;
  • Focus on delivering the best customer experience;
  • Continuous learning mindset;
  • Collaborative, able to offer and ask for help;
  • Strong ethics — non-negotiable for us;
  • Curious, experimental and oriented to applied research;
  • Ability to decompose complex legal problems into AI solutions;
  • Excellent communication with technical and legal teams;
  • Proactivity, autonomy and structured thinking.

Benefits

  • 100% remote role;
  • Eco-friendly welcome kit;
  • Company with a sustainable culture;
  • Ongoing campaigns;
  • Support for composting initiatives;
  • Health insurance;
  • Life insurance;
  • Supportive and collaborative work environment;
  • Workplace stretching/guided exercise sessions;
  • FreeDay program (additional day off);
  • Reading allowance;
  • Meal allowance;
  • Caju gift card plus birthday get-together;
  • Virtual social live events;
  • "Moment Off" (offline time);
  • Continuous professional development;
  • Innovation program;
  • Education assistance;
  • Dual-screen setup;
  • Partner discounts (pharmacies, nutritionists and psychologists);
  • Clude wellness app access;
  • Totalpass access (fitness platform);
  • Home office allowance;
  • Day off for graduation ceremonies;
  • Welcome gift for newborn children of employees;
  • Gift upon return from paternity leave.

Applicant Tracking System Keywords

Tip: use these terms in your resume and cover letter to boost ATS matches.

Hard Skills & Tools

Machine Learning, Natural Language Processing, Logistic Regression, XGBoost, Random Forest, Deep Exploratory Data Analysis, Generative AI, SQL, Python, Feature Importance Techniques

Soft Skills

Quick learner, Empathy, Customer experience focus, Continuous learning mindset, Collaboration, Strong ethics, Curiosity, Experimental mindset, Proactivity, Structured thinking