
Data Scientist
Attus Procuradoria Digital
full-time
Location Type: Remote
Location: Brazil
About the role
- Develop classification and regression models to predict: probability of success, type of decision, and case behavior throughout proceedings;
- Design and evaluate different approaches: classical models (Logistic Regression, XGBoost, Random Forest), NLP-based models, embeddings, Transformers, and hybrid models (statistical + textual semantics);
- Analyze initial petitions and other legal documents to extract: legal entities (NER), themes, grounds, claims, relevant attributes, and similarities between cases;
- Create semantic and structured features to feed the models;
- Stay current with state-of-the-art methodologies and algorithms to optimize accuracy and other relevant metrics in the context of predicting legal decisions;
- Perform deep exploratory data analysis (EDA) on legal datasets provided by the system;
- Test different data structures to improve model performance;
- Build proofs of concept (POCs) with LLMs focused on: summarization of filings, case progress and decisions; assisted generation of legal texts (drafts, preliminary opinions, etc.); support for attorneys with specialized agents;
- Work with RAG (Retrieval-Augmented Generation) connected to legal knowledge bases; use semantic search to capture legal understanding; fine-tune or instruct generative models for specific tasks; and benchmark open-source and proprietary models;
- Validate models using robust metrics (AUC, F1, KS, etc.);
- Apply explainability and feature-importance techniques to interpret decision drivers;
- Produce technical reports explaining: methodology, interpretation of results, relevance of variables, limitations and next steps; document versions, experiments and training procedures;
- Specify technical requirements for integrating models;
- Deliver artifacts and technical guidance on model input/output.
Requirements
- Proven experience in Machine Learning applied to text (NLP);
- Proficiency in Python and libraries: scikit-learn, pandas, numpy, TensorFlow or PyTorch, Hugging Face Transformers;
- Knowledge of NLP topics: tokenization, embeddings, vectorization, semantic models;
- Practical experience with LLMs and Generative AI, including: RAG, vector embeddings, fine-tuning or adaptation of LLMs, and applications with GPT-like models;
- SQL (preferably PostgreSQL);
- Quick learner;
- Empathy for clients' needs and reasoning;
- Focus on delivering the best customer experience;
- Continuous learning mindset;
- Collaborative, able to offer and ask for help;
- Strong ethics — non-negotiable for us;
- Curious, experimental and oriented to applied research;
- Ability to decompose complex legal problems into AI solutions;
- Excellent communication with technical and legal teams;
- Proactivity, autonomy and structured thinking.
Benefits
- 100% remote role;
- Eco-friendly welcome kit;
- Company with a sustainable culture;
- Ongoing campaigns;
- Support for composting initiatives;
- Health insurance;
- Life insurance;
- Supportive and collaborative work environment;
- Workplace stretching/guided exercise sessions;
- FreeDay program (additional day off);
- Reading allowance;
- Meal allowance;
- Caju gift card plus birthday get-together;
- Virtual social live events;
- "Moment Off" (offline time);
- Continuous professional development;
- Innovation program;
- Education assistance;
- Dual-screen setup;
- Partner discounts (pharmacies, nutritionists and psychologists);
- Clude wellness app access;
- Totalpass access (fitness platform);
- Home office allowance;
- Day off for graduation ceremonies;
- Welcome gift for newborn children of employees;
- Gift upon return from paternity leave.
Applicant Tracking System Keywords
Tip: use these terms in your resume and cover letter to boost ATS matches.
Hard Skills & Tools
Machine Learning, Natural Language Processing, Logistic Regression, XGBoost, Random Forest, Deep Exploratory Data Analysis, Generative AI, SQL, Python, Feature Importance Techniques
Soft Skills
Quick learner, Empathy, Customer experience focus, Continuous learning mindset, Collaboration, Strong ethics, Curiosity, Experimental mindset, Proactivity, Structured thinking