legal-ner's Introduction

Legal NER

Overview

This project identifies 14 types of legal entities in the Indian Judicial Dataset. We combine several approaches: fine-tuning the transformer models DeBERTa and ELECTRA, and applying zero-shot and one-shot Named Entity Recognition (NER).

Dataset

We used the Indian Legal NER Dataset (Prathamesh Kalamkar, 2022), a corpus of 46,545 annotated legal named entities mapped to 14 legal entity types. The entities are extracted from preamble and judgment documents of the Indian judicial system.
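To make the idea of annotated entities mapped to entity types concrete, the sketch below converts character-level entity spans into token-level BIO tags, a common input format for token classification. It is only an illustration: the sentence, the whitespace tokenization, the helper name `spans_to_bio`, and the label names `COURT` and `DATE` are assumptions, not details taken from the dataset or from this project's code.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label), end exclusive


def spans_to_bio(text: str, spans: List[Span]) -> Tuple[List[str], List[str]]:
    """Whitespace-tokenize `text` and tag each token with B-/I-/O labels."""
    tokens, tags = [], []
    pos = 0
    for tok in text.split():
        start = text.index(tok, pos)  # character offset of this token
        end = start + len(tok)
        pos = end
        tag = "O"
        for s, e, label in spans:
            # A token is tagged only if it lies fully inside an entity span.
            if start >= s and end <= e:
                tag = ("B-" if start == s else "I-") + label
                break
        tokens.append(tok)
        tags.append(tag)
    return tokens, tags


# Hypothetical sentence and labels, for illustration only.
text = "The appeal was heard by the Supreme Court of India on 5 May 2020"
court, date = "Supreme Court of India", "5 May 2020"
spans = [
    (text.index(court), text.index(court) + len(court), "COURT"),
    (text.index(date), text.index(date) + len(date), "DATE"),
]

tokens, tags = spans_to_bio(text, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A real pipeline would use the model's own subword tokenizer rather than whitespace splitting, and would need to align tags to subword pieces.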

Approach

Fine-tuning Advanced BERT Models: DeBERTa and ELECTRA
Zero-shot and One-shot Learning: Transforming multi-class token classification into binary token classification
Prompt Engineering: Prompting GPT-3.5 in zero-shot, one-shot, and few-shot settings
Baseline: spaCy NER pipeline component
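The transformation from multi-class to binary token classification can be sketched as follows. The README does not describe how the project implemented it, so this shows one common formulation under stated assumptions: each entity type becomes its own binary task, in which a token is labeled 1 if it belongs to an entity of that type and 0 otherwise. The function name `to_binary_tasks` and the label names are hypothetical.

```python
from typing import Dict, List


def to_binary_tasks(tags: List[str], entity_types: List[str]) -> Dict[str, List[int]]:
    """Split one multi-class BIO tag sequence into one 0/1 sequence per entity type."""
    tasks = {}
    for etype in entity_types:
        tasks[etype] = [
            # Strip the B-/I- prefix and compare the remaining type name.
            1 if tag != "O" and tag.split("-", 1)[1] == etype else 0
            for tag in tags
        ]
    return tasks


# Hypothetical tag sequence and entity types, for illustration only.
tags = ["O", "B-COURT", "I-COURT", "O", "B-DATE"]
entity_types = ["COURT", "DATE", "JUDGE"]

binary = to_binary_tasks(tags, entity_types)
for etype, labels in binary.items():
    print(etype, labels)
```

In this formulation, a single binary classifier can be given the entity type (for example, as a text description) together with the sentence, which is presumably what allows it to be queried for types it saw few or no examples of during training.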

The detailed report and results are summarized here.

Acknowledgements

This was a group project for the graduate UCSD course Statistical Natural Language Processing.
