- Scrape data using the Wikipedia API for Named Entity Recognition (NER).
- Perform NER on the scraped data and extract entities such as city, person, organisation, date, geographical entity, product, etc.
- Display the annotated text in a Streamlit app.
- Python
- Wikipedia (API)
- Streamlit (Library)
- spacy_streamlit (Package)
- spaCy (Library)
- Go to the URL (https://afternoon-shore-15753.herokuapp.com/).
- In the text area, enter the keyword on which you want to perform NER.
- Then move your cursor out of the text area.
- If the keyword matches a Wikipedia page title, the output appears below.
-
- Clone this repository using the command below.
git clone https://github.com/aman2457/bi-ner.git
- Install the required packages and libraries using the command below.
pip install -r requirements.txt
- Now run the command below in the CLI to open the app.
streamlit run app.py
- The app fetches the text of the Wikipedia page that matches the user's keyword. If multiple pages are found for the same keyword, a random page is chosen.
- The extracted text is then loaded into an NLP model, which performs NER.
- After NER, the output is fed into the spacy_streamlit.visualize_ner function of the spacy_streamlit package, which visualizes the text with its entities highlighted.
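
The steps above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names `choose_title`, `fetch_text`, and `render_entities` are hypothetical, and the model name `en_core_web_sm` is an assumption. The third-party imports are placed inside the functions so that the random-choice helper can be read and run on its own.

```python
import random


def choose_title(candidates, rng=random):
    """Pick one title at random when a keyword matches several pages."""
    candidates = list(candidates)
    if not candidates:
        raise ValueError("no candidate pages to choose from")
    return rng.choice(candidates)


def fetch_text(keyword):
    """Fetch page text for a keyword (requires the `wikipedia` package)."""
    import wikipedia  # lazy import: only needed when a page is fetched

    try:
        return wikipedia.page(keyword).content
    except wikipedia.exceptions.DisambiguationError as err:
        # err.options lists the titles the keyword could refer to.
        # The chosen title may itself be ambiguous; this sketch does not retry.
        title = choose_title(err.options)
        return wikipedia.page(title, auto_suggest=False).content


def render_entities(text, model_name="en_core_web_sm"):
    """Run NER and render the result inside a Streamlit app."""
    import spacy
    import spacy_streamlit

    nlp = spacy.load(model_name)  # assumed model; the app may use another
    doc = nlp(text)
    spacy_streamlit.visualize_ner(doc, labels=nlp.get_pipe("ner").labels)
```

Inside a Streamlit script, calling `render_entities(fetch_text(keyword))` would chain the three steps together.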
- If you do not get any output for a given keyword, the possible reasons are:
  - The server timed out.
  - No Wikipedia page matched the keyword.
  - The page title unexpectedly resolved to a redirect.
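
The failure cases listed above correspond to exception classes in `wikipedia.exceptions` (`HTTPTimeoutError`, `PageError`, `RedirectError`). A minimal sketch of turning them into user-facing messages is shown below. The helper `explain_failure` and the message texts are hypothetical, and matching on the class name is a simplification so the sketch runs without the package installed; in the app itself, catching the exception classes directly would be the more usual approach.

```python
# Messages keyed by exception class names from `wikipedia.exceptions`.
FAILURE_MESSAGES = {
    "HTTPTimeoutError": "The request to the Wikipedia servers timed out.",
    "PageError": "No Wikipedia page matched the keyword.",
    "RedirectError": "The page title unexpectedly resolved to a redirect.",
}


def explain_failure(exc):
    """Return a user-facing message for an exception raised during lookup."""
    # Fall back to a generic message for anything not listed above.
    return FAILURE_MESSAGES.get(type(exc).__name__, f"Unexpected error: {exc}")
```

In a Streamlit app, the returned string could be shown with `st.warning(explain_failure(err))` inside an `except Exception as err:` block around the page lookup.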