Comments (2)
spacy does have basic Urdu language support, so loading the stanza pipeline should work as described in the spacy-stanza README, just with "ur" instead of "en":
nlp = spacy_stanza.load_pipeline("ur")
(If the stanza language doesn't have basic support in spacy, then you can still load the stanza language as described for Coptic in the first item here: https://github.com/explosion/spacy-stanza#stanza-pipeline-options.)
However, it doesn't look like stanza currently has an NER model for Urdu, so you'd need to train your own. If you have an annotated NER corpus, you could train a stanza NER model following the stanza docs: https://stanfordnlp.github.io/stanza/new_language_ner.html
Or you could train a spacy NER model (https://spacy.io/usage/training/#quickstart) and add it to the nlp pipeline as an additional pipeline component with nlp.add_pipe instead. The spacy course (https://course.spacy.io/en/chapter4) and example projects (e.g., https://github.com/explosion/projects/tree/v3/pipelines/ner_demo) show how to get started with training custom spacy NER models.
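A rough sketch of the nlp.add_pipe option, not a tested recipe. It assumes a spacy NER pipeline has already been trained and saved to disk, that its component is named "ner", and that it may also have a shared "tok2vec" component that the NER listens to (in which case both need to be copied over). The helper names and the model path are made up for illustration. Keep in mind that the copied NER component would run on stanza's tokenization rather than the tokenization it was trained on, so accuracy may differ from what you saw during training.

```python
def add_trained_ner(nlp, ner_nlp, names=("tok2vec", "ner")):
    """Copy trained components from ner_nlp into nlp (hypothetical helper).

    Only components that exist in ner_nlp and are not already in nlp are
    copied. Returns the list of component names that were added.
    """
    added = []
    for name in names:
        if name in ner_nlp.pipe_names and name not in nlp.pipe_names:
            # source= reuses the trained component instead of creating a blank one
            nlp.add_pipe(name, source=ner_nlp)
            added.append(name)
    return added


def load_urdu_pipeline_with_ner(ner_model_dir):
    """Load the stanza Urdu pipeline and attach a separately trained spacy NER.

    ner_model_dir is a placeholder for wherever your trained pipeline was saved.
    """
    # Imported here so the helper above can be used without these installed
    import spacy
    import spacy_stanza

    nlp = spacy_stanza.load_pipeline("ur")
    ner_nlp = spacy.load(ner_model_dir)
    add_trained_ner(nlp, ner_nlp)
    return nlp
```

Usage would be along the lines of nlp = load_urdu_pipeline_with_ner("path/to/your/model"), after which doc.ents should be filled in by the added component.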
from spacy-stanza.
This is very useful. Thank you for this!
Yes, I intend to train my own NER model.
I am closing this issue for now. If I hit a wall, I will write again. Thanks again!