Comments (4)
The intention is to translate tags to one unified mapping so that different models, which might output different label names, can be compared. If the translation is not applied consistently, it could be a bug.
from presidio-research.
I would say it is actually a bug: the translation is not consistent, and it is also applied twice, so the already-translated tags get translated again, which gives incorrect results. A pull request could fix it by simply removing those lines of code, so that the CRF model's predict method expects the translated tags as input.
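To illustrate why repeating the translation is harmful, here is a minimal sketch. It does not use the actual presidio-research code: the `translate_tags` helper, the `MODEL_TO_UNIFIED` mapping, and the rule that unmapped tags fall back to "O" are all assumptions made for the example.

```python
from typing import Dict, List


def translate_tags(
    tags: List[str], mapping: Dict[str, str], unknown: str = "O"
) -> List[str]:
    """Hypothetical helper: map model-specific tags to unified names.

    Assumption for this sketch: any tag missing from the mapping
    is replaced by the `unknown` tag ("O").
    """
    return [tag if tag == "O" else mapping.get(tag, unknown) for tag in tags]


# Hypothetical mapping from a model's label names to unified names.
MODEL_TO_UNIFIED = {"PER": "PERSON", "LOC": "LOCATION"}

model_tags = ["PER", "O", "LOC"]

# Translating once gives the unified tags.
once = translate_tags(model_tags, MODEL_TO_UNIFIED)

# Translating the already-translated tags again loses the entities,
# because "PERSON" and "LOCATION" are not keys of the mapping.
twice = translate_tags(once, MODEL_TO_UNIFIED)

print(once)
print(twice)
```

Under these assumptions, the second pass turns every entity tag into "O", which is the kind of incorrect result described above. Translating in exactly one place avoids it.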
Thanks. If you're interested in creating one, I'd be happy to review it.
Thanks @omri374 !