Comments (2)
Hi Nick!
Thanks for your interest in pyJedAI and our research on Entity Resolution. If I understand correctly, we already have Topics on our repo. If you would like to suggest a new one that would make this repo more accessible, feel free.
Our code is open-access. I suggest you pip install it and write the appropriate wrappers; that would be a handy approach. Any questions are welcome!
Indeed, mismo looks interesting! We used pandas for ease of use but, as you said, it is often inefficient. For plotting we use matplotlib and Plotly, which also provides interactive plots. Finally, for our demos we prefer building them in Google Colab, as it offers widgets and truly no-code notebooks.
It would also be really nice to benchmark mismo on the 10 popular datasets available for ER. You can find them on Zenodo.
Cheers,
Konstantinos
Ah, maybe this repo didn't exist last time I browsed through GitHub topics, or I just missed it. Indeed it looks like they are set up correctly. My mistake :)
Thanks for the link to those datasets; they are exactly what I've been looking for! I may come back to this repo with my findings on applying mismo to them.
I'll close this out now, but hopefully we run into each other again!