noyo25 / clusteringtableheaders Goto Github PK

This project aims at creating an RDF schema given a list of column headers of a tabular dataset. It first transforms the given header list into meaningful vectors, then it applies a distance-based Clustering algorithm such that it maximizes the similarity among headers inside one cluster. The user has the facility to move items from one cluster to another and merge among some clusters. The system can suggest cluster names based on the commonality among its members. If no common word found, it will produce Unknown. Afterwards, the user can rename the automatically generated names. Finally, it can expose the resultant clusters in an RDF format.

Python 99.65% Dockerfile 0.35%

python headers table rdf clustering embeddings

Recommend Projects

noyo25 / clusteringtableheaders Goto Github PK

clusteringtableheaders's People

Contributors

Stargazers

Watchers

Forkers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent